Data Governance

The set of principles, questions, and design decisions that determine how data is collected, stored, processed, shared, and deleted — and who is responsible when things go wrong.


What is it?

Data governance is the discipline of making deliberate, informed decisions about data before you write a line of code. It sits at the intersection of law, ethics, and architecture. Every application that touches data — which is every application — makes governance decisions, whether consciously or not.

Most developers encounter governance as a list of regulations to comply with: the EU’s GDPR, Switzerland’s FADP (nDSG in German), California’s CCPA. But regulations are just the formalised expression of deeper principles. The principles came first. Understanding them helps you navigate any jurisdiction, including ones whose laws haven’t been written yet.

The core insight is this: data is not neutral. Every piece of data has an origin, a subject, a purpose, and a lifecycle. Governance is the practice of respecting all four. When you store a user’s email, you’re not just writing a string to a database; you’re entering a relationship with obligations attached. [1]

For a developer building applications, data governance answers one meta-question: “Should I be doing this with this data, and if so, how?”

In plain terms

Data governance is like building codes for data. Just as a building code ensures a house is safe, accessible, and up to standard before anyone moves in, data governance ensures your application handles data safely, transparently, and lawfully before any user trusts it with their information.


How does it work?

Data governance operates through a set of interlocking principles. Each principle asks a different question about your relationship with data.

The five foundational questions

Before building any feature that touches data, a developer should be able to answer:

| # | Question | Principle |
|---|----------|-----------|
| 1 | Can I collect this? | personal-data-protection: Is there a legal basis? |
| 2 | How little can I collect? | privacy-by-design: Minimise by default |
| 3 | Where does it come from? | data-provenance: Source, licence, rights chain |
| 4 | Could it identify someone? | re-identification-risk: Even “anonymous” data can expose people |
| 5 | What happens when it goes wrong? | intermediary-liability, ai-content-liability: Who is responsible? |
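
One way to make the five questions hard to skip is to record the answers next to the feature and refuse to ship while any are missing. The following is a minimal sketch, not a prescribed process: the `GovernanceReview` class and all of its field names are hypothetical and only mirror the table above.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class GovernanceReview:
    """Answers to the five foundational questions for one feature.

    All field names are hypothetical; None means "not answered yet".
    """
    legal_basis: Optional[str] = None            # 1. Can I collect this?
    minimal_fields: Optional[list] = None        # 2. How little can I collect?
    provenance: Optional[str] = None             # 3. Where does it come from?
    reidentification_risk: Optional[str] = None  # 4. Could it identify someone?
    responsible_party: Optional[str] = None      # 5. Who is responsible?

    def unanswered(self) -> list:
        """Return the names of the questions that still lack an answer."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]

    def ready(self) -> bool:
        """A feature is ready for review only when every question is answered."""
        return not self.unanswered()


review = GovernanceReview(
    legal_basis="consent",
    minimal_fields=["email"],
    provenance="entered by the user at sign-up",
)
print(review.unanswered())  # ['reidentification_risk', 'responsible_party']
print(review.ready())       # False
```

A check like `review.ready()` could run in a pull-request template or a CI step; where it lives matters less than the fact that an unanswered question stays visible.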

Think of it like a supply chain

Just as a restaurant must know where its ingredients come from (provenance), store them at the right temperature (protection), serve safe portions (minimisation), and take responsibility if someone gets sick (liability) — a developer must apply the same rigour to data.

The regulatory landscape

These principles are codified differently across jurisdictions, but the underlying logic is remarkably consistent:

| Principle | EU (GDPR) | Switzerland (nDSG) | USA | International |
|-----------|-----------|--------------------|-----|---------------|
| Legal basis for processing | Art. 6 | Art. 6, 31 | Varies by state | CoE Convention Art. 5 |
| Data minimisation | Art. 5(1)(c) | Art. 6 | CCPA (limited) | OECD Guidelines |
| Transparency | Art. 13-14 | Art. 19 | FTC Act § 5 | CoE Convention Art. 8 |
| Privacy by design | Art. 25 | Art. 7 | | ISO 31700 |
| Impact assessment | Art. 35 | Art. 22 | | CoE Convention Art. 10 |

Concept to explore

See personal-data-protection for a deep dive into how different legal frameworks implement the same core principles.

From principles to architecture

Governance principles translate directly into architectural decisions:

| Principle | Architectural implication |
|-----------|---------------------------|
| Data minimisation | Don’t store what you don’t need. Design schemas around purpose, not convenience. |
| Purpose limitation | Separate data by purpose. A marketing database and an analytics database may need different access controls. |
| Storage limitation | Build retention policies into your data layer, not as an afterthought. |
| Transparency | Design APIs that can explain what data you hold about a person and why. |
| Accountability | Log processing activities. Build audit trails from day one. |
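
To illustrate how two of these rows can meet in code, here is a small sketch that combines storage limitation with accountability: records carry their purpose, each purpose has a retention period, and every deletion is written to an audit log. The `RETENTION` periods, the `Record` class, and the `purge_expired` function are assumptions made for this example, and an in-memory list stands in for a real data layer.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical retention periods per purpose; real values come from your policy.
RETENTION = {
    "login": timedelta(days=365),
    "marketing": timedelta(days=30),
}


@dataclass
class Record:
    subject_id: str
    purpose: str
    collected_at: datetime


def purge_expired(records, now, audit_log):
    """Return the records still within retention and log each deletion."""
    kept = []
    for rec in records:
        if now - rec.collected_at > RETENTION[rec.purpose]:
            # Accountability: note what was deleted, for whom, and when.
            audit_log.append({
                "action": "delete",
                "subject_id": rec.subject_id,
                "purpose": rec.purpose,
                "at": now.isoformat(),
            })
        else:
            kept.append(rec)
    return kept


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
records = [
    Record("u1", "login", datetime(2024, 1, 1, tzinfo=timezone.utc)),
    Record("u1", "marketing", datetime(2024, 1, 1, tzinfo=timezone.utc)),
]
audit_log = []
records = purge_expired(records, now, audit_log)
print([r.purpose for r in records])  # ['login']
print(len(audit_log))                # 1
```

Storing the purpose on every record is the design choice that makes this possible: without it, a retention job has no way to know which rule applies.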

Why do we use it?

Key reasons

1. Legal compliance. Data protection laws exist in most jurisdictions, and their number keeps growing. Building without governance awareness means building liability into your product.

2. Trust. Users share data with applications they trust. A transparent, privacy-respecting design earns that trust. A data breach or misuse destroys it.

3. Better architecture. Governance constraints produce cleaner systems. Data minimisation leads to simpler schemas. Purpose limitation leads to better separation of concerns. Retention policies prevent unbounded storage growth.

4. Future-proofing. Regulations are tightening worldwide. An application built on governance principles today is far less likely to need a costly retrofit when the next law passes.
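
Reason 3 can be made concrete with a small sketch of data minimisation at the point of ingestion: an allow-list names the fields each purpose may store, and everything else is dropped before it reaches the database. The `ALLOWED_FIELDS` mapping, the purposes, and the `minimise` function are hypothetical names chosen for this example.

```python
# Hypothetical allow-list: the only fields each purpose is permitted to store.
ALLOWED_FIELDS = {
    "account": {"email", "display_name"},
    "shipping": {"name", "street", "city", "postcode"},
}


def minimise(payload: dict, purpose: str) -> dict:
    """Keep only the fields the stated purpose needs and drop everything else."""
    allowed = ALLOWED_FIELDS[purpose]
    return {key: value for key, value in payload.items() if key in allowed}


signup_form = {
    "email": "ada@example.com",
    "display_name": "Ada",
    "birthdate": "1990-01-01",    # not needed to create an account
    "phone": "+41 00 000 00 00",  # not needed either
}
stored = minimise(signup_form, "account")
print(sorted(stored))  # ['display_name', 'email']
```

An allow-list is used rather than a block-list so that a newly added form field is dropped by default until someone decides it is needed.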


When do we use it?

  • When designing any feature that collects, stores, or processes user data
  • When aggregating data from external sources (APIs, public datasets, scraped content)
  • When building AI features that generate content or make recommendations
  • When your application acts as an intermediary between users (messaging, sharing, forwarding)
  • When combining datasets that could, together, identify individuals
  • When entering a new market or jurisdiction with different data protection laws
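
The point about combining datasets can be checked mechanically. The sketch below uses the idea behind k-anonymity: group rows by a set of quasi-identifier columns and look at the smallest group. The sample rows and the `smallest_group` function are invented for illustration; a real assessment would also consider what outside data an attacker could join against.

```python
from collections import Counter


def smallest_group(rows, quasi_identifiers):
    """Size of the smallest group of rows sharing the same quasi-identifier values.

    A result of 1 means at least one row is unique on those columns.
    """
    groups = Counter(tuple(row[col] for col in quasi_identifiers) for row in rows)
    return min(groups.values())


rows = [
    {"postcode": "8001", "birth_year": 1990},
    {"postcode": "8001", "birth_year": 1990},
    {"postcode": "8001", "birth_year": 1985},
    {"postcode": "8002", "birth_year": 1990},
    {"postcode": "8002", "birth_year": 1990},
    {"postcode": "8002", "birth_year": 1985},
]
print(smallest_group(rows, ["postcode"]))                # 3
print(smallest_group(rows, ["birth_year"]))              # 2
print(smallest_group(rows, ["postcode", "birth_year"]))  # 1
```

Each column on its own hides every person in a group of at least two, but the two columns together single out individuals. That is the risk when two datasets that each look harmless are joined.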

Rule of thumb

If your feature touches data that a person could care about — their name, their location, their behaviour, their words — governance applies. The question is never “does governance apply?” but “which principles are most relevant here?”


How can I think about it?

The building inspector analogy

Imagine you’re constructing a building. You wouldn’t wait until tenants move in to check if the wiring is safe or if the fire exits work. A building inspector reviews plans before construction begins and checks compliance during the build.

Data governance is your building inspector for software. The principles (fire safety = data protection, structural integrity = data quality, accessibility = transparency) are checked at design time, not after launch. A data protection impact assessment (DPIA) is your pre-construction safety review. Privacy by design is your building code.

The trust contract analogy

Every time a user gives you data, they’re signing an invisible contract: “I trust you with this for a specific reason.” Data governance is the practice of honouring that contract.

  • Purpose limitation = “I gave you my email for login, not for marketing.”
  • Data minimisation = “I told you my city, not my street address — don’t ask for more than you need.”
  • Storage limitation = “When I close my account, delete my data — don’t keep it ‘just in case’.”
  • Transparency = “Tell me what you’re doing with my data in language I can understand.”

Breaking any clause breaks trust. And unlike with a legal contract, users don’t need to sue you to enforce it: they just leave.
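
The purpose-limitation clause of this contract can be sketched in code as a read path that always asks "for what purpose?" and refuses anything the user did not agree to. This is a simplified illustration: the `CONSENTS` and `DATA` dictionaries, the `PurposeError` exception, and the `read` function are hypothetical, and consent is used here as a stand-in even though it is only one of several possible legal bases.

```python
class PurposeError(Exception):
    """Raised when data is requested for a purpose the user did not agree to."""


# Hypothetical consent store: (subject, field) -> purposes the user agreed to.
CONSENTS = {
    ("u1", "email"): {"login"},
    ("u1", "city"): {"login", "weather"},
}
DATA = {("u1", "email"): "ada@example.com", ("u1", "city"): "Zurich"}


def read(subject_id: str, field: str, purpose: str) -> str:
    """Return the value only if the stated purpose was agreed to."""
    if purpose not in CONSENTS.get((subject_id, field), set()):
        raise PurposeError(f"{field} was not collected for {purpose}")
    return DATA[(subject_id, field)]


print(read("u1", "email", "login"))  # ada@example.com
try:
    read("u1", "email", "marketing")
except PurposeError as err:
    print(err)  # email was not collected for marketing
```

Making the purpose a required argument means a caller cannot read personal data without stating why, which also gives an audit trail something meaningful to record.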


Concepts to explore next

| Concept | What it covers | Status |
|---------|----------------|--------|
| personal-data-protection | Legal bases for processing, data subject rights, GDPR/nDSG principles | complete |
| privacy-by-design | Building privacy into architecture from the start | complete |
| data-provenance | Tracking data origin, rights, and licensing | complete |
| re-identification-risk | When “anonymous” data can identify individuals | complete |
| intermediary-liability | When platforms become responsible for user content | complete |
| ai-content-liability | Who is responsible when AI generates harmful or wrong content | complete |
| algorithmic-transparency | Making recommendation systems explainable and fair | complete |
| data-protection-impact-assessment | Formal risk assessment before processing sensitive data | complete |


Where this concept fits

Position in the knowledge graph

graph TD
    A[Privacy and Security] --> B[Data Governance]
    B --> C[Personal Data Protection]
    B --> D[Privacy by Design]
    B --> E[Data Provenance]
    B --> F[Re-identification Risk]
    B --> G[Intermediary Liability]
    B --> H[AI Content Liability]
    B --> I[Algorithmic Transparency]
    B --> J[DPIA]
    style B fill:#4a9ede,color:#fff

Related concepts:

  • software-architecture — governance principles constrain and improve architectural decisions
  • databases — where most governed data physically lives
  • apis — the interfaces through which data enters and leaves your system
  • separation-of-concerns — governance reinforces separating data by purpose


Footnotes

  1. Snowflake. (2026). Data Governance Best Practices & Principles. Snowflake.