Data Governance
The set of principles, questions, and design decisions that determine how data is collected, stored, processed, shared, and deleted — and who is responsible when things go wrong.
What is it?
Data governance is the discipline of making deliberate, informed decisions about data before you write a line of code. It sits at the intersection of law, ethics, and architecture. Every application that touches data — which is every application — makes governance decisions, whether consciously or not.
Most developers encounter governance as a list of regulations to comply with: the EU's GDPR, Switzerland's FADP (known in German as the nDSG), California's CCPA. But regulations are just the formalised expression of deeper principles. The principles came first. Understanding them means you can navigate any jurisdiction, including ones whose laws haven’t been written yet.
The core insight is this: data is not neutral. Every piece of data has an origin, a subject, a purpose, and a lifecycle. Governance is the practice of respecting all four. When you store a user’s email, you’re not just writing a string to a database; you’re entering a relationship with obligations attached. [1]
For a developer building applications, data governance answers one meta-question: “Should I be doing this with this data, and if so, how?”
In plain terms
Data governance is like building codes for data. Just as a building code ensures a house is safe, accessible, and up to standard before anyone moves in, data governance ensures your application handles data safely, transparently, and lawfully before any user trusts it with their information.
At a glance
The data governance mental model

    graph TD
        A[Data Governance] --> B[Personal Data Protection]
        A --> C[Privacy by Design]
        A --> D[Data Provenance]
        A --> E[Re-identification Risk]
        A --> F[Intermediary Liability]
        A --> G[AI Content Liability]
        A --> H[Algorithmic Transparency]
        A --> I[DPIA]
        style A fill:#4a9ede,color:#fff

Key: Each child represents a principle that informs architectural decisions. Together they form a mental checklist for any data-touching feature.
How does it work?
Data governance operates through a set of interlocking principles. Each principle asks a different question about your relationship with data.
The five foundational questions
Before building any feature that touches data, a developer should be able to answer:
| # | Question | Principle |
|---|---|---|
| 1 | Can I collect this? | personal-data-protection — Is there a legal basis? |
| 2 | How little can I collect? | privacy-by-design — Minimise by default |
| 3 | Where does it come from? | data-provenance — Source, licence, rights chain |
| 4 | Could it identify someone? | re-identification-risk — Even “anonymous” data can expose people |
| 5 | What happens when it goes wrong? | intermediary-liability, ai-content-liability — Who is responsible? |
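One way to make the five questions hard to skip is to record the answers next to the feature they belong to. The sketch below is a minimal illustration, not a prescribed format: the `GovernanceReview` class and the short keys in `QUESTIONS` are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field

# The five foundational questions from the table above, keyed by short
# identifiers that are assumptions of this sketch.
QUESTIONS = {
    "legal_basis": "Can I collect this?",
    "minimisation": "How little can I collect?",
    "provenance": "Where does it come from?",
    "identifiability": "Could it identify someone?",
    "liability": "What happens when it goes wrong?",
}


@dataclass
class GovernanceReview:
    """Design-time answers for one feature (hypothetical structure)."""

    feature: str
    answers: dict = field(default_factory=dict)

    def answer(self, key: str, text: str) -> None:
        """Record an answer; reject keys that are not one of the five questions."""
        if key not in QUESTIONS:
            raise KeyError(f"unknown question: {key}")
        self.answers[key] = text.strip()

    def unanswered(self) -> list:
        """Return the questions that still have no non-empty answer."""
        return [q for k, q in QUESTIONS.items() if not self.answers.get(k)]

    def ready(self) -> bool:
        """True only when every question has been answered."""
        return not self.unanswered()
```

A review like this could live in the repository beside the feature's design notes, so that an empty answer is visible in code review rather than discovered after launch.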
Think of it like a supply chain
Just as a restaurant must know where its ingredients come from (provenance), store them at the right temperature (protection), serve safe portions (minimisation), and take responsibility if someone gets sick (liability) — a developer must apply the same rigour to data.
The regulatory landscape
These principles are codified differently across jurisdictions, but the underlying logic is remarkably consistent:
| Principle | EU (GDPR) | Switzerland (nDSG) | USA | International |
|---|---|---|---|---|
| Legal basis for processing | Art. 6 | Art. 6, 31 | Varies by state | CoE Convention Art. 5 |
| Data minimisation | Art. 5(1)(c) | Art. 6 | CCPA (limited) | OECD Guidelines |
| Transparency | Art. 13-14 | Art. 19 | FTC Act § 5 | CoE Convention Art. 8 |
| Privacy by design | Art. 25 | Art. 7 | — | ISO 31700 |
| Impact assessment | Art. 35 | Art. 22 | — | CoE Convention Art. 10 |
Concept to explore
See personal-data-protection for a deep dive into how different legal frameworks implement the same core principles.
From principles to architecture
Governance principles translate directly into architectural decisions:
| Principle | Architectural implication |
|---|---|
| Data minimisation | Don’t store what you don’t need. Design schemas around purpose, not convenience. |
| Purpose limitation | Separate data by purpose. A marketing database and an analytics database may need different access controls. |
| Storage limitation | Build retention policies into your data layer, not as an afterthought. |
| Transparency | Design APIs that can explain what data you hold about a person and why. |
| Accountability | Log processing activities. Build audit trails from day one. |
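As an illustration of the storage-limitation and accountability rows, the following sketch attaches a purpose to every record and purges anything past its retention window, logging each deletion. All names here (`RETENTION`, `Record`, `purge_expired`) and the retention periods are assumptions made for the example; real periods depend on your legal basis and jurisdiction.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical retention periods per processing purpose.
RETENTION = {
    "login": timedelta(days=365),
    "support_ticket": timedelta(days=90),
}


@dataclass
class Record:
    subject_id: str
    purpose: str            # why the data was collected
    collected_at: datetime
    payload: dict


def purge_expired(records, now, audit_log):
    """Return the records that are still inside their retention window.

    Every deletion is appended to audit_log so the purge itself leaves a trail.
    Records whose purpose has no retention rule are treated as expired.
    """
    kept = []
    for rec in records:
        limit = RETENTION.get(rec.purpose)
        if limit is not None and now - rec.collected_at <= limit:
            kept.append(rec)
        else:
            audit_log.append(
                {
                    "action": "delete",
                    "subject_id": rec.subject_id,
                    "purpose": rec.purpose,
                    "at": now.isoformat(),
                }
            )
    return kept
```

Treating an unknown purpose as expired is a deliberate choice in this sketch: data that nobody can justify keeping is removed by default rather than kept by default.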
For example: a contact directory feature
If you’re building a feature that aggregates public officials’ contact information:
- Can I collect this? — Public officials’ data is publicly available, but processing still needs a legal basis (legitimate interest, public interest).
- How little? — Store official contact details, not personal ones. Prefer office emails over private emails.
- Where from? — Document the exact source (parliament.ch, admin.ch) and its terms of use.
- Could it identify? — Individual contact records are inherently identifying — handle deletion requests.
- What if it’s wrong? — Display “last verified” dates. Implement correction workflows.
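The checklist above could translate into a data model along these lines. This is a rough sketch under stated assumptions: `OfficialContact`, `ContactDirectory`, and their fields are hypothetical names, and an in-memory dictionary stands in for whatever database the feature would really use.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class OfficialContact:
    name: str
    office_email: str     # official address only; there is no private-email field
    source_url: str       # where the record was obtained
    source_terms: str     # note on the source's terms of use
    last_verified: date   # shown to users as the "last verified" date


class ContactDirectory:
    def __init__(self):
        self._contacts = {}   # keyed by a hypothetical contact_id
        self.audit_log = []   # minimal trail of corrections and deletions

    def add(self, contact_id, contact):
        self._contacts[contact_id] = contact

    def get(self, contact_id):
        return self._contacts.get(contact_id)

    def correct_email(self, contact_id, new_email, verified_on):
        """Correction workflow: update the value and reset the verified date."""
        contact = self._contacts[contact_id]
        contact.office_email = new_email
        contact.last_verified = verified_on
        self.audit_log.append(("correct", contact_id, verified_on.isoformat()))

    def delete(self, contact_id, requested_on):
        """Deletion request: remove the record and log that it was removed."""
        removed = self._contacts.pop(contact_id, None) is not None
        if removed:
            self.audit_log.append(("delete", contact_id, requested_on.isoformat()))
        return removed
```

Leaving private contact details out of the schema entirely means minimisation is enforced by the data model rather than by convention.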
Why do we use it?
Key reasons
1. Legal compliance. Data protection laws now exist in most jurisdictions. Building without governance awareness means building liability into your product.
2. Trust. Users share data with applications they trust. A transparent, privacy-respecting design earns that trust. A data breach or misuse destroys it.
3. Better architecture. Governance constraints produce cleaner systems. Data minimisation leads to simpler schemas. Purpose limitation leads to better separation of concerns. Retention policies prevent unbounded storage growth.
4. Future-proofing. Regulations are tightening in many jurisdictions. An application built on governance principles today is less likely to need a costly retrofit when the next law passes.
When do we use it?
- When designing any feature that collects, stores, or processes user data
- When aggregating data from external sources (APIs, public datasets, scraped content)
- When building AI features that generate content or make recommendations
- When your application acts as an intermediary between users (messaging, sharing, forwarding)
- When combining datasets that could, together, identify individuals
- When entering a new market or jurisdiction with different data protection laws
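The "combining datasets" case can be made concrete with a k-anonymity check, a common (though not sufficient) measure of re-identification risk that this document does not itself prescribe. The sketch below counts how many rows share each combination of quasi-identifier values; the function names and the example columns are assumptions for illustration.

```python
from collections import Counter


def _group_sizes(rows, quasi_identifiers):
    """Count rows per combination of quasi-identifier values."""
    return Counter(tuple(row[c] for c in quasi_identifiers) for row in rows)


def smallest_group(rows, quasi_identifiers):
    """Size of the smallest group sharing the same quasi-identifier values.

    A result of k means the rows are k-anonymous with respect to those
    columns. Returns 0 for an empty dataset.
    """
    if not rows:
        return 0
    return min(_group_sizes(rows, quasi_identifiers).values())


def risky_rows(rows, quasi_identifiers, k):
    """Rows whose quasi-identifier combination is shared by fewer than k rows."""
    sizes = _group_sizes(rows, quasi_identifiers)
    return [
        row for row in rows
        if sizes[tuple(row[c] for c in quasi_identifiers)] < k
    ]


# Example data: each column alone looks harmless, the combination does not.
rows = [
    {"postcode": "8001", "birth_year": 1980, "topic": "housing"},
    {"postcode": "8001", "birth_year": 1980, "topic": "transport"},
    {"postcode": "8001", "birth_year": 1975, "topic": "housing"},
    {"postcode": "3000", "birth_year": 1980, "topic": "energy"},
    {"postcode": "3000", "birth_year": 1980, "topic": "housing"},
]
```

In this example every postcode is shared by at least two rows, but once birth year is added one person stands alone, which is how joining two unremarkable datasets can single someone out.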
Rule of thumb
If your feature touches data that a person could care about — their name, their location, their behaviour, their words — governance applies. The question is never “does governance apply?” but “which principles are most relevant here?”
How can I think about it?
The building inspector analogy
Imagine you’re constructing a building. You wouldn’t wait until tenants move in to check if the wiring is safe or if the fire exits work. A building inspector reviews plans before construction begins and checks compliance during the build.
Data governance is your building inspector for software. The principles (fire safety = data protection, structural integrity = data quality, accessibility = transparency) are checked at design time, not after launch. A DPIA is your pre-construction safety review. Privacy by design is your building code.
The trust contract analogy
Every time a user gives you data, they’re signing an invisible contract: “I trust you with this for a specific reason.” Data governance is the practice of honouring that contract.
- Purpose limitation = “I gave you my email for login, not for marketing.”
- Data minimisation = “I told you my city, not my street address — don’t ask for more than you need.”
- Storage limitation = “When I close my account, delete my data — don’t keep it ‘just in case’.”
- Transparency = “Tell me what you’re doing with my data in language I can understand.”
Breaking any clause breaks trust. And unlike a legal contract, users don’t need to sue you — they just leave.
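The clauses of this "contact" can be enforced in code rather than left to policy documents. The sketch below, with hypothetical names (`PurposeBoundStore`, `PurposeError`), stores each value together with the purposes it was collected for and refuses reads for any other purpose. It is a minimal in-memory illustration, not a production access-control layer.

```python
class PurposeError(PermissionError):
    """Raised when data is requested for a purpose it was not collected for."""


class PurposeBoundStore:
    def __init__(self):
        self._values = {}     # (subject_id, field) -> value
        self._purposes = {}   # (subject_id, field) -> set of allowed purposes

    def put(self, subject_id, field, value, purposes):
        """Store a value together with the purposes the user agreed to."""
        key = (subject_id, field)
        self._values[key] = value
        self._purposes[key] = set(purposes)

    def read(self, subject_id, field, purpose):
        """Return the value only if it was collected for this purpose."""
        key = (subject_id, field)
        if purpose not in self._purposes.get(key, set()):
            raise PurposeError(f"{field!r} was not collected for {purpose!r}")
        return self._values[key]

    def erase_subject(self, subject_id):
        """Account closure: drop every field held for this subject."""
        for key in [k for k in self._values if k[0] == subject_id]:
            del self._values[key]
            del self._purposes[key]
```

With this shape, an email stored for `"login"` cannot be read by a caller that declares `"marketing"` as its purpose, and closing an account removes the data instead of keeping it "just in case".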
Concepts to explore next
| Concept | What it covers | Status |
|---|---|---|
| personal-data-protection | Legal bases for processing, data subject rights, GDPR/nDSG principles | complete |
| privacy-by-design | Building privacy into architecture from the start | complete |
| data-provenance | Tracking data origin, rights, and licensing | complete |
| re-identification-risk | When “anonymous” data can identify individuals | complete |
| intermediary-liability | When platforms become responsible for user content | complete |
| ai-content-liability | Who is responsible when AI generates harmful or wrong content | complete |
| algorithmic-transparency | Making recommendation systems explainable and fair | complete |
| data-protection-impact-assessment | Formal risk assessment before processing sensitive data | complete |
Some cards don't exist yet
A broken link is a placeholder for future learning, not an error.
Check your understanding
Test yourself
- Explain — Why is data governance relevant to a solo developer building a small web application, not just to large corporations?
- Name — List the five foundational questions a developer should ask before building a data-touching feature.
- Distinguish — What is the difference between data protection (a legal requirement) and data governance (a design discipline)?
- Interpret — You’re building a feature that shows “12 people near you are interested in this topic.” Which governance principles are most relevant, and why?
- Connect — How does the principle of data minimisation relate to the architectural concept of separation of concerns?
Where this concept fits
Position in the knowledge graph
    graph TD
        A[Privacy and Security] --> B[Data Governance]
        B --> C[Personal Data Protection]
        B --> D[Privacy by Design]
        B --> E[Data Provenance]
        B --> F[Re-identification Risk]
        B --> G[Intermediary Liability]
        B --> H[AI Content Liability]
        B --> I[Algorithmic Transparency]
        B --> J[DPIA]
        style B fill:#4a9ede,color:#fff

Related concepts:
- software-architecture — governance principles constrain and improve architectural decisions
- databases — where most governed data physically lives
- apis — the interfaces through which data enters and leaves your system
- separation-of-concerns — governance reinforces separating data by purpose
Sources
Further reading
Resources
- Data Governance Best Practices & Principles — Snowflake’s overview of governance fundamentals and how they apply at scale
- Data Governance: The Key to Smarter Software Development — Why governance is the missing link in modern dev workflows
- A Complete Guide to Data Governance Principles in 2026 — Comprehensive reference covering all major governance principles
- 6 Data Governance Principles You Need to Know — Practical breakdown with implementation focus
Footnotes
1. Snowflake. (2026). Data Governance Best Practices & Principles. Snowflake.
