Data Governance

The set of principles, questions, and design decisions that determine how data is collected, stored, processed, shared, and deleted — and who is responsible when things go wrong.

What is it?

Data governance is the discipline of making deliberate, informed decisions about data before you write a line of code. It sits at the intersection of law, ethics, and architecture. Every application that touches data — which is every application — makes governance decisions, whether consciously or not.

Most developers encounter governance as a list of regulations to comply with: GDPR, FADP, CCPA. But regulations are just the formalised expression of deeper principles. The principles came first. Understanding them means you can navigate any jurisdiction, including ones whose laws haven’t been written yet.

The core insight is this: data is not neutral. Every piece of data has an origin, a subject, a purpose, and a lifecycle. Governance is the practice of respecting all four. When you store a user’s email, you’re not just writing a string to a database — you’re entering a relationship with obligations attached.¹

For a developer building applications, data governance answers one meta-question: “Should I be doing this with this data, and if so, how?”

In plain terms

Data governance is like building codes for data. Just as a building code ensures a house is safe, accessible, and up to standard before anyone moves in, data governance ensures your application handles data safely, transparently, and lawfully before any user trusts it with their information.

At a glance

The data governance mental model
graph TD
    A[Data Governance] --> B[Personal Data Protection]
    A --> C[Privacy by Design]
    A --> D[Data Provenance]
    A --> E[Re-identification Risk]
    A --> F[Intermediary Liability]
    A --> G[AI Content Liability]
    A --> H[Algorithmic Transparency]
    A --> I[DPIA]
    style A fill:#4a9ede,color:#fff
Key: Each child node represents a principle or practice that informs architectural decisions. Together they form a mental checklist for any data-touching feature.

How does it work?

Data governance operates through a set of interlocking principles. Each principle asks a different question about your relationship with data.

The five foundational questions

Before building any feature that touches data, a developer should be able to answer:

#	Question	Principle
1	Can I collect this?	personal-data-protection — Is there a legal basis?
2	How little can I collect?	privacy-by-design — Minimise by default
3	Where does it come from?	data-provenance — Source, licence, rights chain
4	Could it identify someone?	re-identification-risk — Even “anonymous” data can expose people
5	What happens when it goes wrong?	intermediary-liability, ai-content-liability — Who is responsible?
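One way to make the five questions concrete is to record the answers next to the feature and refuse to call the review done while any are blank. The sketch below is only an illustration: the `GovernanceReview` class and its field names are hypothetical and not part of any standard or library.

```python
from dataclasses import dataclass, fields


@dataclass
class GovernanceReview:
    """Written answers to the five foundational questions (hypothetical structure)."""
    legal_basis: str = ""        # 1. Can I collect this?
    minimisation: str = ""       # 2. How little can I collect?
    provenance: str = ""         # 3. Where does it come from?
    reidentification: str = ""   # 4. Could it identify someone?
    liability: str = ""          # 5. What happens when it goes wrong?

    def unanswered(self) -> list[str]:
        """Return the names of questions that still have no written answer."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


review = GovernanceReview(
    legal_basis="Legitimate interest, documented in the design note",
    provenance="Official public directory, terms of use checked",
)
print(review.unanswered())  # ['minimisation', 'reidentification', 'liability']
```

An empty list from `unanswered()` does not mean the answers are good, only that each question was at least considered.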

Think of it like a supply chain

Just as a restaurant must know where its ingredients come from (provenance), store them at the right temperature (protection), serve safe portions (minimisation), and take responsibility if someone gets sick (liability) — a developer must apply the same rigour to data.

The regulatory landscape

These principles are codified differently across jurisdictions, but the underlying logic is remarkably consistent:

Principle	EU (GDPR)	Switzerland (nDSG/FADP)	USA	International
Legal basis for processing	Art. 6	Art. 6, 31	Varies by state	CoE Convention Art. 5
Data minimisation	Art. 5(1)(c)	Art. 6	CCPA (limited)	OECD Guidelines
Transparency	Art. 13-14	Art. 19	FTC Act § 5	CoE Convention Art. 8
Privacy by design	Art. 25	Art. 7	-	ISO 31700
Impact assessment	Art. 35	Art. 22	-	CoE Convention Art. 10

Concept to explore

See personal-data-protection for a deep dive into how different legal frameworks implement the same core principles.

From principles to architecture

Governance principles translate directly into architectural decisions:

Principle	Architectural implication
Data minimisation	Don’t store what you don’t need. Design schemas around purpose, not convenience.
Purpose limitation	Separate data by purpose. A marketing database and an analytics database may need different access controls.
Storage limitation	Build retention policies into your data layer, not as an afterthought.
Transparency	Design APIs that can explain what data you hold about a person and why.
Accountability	Log processing activities. Build audit trails from day one.
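As a rough illustration of storage limitation and accountability living in the data layer, the sketch below tags each record with its purpose, purges records that have outlived a per-purpose retention window, and writes every deletion to an audit log. The retention periods, the `Record` type, and the `purge_expired` function are all assumptions made for this example; real periods depend on your legal basis and jurisdiction.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical retention periods per purpose (illustrative values only).
RETENTION = {
    "login": timedelta(days=365),
    "marketing": timedelta(days=90),
}


@dataclass
class Record:
    subject_id: str
    purpose: str             # why the data was collected
    collected_at: datetime


def purge_expired(records, now, audit_log):
    """Keep records inside their retention window and log every deletion.

    A record with an undeclared purpose raises KeyError on purpose:
    data without a stated purpose should not be in the store at all.
    """
    kept = []
    for r in records:
        if now - r.collected_at > RETENTION[r.purpose]:
            audit_log.append((now.isoformat(), "deleted", r.subject_id, r.purpose))
        else:
            kept.append(r)
    return kept


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
records = [
    Record("u1", "login", now - timedelta(days=30)),
    Record("u2", "marketing", now - timedelta(days=120)),
]
audit_log = []
records = purge_expired(records, now, audit_log)
print([r.subject_id for r in records])  # ['u1']
print(len(audit_log))                   # 1
```

Keying retention on purpose rather than on table or column keeps the policy readable by non-engineers, which helps when someone asks why a given record still exists.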

For example: a contact directory feature

If you’re building a feature that aggregates public officials’ contact information:

1. Can I collect this? Public officials’ contact data is usually published, but processing it still needs a legal basis (for example legitimate interest or public interest).

2. How little? Store official contact details, not personal ones. Prefer office emails over private emails.

3. Where from? Document the exact source (parliament.ch, admin.ch) and its terms of use.

4. Could it identify? Individual contact records are inherently identifying, so plan for deletion requests.

5. What if it’s wrong? Display “last verified” dates. Implement correction workflows.
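The answers above can be carried straight into the record type. The following is a minimal sketch, assuming a simple in-memory directory keyed by name; the `OfficialContact` class, the helper functions, and the sample values are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class OfficialContact:
    name: str
    office_email: str     # official address only, never a private one
    source: str           # where the record was obtained (provenance)
    last_verified: date   # shown to users so stale data is visible


def display_line(c: OfficialContact) -> str:
    """Render a contact together with its verification date and source."""
    return (
        f"{c.name} <{c.office_email}> "
        f"(last verified {c.last_verified.isoformat()}, source: {c.source})"
    )


def handle_deletion_request(directory: dict[str, OfficialContact], name: str) -> bool:
    """Remove a record on request; return True if something was deleted."""
    return directory.pop(name, None) is not None


contact = OfficialContact(
    name="Example Official",
    office_email="example.official@office.example",
    source="parliament.ch",
    last_verified=date(2025, 1, 15),
)
directory = {contact.name: contact}
print(display_line(contact))
```

Note what the type leaves out: there is no field for a private email or home address, so minimisation is enforced by the schema rather than by convention.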

Why do we use it?

Key reasons

1. Legal compliance. Data protection laws now exist in most jurisdictions. Building without governance awareness means building liability into your product.

2. Trust. Users share data with applications they trust. A transparent, privacy-respecting design earns that trust. A data breach or misuse destroys it.

3. Better architecture. Governance constraints produce cleaner systems. Data minimisation leads to simpler schemas. Purpose limitation leads to better separation of concerns. Retention policies prevent unbounded storage growth.

4. Future-proofing. Regulations are tightening in many jurisdictions. An application built on governance principles today is less likely to need a costly retrofit when the next law passes.

When do we use it?

When designing any feature that collects, stores, or processes user data
When aggregating data from external sources (APIs, public datasets, scraped content)
When building AI features that generate content or make recommendations
When your application acts as an intermediary between users (messaging, sharing, forwarding)
When combining datasets that could, together, identify individuals
When entering a new market or jurisdiction with different data protection laws
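For the case of combined datasets, one common way to reason about identifiability is k-anonymity, a technique this page does not name and which is introduced here only as an illustration. The sketch counts how many rows share each combination of quasi-identifiers. A smallest group of size 1 means at least one person is unique on those fields. The column names and sample rows are invented.

```python
from collections import Counter


def smallest_group(rows, quasi_identifiers):
    """Size of the smallest group of rows sharing the same quasi-identifier values.

    Raises ValueError on an empty dataset, since there is nothing to assess.
    """
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values())


rows = [
    {"postcode": "8001", "birth_year": 1980, "topic": "housing"},
    {"postcode": "8001", "birth_year": 1980, "topic": "transport"},
    {"postcode": "8002", "birth_year": 1975, "topic": "housing"},
]

k = smallest_group(rows, ["postcode", "birth_year"])
print(k)  # 1: the third row is unique on (postcode, birth_year)
```

A check like this is a starting point rather than a guarantee: fields that look harmless on their own can become quasi-identifiers once another dataset is joined in.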

Rule of thumb

If your feature touches data that a person could care about — their name, their location, their behaviour, their words — governance applies. The question is never “does governance apply?” but “which principles are most relevant here?”

How can I think about it?

The building inspector analogy

Imagine you’re constructing a building. You wouldn’t wait until tenants move in to check if the wiring is safe or if the fire exits work. A building inspector reviews plans before construction begins and checks compliance during the build.

Data governance is your building inspector for software. The principles (fire safety = data protection, structural integrity = data quality, accessibility = transparency) are checked at design time, not after launch. A DPIA is your pre-construction safety review. Privacy by design is your building code.

The trust contract analogy

Every time a user gives you data, they’re signing an invisible contract: “I trust you with this for a specific reason.” Data governance is the practice of honouring that contract.

Purpose limitation = “I gave you my email for login, not for marketing.”

Data minimisation = “I told you my city, not my street address — don’t ask for more than you need.”

Storage limitation = “When I close my account, delete my data — don’t keep it ‘just in case’.”

Transparency = “Tell me what you’re doing with my data in language I can understand.”

Breaking any clause breaks trust. And unlike a legal contract, users don’t need to sue you — they just leave.

Concepts to explore next

Concept	What it covers	Status
personal-data-protection	Legal bases for processing, data subject rights, GDPR/nDSG principles	complete
privacy-by-design	Building privacy into architecture from the start	complete
data-provenance	Tracking data origin, rights, and licensing	complete
re-identification-risk	When “anonymous” data can identify individuals	complete
intermediary-liability	When platforms become responsible for user content	complete
ai-content-liability	Who is responsible when AI generates harmful or wrong content	complete
algorithmic-transparency	Making recommendation systems explainable and fair	complete
data-protection-impact-assessment	Formal risk assessment before processing sensitive data	complete

Some cards don't exist yet

A broken link is a placeholder for future learning, not an error.

Check your understanding

Test yourself

Explain — Why is data governance relevant to a solo developer building a small web application, not just to large corporations?

Name — List the five foundational questions a developer should ask before building a data-touching feature.

Distinguish — What is the difference between data protection (a legal requirement) and data governance (a design discipline)?

Interpret — You’re building a feature that shows “12 people near you are interested in this topic.” Which governance principles are most relevant, and why?

Connect — How does the principle of data minimisation relate to the architectural concept of separation of concerns?

Where this concept fits

Position in the knowledge graph
graph TD
    A[Privacy and Security] --> B[Data Governance]
    B --> C[Personal Data Protection]
    B --> D[Privacy by Design]
    B --> E[Data Provenance]
    B --> F[Re-identification Risk]
    B --> G[Intermediary Liability]
    B --> H[AI Content Liability]
    B --> I[Algorithmic Transparency]
    B --> J[DPIA]
    style B fill:#4a9ede,color:#fff
Related concepts:

software-architecture — governance principles constrain and improve architectural decisions

databases — where most governed data physically lives

apis — the interfaces through which data enters and leaves your system

separation-of-concerns — governance reinforces separating data by purpose
