Knowledge Granularity

The design question of how finely to break knowledge into pieces so that each piece preserves enough meaning to be useful — for both human learners and machine retrieval systems.


What is it?

Every knowledge system faces the same fundamental question: how big should the pieces be? A textbook chapter is too large to retrieve precisely — if someone asks about one formula, they should not get thirty pages. A single word is too small to mean anything on its own — “velocity” without a sentence around it tells you nothing useful. Somewhere between these extremes lies the right level of decomposition, and finding it is the problem of knowledge granularity.

This question matters in two domains that turn out to be deeply connected. In pedagogy, teachers must decide how to break a curriculum into teachable units. Too fine and students drown in disconnected facts. Too coarse and they face cognitive overload from compound concepts they cannot parse.1 In machine retrieval — particularly Retrieval-Augmented Generation (RAG) — engineers must decide how to split documents into chunks for embedding and search. Too fine and the chunk loses the context that gives it meaning. Too coarse and irrelevant material dilutes the useful content.2

The parallel is not a coincidence. In both cases, the atom of knowledge is the claim — the smallest statement that can stand alone and be evaluated as true or false (see claims-and-propositions). A chunk that contains exactly one claim is precise. A chunk that contains half a claim is meaningless. The challenge is that claims do not come neatly labelled — they are embedded in paragraphs, surrounded by context, dependent on definitions established earlier in the document.3

Knowledge granularity is not about finding the single correct chunk size. It is about understanding the trade-offs at each level of decomposition and choosing deliberately based on the use case.

In plain terms

Knowledge granularity is like slicing bread. Too thick and the slice overwhelms the filling, leaving you tasting only bread. Too thin and the slice crumbles in your hand. The right thickness depends on what you are making: a delicate canapé needs thinner slices than a hearty club sandwich. The bread itself has not changed — only the cut.



How does it work?

1. The too-fine problem: lost context

A sentence ripped from its paragraph can be meaningless or misleading. Consider: “This approach reduced errors by 40%.” Without the surrounding context, you do not know what approach, what errors, what baseline, or what domain. The sentence is a valid claim, but it is not a self-contained claim — it depends on antecedents established in prior sentences.2

In RAG systems, this manifests as retrieval that is technically relevant but practically useless. The system finds a chunk that matches the query, but the chunk lacks enough context for the language model to generate a meaningful answer. The model may then hallucinate the missing context, producing a confident but incorrect response.4

In teaching, the equivalent is drilling isolated facts without connecting them to a framework. Students can recite “mitochondria are the powerhouse of the cell” without understanding what a mitochondrion does, what a cell needs power for, or why this matters.

Think of it like...

Tearing a page out of a novel and handing it to someone. They can read the words, but without the chapters before it, they do not know who the characters are, what they want, or why this scene matters. The page is too fine a grain to carry the story.

2. The too-coarse problem: lost precision

At the other extreme, a chunk that contains an entire chapter will match many queries — but most of the content will be irrelevant to any given question. When a RAG system retrieves a 5,000-word section to answer a question about one specific fact, it burns context window tokens on irrelevant material, drowns the relevant fact in noise, and forces the language model to find a needle in a haystack.2

In teaching, the coarse equivalent is the lecture that tries to cover everything at once. Students leave with a vague sense of the topic but cannot articulate any specific claim. The material was not wrong — it was just too much, undifferentiated.

Think of it like...

Searching a library catalogue and getting back “Science Section, Floor 3” when you asked for the boiling point of water. Technically correct. Practically useless.

3. Chunking strategies for RAG

The engineering field has developed several approaches to the granularity problem, each making different trade-offs:5

Fixed-size chunking splits documents into blocks of a set token count (e.g., 256 or 512 tokens), often with an overlap window so that boundary sentences appear in two adjacent chunks. Simple and predictable, but blind to the document’s natural structure — a chunk boundary might fall mid-sentence or split a paragraph’s claim from its evidence.5
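The mechanics can be sketched in a few lines. This is a toy illustration that approximates tokens by whitespace-split words; a production system would count real tokenizer tokens, so the sizes here are illustrative only.

```python
def fixed_size_chunks(text, chunk_size=256, overlap=32):
    """Split text into fixed-size word blocks with an overlap window."""
    words = text.split()
    chunks = []
    step = chunk_size - overlap  # each new chunk starts `overlap` words early
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reached the end of the text
    return chunks
```

The overlap window is what keeps a sentence that straddles a boundary intact in at least one of the two adjacent chunks — without it, a claim split mid-way would be incomplete in both.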

Recursive chunking starts with large separators (chapter breaks, double newlines) and progressively splits into smaller units (paragraphs, sentences) only when chunks exceed a size threshold. This respects the document’s hierarchy better than fixed-size, making it a practical default for many use cases.5
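A minimal sketch of the recursive idea, assuming character counts as the size measure. Real implementations count tokens and usually merge small adjacent pieces back together, which this sketch omits for clarity.

```python
# Separators ordered from coarsest (paragraph break) to finest (word break).
SEPARATORS = ["\n\n", "\n", ". ", " "]

def recursive_chunks(text, max_len=500, separators=SEPARATORS):
    """Split on the largest separator first; recurse only on oversized pieces."""
    if len(text) <= max_len or not separators:
        return [text]
    sep, rest = separators[0], separators[1:]
    chunks = []
    for piece in text.split(sep):
        if len(piece) <= max_len:
            chunks.append(piece)  # small enough: keep the natural unit whole
        else:
            chunks.extend(recursive_chunks(piece, max_len, rest))
    return chunks
```

Because paragraphs are tried before sentences, a paragraph that fits under the limit is never split — the document's own structure sets the boundaries wherever possible.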

Semantic chunking uses embeddings to detect where the topic shifts. It computes the similarity between consecutive sentences; when similarity drops below a threshold, it places a boundary. This aligns chunks with thematic boundaries rather than arbitrary token counts.6
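The boundary-detection logic looks roughly like this. A real system would use sentence embeddings from a model; here a bag-of-words vector stands in, purely to make the similarity-drop mechanism concrete.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def semantic_chunks(sentences, threshold=0.2):
    """Start a new chunk wherever consecutive-sentence similarity drops."""
    if not sentences:
        return []
    vecs = [Counter(s.lower().split()) for s in sentences]
    chunks, current = [], [sentences[0]]
    for prev, cur, sent in zip(vecs, vecs[1:], sentences[1:]):
        if cosine(prev, cur) < threshold:  # topic shift detected
            chunks.append(" ".join(current))
            current = []
        current.append(sent)
    chunks.append(" ".join(current))
    return chunks
```

The threshold is the tuning knob: raise it and you get more, smaller chunks; lower it and thematically loose passages stay together.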

Agentic chunking uses a language model to determine where boundaries should fall, treating each chunk as a standalone proposition or set of closely related propositions. This is the most expensive approach but produces the most semantically coherent units.6
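In outline, the approach looks like this. The `complete` callable is a hypothetical stand-in for whatever LLM client you use, and the prompt format is illustrative — the point is that the model, not a rule, decides where boundaries fall.

```python
import json

def agentic_chunks(sentences, complete):
    """Ask an LLM (via the caller-supplied `complete`) to group sentences
    into self-contained propositions, returned as lists of indices."""
    prompt = (
        "Group these numbered sentences into self-contained propositions. "
        "Return only a JSON list of lists of sentence indices.\n"
        + "\n".join(f"{i}: {s}" for i, s in enumerate(sentences))
    )
    groups = json.loads(complete(prompt))  # e.g. [[0, 1], [2]]
    return [" ".join(sentences[i] for i in group) for group in groups]
```

Every chunk now costs an LLM call at indexing time, which is why this sits at the expensive end of the trade-off described above.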

Concept to explore

See semantic-chunking for how embedding-based similarity detection creates topic-aligned chunks without manual rules.

Concept to explore

See agentic-chunking for how LLMs can determine chunk boundaries by reasoning about content structure.

4. The research evidence: chunk size affects accuracy

A study from Aalto University tested how chunking granularity affects question-answering accuracy in RAG systems and found that chunk size directly and measurably impacts both retrieval precision and the quality of generated answers.2 This is not a minor implementation detail — it is a primary determinant of system performance.

Complementary research in the medical domain compared four chunking strategies (fixed-size, semantic clustering, proposition-based, and adaptive) and found that adaptive chunking — which dynamically places boundaries based on semantic similarity — achieved 87% accuracy compared to 50% for the fixed-size baseline, without changing the language model at all.4 The chunking strategy alone nearly doubled the system’s accuracy.

5. Metadata enrichment: preserving context at fine granularity

There is a way to have both precision and context: metadata enrichment. Instead of relying solely on the text within a chunk, you attach structured metadata — entity tags, topic labels, source document identifiers, temporal markers — to each chunk. This lets the retrieval system pre-filter by metadata before computing semantic similarity, dramatically reducing false positives.6

For example, a chunk about vitamin D intake can be tagged with {topic: "nutrition", entity: "vitamin D", audience: "adults", source: "NIH guidelines", year: 2024}. Even if the chunk text is small, the metadata preserves the context that would otherwise be lost. The retrieval system can filter for “nutrition + vitamin D + adults” before comparing embeddings, ensuring the right chunk surfaces even at fine granularity.6
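The pre-filtering step is simple to sketch. The field names below follow the vitamin D example above and are illustrative, not a fixed schema; a real vector store would expose this as a filter argument on its query API.

```python
chunks = [
    {"text": "Adults need 600 IU of vitamin D daily.",
     "meta": {"topic": "nutrition", "entity": "vitamin D", "audience": "adults"}},
    {"text": "Infants need 400 IU of vitamin D daily.",
     "meta": {"topic": "nutrition", "entity": "vitamin D", "audience": "infants"}},
    {"text": "Vitamin C supports immune function.",
     "meta": {"topic": "nutrition", "entity": "vitamin C", "audience": "adults"}},
]

def filter_by_metadata(chunks, **required):
    """Keep only chunks whose metadata matches every required key/value,
    so the expensive embedding comparison runs on a smaller candidate set."""
    return [c for c in chunks
            if all(c["meta"].get(k) == v for k, v in required.items())]

candidates = filter_by_metadata(chunks, entity="vitamin D", audience="adults")
```

Only the first chunk survives the filter; semantic similarity is then computed against this subset rather than the whole index, which is where the reduction in false positives comes from.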

This approach mirrors how libraries use catalogue metadata (author, subject, date, call number) alongside the text itself. The metadata is structured data about the content; the chunk is the content. Together they solve the granularity dilemma — fine chunks for precision, rich metadata for context.

Key distinction

Chunk size controls precision (how targeted the retrieval is). Metadata controls context (how much surrounding meaning the system retains). The best systems optimise both, not just one.


Why do we use it?

Key reasons

1. Retrieval accuracy. The chunk size in a RAG system directly determines whether the right information reaches the language model. Wrong granularity means wrong answers, regardless of how good the model is.2

2. Reduced hallucination. When a chunk contains a complete, self-contained claim with sufficient context, the language model has less reason to invent missing details. Fine chunks without context are an invitation to hallucinate.4

3. Effective teaching. Learners absorb material more reliably when it is decomposed into atoms they can master individually. Granularity determines cognitive load — too coarse overwhelms, too fine fragments meaning.1

4. Cost efficiency. In RAG systems, every retrieved chunk consumes context window tokens. Coarse chunks waste tokens on irrelevant content. Right-sized chunks maximise the ratio of useful information to tokens spent.5


When do we use it?

  • When designing a RAG pipeline and choosing how to split source documents for embedding and retrieval
  • When building a knowledge base and deciding the fundamental unit of storage (document, section, paragraph, claim)
  • When planning a curriculum and determining how to sequence material from atoms to compound concepts
  • When debugging retrieval quality in a system that returns technically relevant but practically unhelpful results
  • When choosing between chunking strategies and needing to understand the trade-offs of each approach
  • When evaluating AI system accuracy and suspecting that chunking — not the model — is the bottleneck

Rule of thumb

The right chunk size is the smallest unit that can stand alone as a meaningful, self-contained claim. If a chunk requires the reader (human or machine) to guess what it refers to, it is too small. If a chunk contains multiple unrelated claims, it is too large.


How can I think about it?

The jigsaw puzzle analogy

Imagine cutting a photograph into a jigsaw puzzle. If the pieces are very large (4 pieces for the whole image), each piece shows a lot of context but you cannot isolate a specific detail — asking “where is the red house?” means scanning a quarter of the image. If the pieces are tiny (10,000 pieces), each piece shows a few pixels of colour with no recognisable content — you cannot tell what any individual piece depicts without assembling its neighbours.

  • The photograph = a document or body of knowledge
  • Each puzzle piece = a chunk
  • Large pieces = coarse granularity (lots of context, poor precision)
  • Tiny pieces = fine granularity (high precision potential, but meaningless in isolation)
  • The right piece size = large enough to show a recognisable feature (a tree, a window, a face) but small enough to find quickly
  • The picture on the box = metadata (tells you what the piece depicts even before you see its neighbours)

The recipe card analogy

A chef organises recipes. One extreme: a single card that says “French Cuisine” — useless for finding a specific technique. The other extreme: a separate card for every individual action (“pick up knife,” “hold onion,” “position blade”) — absurd and unusable. The right level: one card per dish or technique, each self-contained enough that the chef can execute it without flipping to other cards.

  • The entire cookbook = the whole document (too coarse to retrieve from)
  • One card per action = sentence-level chunks (too fine to be meaningful)
  • One card per recipe = paragraph or section-level chunks (a complete, self-contained unit)
  • The index at the back = the embedding index that helps you find the right card
  • Tags on each card (cuisine, difficulty, time) = metadata enrichment that enables precise filtering without enlarging the card itself

Concepts to explore next

  • semantic-chunking — Using embedding similarity to detect topic shifts and place chunk boundaries (stub)
  • agentic-chunking — Using LLMs to reason about content structure and determine optimal boundaries (stub)
  • claims-and-propositions — The smallest unit of knowledge that can stand alone; the atom that defines the lower bound of useful granularity (complete)
  • embeddings — The vector representations that power semantic similarity comparison between chunks and queries (complete)
  • rag — The retrieval pattern where chunking granularity directly determines answer quality (complete)

Some cards don't exist yet

A broken link is a placeholder for future learning, not an error.



Where this concept fits

Position in the knowledge graph

graph TD
    KE[Knowledge Engineering] --> KGran[Knowledge Granularity]
    KE --> KGraph[Knowledge Graphs]
    KE --> MRF[Machine-Readable Formats]
    KGran --> SC[Semantic Chunking]
    KGran --> AC[Agentic Chunking]
    style KGran fill:#4a9ede,color:#fff

Related concepts:

  • semantic-triples — the atomic unit of a knowledge graph, representing the finest meaningful granularity for structured knowledge
  • structured-data-vs-prose — granularity decisions differ depending on whether content is structured (already atomic) or prose (requires decomposition)
  • knowledge-graphs — graph structures that operate at triple-level granularity, complementing the paragraph-level granularity of RAG retrieval
  • hallucination — a direct consequence of poor granularity: chunks too small to carry context invite the model to fabricate the missing details

Footnotes

  1. Didau, D. (2023). Decomposition for Dummies: How to Break Knowledge into Teachable Pieces. David Didau’s Substack.

  2. Ruotsalainen, J. (2025). Impact of Chunking Granularity on Accuracy and Token Consumption in Retrieval-Augmented Generation for Question-Answering. Aalto University.

  3. Boulton, K. (2023). What Is Atomisation?. Unstoppable Learning.

  4. Li, Y. et al. (2025). Adaptive Chunking Strategies for RAG in Clinical Decision Support. PubMed Central.

  5. Glukhov, A. (2025). Chunking Strategies in RAG. glukhov.org.

  6. Shaik, M. (2024). Beyond Fixed Chunks: How Semantic Chunking and Metadata Enrichment Transform RAG Accuracy. Medium.