Knowledge Granularity

The design question of how finely to break knowledge into pieces so that each piece preserves enough meaning to be useful — for both human learners and machine retrieval systems.


What is it?

Every knowledge system faces the same fundamental question: how big should the pieces be? A textbook chapter is too large to retrieve precisely — if someone asks about one formula, they should not get thirty pages. A single word is too small to mean anything on its own — “velocity” without a sentence around it tells you nothing useful. Somewhere between these extremes lies the right level of decomposition, and finding it is the problem of knowledge granularity.

This question matters in two domains that turn out to be deeply connected. In pedagogy, teachers must decide how to break a curriculum into teachable units. Too fine and students drown in disconnected facts. Too coarse and they face cognitive overload from compound concepts they cannot parse.1 In machine retrieval — particularly Retrieval-Augmented Generation (RAG) — engineers must decide how to split documents into chunks for embedding and search. Too fine and the chunk loses the context that gives it meaning. Too coarse and irrelevant material dilutes the useful content.2

The parallel is not a coincidence. In both cases, the atom of knowledge is the claim — the smallest statement that can stand alone and be evaluated as true or false (see claims-and-propositions). A chunk that contains exactly one claim is precise. A chunk that contains half a claim is meaningless. The challenge is that claims do not come neatly labelled — they are embedded in paragraphs, surrounded by context, dependent on definitions established earlier in the document.3

Knowledge granularity is not about finding the single correct chunk size. It is about understanding the trade-offs at each level of decomposition and choosing deliberately based on the use case.

In plain terms

Knowledge granularity is like slicing bread. Too thick and the slice overwhelms the filling, leaving you tasting only bread. Too thin and the slice crumbles in your hand. The right thickness depends on what you are making: a delicate canapé needs thinner slices than a hearty club sandwich. The bread itself has not changed — only the cut.



How does it work?

1. The too-fine problem: lost context

A sentence ripped from its paragraph can be meaningless or misleading. Consider: “This approach reduced errors by 40%.” Without the surrounding context, you do not know what approach, what errors, what baseline, or what domain. The sentence is a valid claim, but it is not a self-contained claim — it depends on antecedents established in prior sentences.2

In RAG systems, this manifests as retrieval that is technically relevant but practically useless. The system finds a chunk that matches the query, but the chunk lacks enough context for the language model to generate a meaningful answer. The model may then hallucinate the missing context, producing a confident but incorrect response.4

In teaching, the equivalent is drilling isolated facts without connecting them to a framework. Students can recite “mitochondria are the powerhouse of the cell” without understanding what a mitochondrion does, what a cell needs power for, or why this matters.

Think of it like...

Tearing a page out of a novel and handing it to someone. They can read the words, but without the chapters before it, they do not know who the characters are, what they want, or why this scene matters. The page is too fine a grain to carry the story.

2. The too-coarse problem: lost precision

At the other extreme, a chunk that contains an entire chapter will match many queries — but most of the content will be irrelevant to any given question. When a RAG system retrieves a 5,000-word section to answer a question about one specific fact, it burns context window tokens on irrelevant material, drowns the relevant fact in noise, and forces the language model to find a needle in a haystack.2

In teaching, the coarse equivalent is the lecture that tries to cover everything at once. Students leave with a vague sense of the topic but cannot articulate any specific claim. The material was not wrong — it was just too much, undifferentiated.

Think of it like...

Searching a library catalogue and getting back “Science Section, Floor 3” when you asked for the boiling point of water. Technically correct. Practically useless.

3. Chunking strategies for RAG

The engineering field has developed several approaches to the granularity problem, each making different trade-offs:5

Fixed-size chunking splits documents into blocks of a set token count (e.g., 256 or 512 tokens), often with an overlap window so that boundary sentences appear in two adjacent chunks. Simple and predictable, but blind to the document’s natural structure — a chunk boundary might fall mid-sentence or split a paragraph’s claim from its evidence.5
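The mechanics can be sketched in a few lines. This is a toy illustration that approximates tokens by whitespace-split words; a production system would count real tokenizer tokens, so the sizes here are illustrative only.

```python
def fixed_size_chunks(text, chunk_size=256, overlap=32):
    """Split text into fixed-size word blocks with an overlap window."""
    words = text.split()
    chunks = []
    step = chunk_size - overlap  # each new chunk starts `overlap` words early
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reached the end of the text
    return chunks
```

The overlap window is what keeps a sentence that straddles a boundary intact in at least one of the two adjacent chunks — without it, a claim split mid-way would be incomplete in both.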

Recursive chunking starts with large separators (chapter breaks, double newlines) and progressively splits into smaller units (paragraphs, sentences) only when chunks exceed a size threshold. This respects the document’s hierarchy better than fixed-size, making it a practical default for many use cases.5
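A minimal sketch of the recursive idea, assuming character counts as the size measure. Real implementations count tokens and usually merge small adjacent pieces back together, which this sketch omits for clarity.

```python
# Separators ordered from coarsest (paragraph break) to finest (word break).
SEPARATORS = ["\n\n", "\n", ". ", " "]

def recursive_chunks(text, max_len=500, separators=SEPARATORS):
    """Split on the largest separator first; recurse only on oversized pieces."""
    if len(text) <= max_len or not separators:
        return [text]
    sep, rest = separators[0], separators[1:]
    chunks = []
    for piece in text.split(sep):
        if len(piece) <= max_len:
            chunks.append(piece)  # small enough: keep the natural unit whole
        else:
            chunks.extend(recursive_chunks(piece, max_len, rest))
    return chunks
```

Because paragraphs are tried before sentences, a paragraph that fits under the limit is never split — the document's own structure sets the boundaries wherever possible.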

Semantic chunking uses embeddings to detect where the topic shifts. It computes the similarity between consecutive sentences; when similarity drops below a threshold, it places a boundary. This aligns chunks with thematic boundaries rather than arbitrary token counts.6
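The boundary-detection logic looks roughly like this. A real system would use sentence embeddings from a model; here a bag-of-words vector stands in, purely to make the similarity-drop mechanism concrete.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def semantic_chunks(sentences, threshold=0.2):
    """Start a new chunk wherever consecutive-sentence similarity drops."""
    if not sentences:
        return []
    vecs = [Counter(s.lower().split()) for s in sentences]
    chunks, current = [], [sentences[0]]
    for prev, cur, sent in zip(vecs, vecs[1:], sentences[1:]):
        if cosine(prev, cur) < threshold:  # topic shift detected
            chunks.append(" ".join(current))
            current = []
        current.append(sent)
    chunks.append(" ".join(current))
    return chunks
```

The threshold is the tuning knob: raise it and you get more, smaller chunks; lower it and thematically loose passages stay together.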

Agentic chunking uses a language model to determine where boundaries should fall, treating each chunk as a standalone proposition or set of closely related propositions. This is the most expensive approach but produces the most semantically coherent units.6
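In outline, the approach looks like this. The `complete` callable is a hypothetical stand-in for whatever LLM client you use, and the prompt format is illustrative — the point is that the model, not a rule, decides where boundaries fall.

```python
import json

def agentic_chunks(sentences, complete):
    """Ask an LLM (via the caller-supplied `complete`) to group sentences
    into self-contained propositions, returned as lists of indices."""
    prompt = (
        "Group these numbered sentences into self-contained propositions. "
        "Return only a JSON list of lists of sentence indices.\n"
        + "\n".join(f"{i}: {s}" for i, s in enumerate(sentences))
    )
    groups = json.loads(complete(prompt))  # e.g. [[0, 1], [2]]
    return [" ".join(sentences[i] for i in group) for group in groups]
```

Every chunk now costs an LLM call at indexing time, which is why this sits at the expensive end of the trade-off described above.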

Concept to explore

See semantic-chunking for how embedding-based similarity detection creates topic-aligned chunks without manual rules.

Concept to explore

See agentic-chunking for how LLMs can determine chunk boundaries by reasoning about content structure.

4. The research evidence: chunk size affects accuracy

A study from Aalto University tested how chunking granularity affects question-answering accuracy in RAG systems and found that chunk size directly and measurably impacts both retrieval precision and the quality of generated answers.2 This is not a minor implementation detail — it is a primary determinant of system performance.

Complementary research in the medical domain compared four chunking strategies (fixed-size, semantic clustering, proposition-based, and adaptive) and found that adaptive chunking — which dynamically places boundaries based on semantic similarity — achieved 87% accuracy compared to 50% for the fixed-size baseline, without changing the language model at all.4 The chunking strategy alone nearly doubled the system’s accuracy.

5. Metadata enrichment: preserving context at fine granularity

There is a way to have both precision and context: metadata enrichment. Instead of relying solely on the text within a chunk, you attach structured metadata — entity tags, topic labels, source document identifiers, temporal markers — to each chunk. This lets the retrieval system pre-filter by metadata before computing semantic similarity, dramatically reducing false positives.6

For example, a chunk about vitamin D intake can be tagged with {topic: "nutrition", entity: "vitamin D", audience: "adults", source: "NIH guidelines", year: 2024}. Even if the chunk text is small, the metadata preserves the context that would otherwise be lost. The retrieval system can filter for “nutrition + vitamin D + adults” before comparing embeddings, ensuring the right chunk surfaces even at fine granularity.6
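The pre-filtering step is simple to sketch. The field names below follow the vitamin D example above and are illustrative, not a fixed schema; a real vector store would expose this as a filter argument on its query API.

```python
chunks = [
    {"text": "Adults need 600 IU of vitamin D daily.",
     "meta": {"topic": "nutrition", "entity": "vitamin D", "audience": "adults"}},
    {"text": "Infants need 400 IU of vitamin D daily.",
     "meta": {"topic": "nutrition", "entity": "vitamin D", "audience": "infants"}},
    {"text": "Vitamin C supports immune function.",
     "meta": {"topic": "nutrition", "entity": "vitamin C", "audience": "adults"}},
]

def filter_by_metadata(chunks, **required):
    """Keep only chunks whose metadata matches every required key/value,
    so the expensive embedding comparison runs on a smaller candidate set."""
    return [c for c in chunks
            if all(c["meta"].get(k) == v for k, v in required.items())]

candidates = filter_by_metadata(chunks, entity="vitamin D", audience="adults")
```

Only the first chunk survives the filter; semantic similarity is then computed against this subset rather than the whole index, which is where the reduction in false positives comes from.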

This approach mirrors how libraries use catalogue metadata (author, subject, date, call number) alongside the text itself. The metadata is structured data about the content; the chunk is the content. Together they solve the granularity dilemma — fine chunks for precision, rich metadata for context.

Key distinction

Chunk size controls precision (how targeted the retrieval is). Metadata controls context (how much surrounding meaning the system retains). The best systems optimise both, not just one.


Why do we use it?

Key reasons

1. Retrieval accuracy. The chunk size in a RAG system directly determines whether the right information reaches the language model. Wrong granularity means wrong answers, regardless of how good the model is.2

2. Reduced hallucination. When a chunk contains a complete, self-contained claim with sufficient context, the language model has less reason to invent missing details. Fine chunks without context are an invitation to hallucinate.4

3. Effective teaching. Learners absorb material more reliably when it is decomposed into atoms they can master individually. Granularity determines cognitive load — too coarse overwhelms, too fine fragments meaning.1

4. Cost efficiency. In RAG systems, every retrieved chunk consumes context window tokens. Coarse chunks waste tokens on irrelevant content. Right-sized chunks maximise the ratio of useful information to tokens spent.5


When do we use it?

  • When designing a RAG pipeline and choosing how to split source documents for embedding and retrieval
  • When building a knowledge base and deciding the fundamental unit of storage (document, section, paragraph, claim)
  • When planning a curriculum and determining how to sequence material from atoms to compound concepts
  • When debugging retrieval quality in a system that returns technically relevant but practically unhelpful results
  • When choosing between chunking strategies and needing to understand the trade-offs of each approach
  • When evaluating AI system accuracy and suspecting that chunking — not the model — is the bottleneck

Rule of thumb

The right chunk size is the smallest unit that can stand alone as a meaningful, self-contained claim. If a chunk requires the reader (human or machine) to guess what it refers to, it is too small. If a chunk contains multiple unrelated claims, it is too large.


How can I think about it?

The jigsaw puzzle analogy

Imagine cutting a photograph into a jigsaw puzzle. If the pieces are very large (4 pieces for the whole image), each piece shows a lot of context but you cannot isolate a specific detail — asking “where is the red house?” means scanning a quarter of the image. If the pieces are tiny (10,000 pieces), each piece shows a few pixels of colour with no recognisable content — you cannot tell what any individual piece depicts without assembling its neighbours.

  • The photograph = a document or body of knowledge
  • Each puzzle piece = a chunk
  • Large pieces = coarse granularity (lots of context, poor precision)
  • Tiny pieces = fine granularity (high precision potential, but meaningless in isolation)
  • The right piece size = large enough to show a recognisable feature (a tree, a window, a face) but small enough to find quickly
  • The picture on the box = metadata (tells you what the piece depicts even before you see its neighbours)

The recipe card analogy

A chef organises recipes. One extreme: a single card that says “French Cuisine” — useless for finding a specific technique. The other extreme: a separate card for every individual action (“pick up knife,” “hold onion,” “position blade”) — absurd and unusable. The right level: one card per dish or technique, each self-contained enough that the chef can execute it without flipping to other cards.

  • The entire cookbook = the whole document (too coarse to retrieve from)
  • One card per action = sentence-level chunks (too fine to be meaningful)
  • One card per recipe = paragraph or section-level chunks (a complete, self-contained unit)
  • The index at the back = the embedding index that helps you find the right card
  • Tags on each card (cuisine, difficulty, time) = metadata enrichment that enables precise filtering without enlarging the card itself

Concepts to explore next

  • semantic-chunking — Using embedding similarity to detect topic shifts and place chunk boundaries (stub)
  • agentic-chunking — Using LLMs to reason about content structure and determine optimal boundaries (stub)
  • claims-and-propositions — The smallest unit of knowledge that can stand alone; the atom that defines the lower bound of useful granularity (complete)
  • embeddings — The vector representations that power semantic similarity comparison between chunks and queries (complete)
  • rag — The retrieval pattern where chunking granularity directly determines answer quality (complete)

Some cards don't exist yet

A broken link is a placeholder for future learning, not an error.



Where this concept fits

Position in the knowledge graph

graph TD
    KE[Knowledge Engineering] --> KGran[Knowledge Granularity]
    KE --> KGraph[Knowledge Graphs]
    KE --> MRF[Machine-Readable Formats]
    KGran --> SC[Semantic Chunking]
    KGran --> AC[Agentic Chunking]
    style KGran fill:#4a9ede,color:#fff

Related concepts:

  • semantic-triples — the atomic unit of a knowledge graph, representing the finest meaningful granularity for structured knowledge
  • structured-data-vs-prose — granularity decisions differ depending on whether content is structured (already atomic) or prose (requires decomposition)
  • knowledge-graphs — graph structures that operate at triple-level granularity, complementing the paragraph-level granularity of RAG retrieval
  • hallucination — a direct consequence of poor granularity: chunks too small to carry context invite the model to fabricate the missing details

Footnotes

  1. Didau, D. (2023). Decomposition for Dummies: How to Break Knowledge into Teachable Pieces. David Didau’s Substack.

  2. Ruotsalainen, J. (2025). Impact of Chunking Granularity on Accuracy and Token Consumption in Retrieval-Augmented Generation for Question-Answering. Aalto University.

  3. Boulton, K. (2023). What Is Atomisation?. Unstoppable Learning.

  4. Li, Y. et al. (2025). Adaptive Chunking Strategies for RAG in Clinical Decision Support. PubMed Central.

  5. Glukhov, A. (2025). Chunking Strategies in RAG. glukhov.org.

  6. Shaik, M. (2024). Beyond Fixed Chunks: How Semantic Chunking and Metadata Enrichment Transform RAG Accuracy. Medium.