What Gives Knowledge Meaning — From Human Cognition to Machine Representation
You know that knowledge matters. You have heard about knowledge graphs, RAG, and embeddings. But you have not stopped to ask the deeper question: what is knowledge, what makes it meaningful, and why does the answer determine whether your AI system is reliable or dangerous? This article builds that foundation.
Who this is for
You have worked with AI systems or knowledge management tools. You understand that “garbage in, garbage out” applies to knowledge, but you want to understand why — what gives knowledge its structure, how humans and machines represent it differently, and what breaks when the representation is wrong.
You may have read agentic-design and understand how AI systems are structured. Now you want to understand the layer underneath: the knowledge itself.
What this article is NOT
This is not a tutorial on building a knowledge graph or a RAG pipeline. This is the epistemological foundation — the thinking that must happen before you choose a tool, a chunk size, or a schema. The implementation details come later.
Part 1 — What is knowledge?
Most people use “data,” “information,” and “knowledge” interchangeably. They are not the same thing. The distinction matters because every design decision in a knowledge system — what to store, how to structure it, how to retrieve it — depends on which layer you are working with.
The DIKW hierarchy is the standard framework for thinking about this.1
```mermaid
graph TD
    D[Data - raw symbols without context] -->|add context| I[Information - data with meaning]
    I -->|add experience| K[Knowledge - information you can act on]
    K -->|add judgement| W[Wisdom - knowing when and why to act]
    style D fill:#e8b84b,color:#fff
    style I fill:#4a9ede,color:#fff
    style K fill:#5cb85c,color:#fff
    style W fill:#9b59b6,color:#fff
```
| Layer | Example | What transforms it |
|---|---|---|
| Data | “37.8” | Nothing — it is a raw number |
| Information | “The patient’s temperature is 37.8C” | Context: who, what, when |
| Knowledge | “A temperature of 37.8C is slightly elevated and may indicate early infection” | Experience, pattern recognition, domain expertise |
| Wisdom | “Given this patient’s history and current symptoms, we should monitor but not yet treat” | Judgement, ethics, situational awareness |
Each layer adds something the previous layer lacks. Data becomes information when you add context — who measured it, when, what it refers to. Information becomes knowledge when you add structure and relationships — connecting it to other things you know, recognising patterns, drawing inferences. Knowledge becomes wisdom when you add judgement — knowing what to do with what you know, and equally, what not to do.1
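The layering can be sketched in code. This is an illustrative toy, not a standard API: the class names, the patient label, and the 37.2C threshold rule are all invented for the example. Each type wraps the layer below it and adds exactly what that layer lacks.

```python
from dataclasses import dataclass

@dataclass
class Data:
    value: float            # "37.8" alone: a raw symbol without context

@dataclass
class Information:
    data: Data
    subject: str            # who it refers to
    quantity: str           # what was measured
    unit: str

@dataclass
class Knowledge:
    info: Information
    interpretation: str     # the pattern that domain experience adds

reading = Data(value=37.8)
info = Information(data=reading, subject="patient-042",
                   quantity="body temperature", unit="C")

# The "add experience" step: a domain rule (normal upper bound ~37.2C)
# connects this information to what we already know.
knowledge = Knowledge(
    info=info,
    interpretation="slightly elevated" if info.data.value > 37.2 else "normal",
)
# knowledge.interpretation -> "slightly elevated"
```

The wisdom layer is deliberately absent: judgement about whether to act does not reduce to a field on a dataclass, which is the point of the hierarchy.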
Why this matters for AI systems
Most AI systems operate at the information layer. They retrieve text, summarise it, and return it. A well-designed knowledge system operates at the knowledge layer — it understands relationships between pieces of information and can reason across them. The gap between these two layers is the gap between a search engine and an expert.
Part 2 — The atom of knowledge
If knowledge is built from smaller pieces, what is the smallest piece that still carries meaning?
The answer is the claim — also called a proposition. A claim is the smallest unit of knowledge that can be true or false. It is a single assertion about the world.2
- “Paris is the capital of France” — claim.
- “Vaccination reduces disease incidence” — claim.
- “Paris” — not a claim. It is a label.
- “Vaccines are good” — not a claim. It is a value judgement without a falsifiable assertion.
This matters because knowledge systems that cannot identify their atoms cannot reason reliably. If you store a paragraph as a single unit, you cannot ask “is this true?” — because the paragraph contains multiple claims, some of which may be true and others false.
Three types of atomisation
Boulton identifies three distinct types of knowledge atoms, each requiring a different approach:3
| Type | What it contains | Example |
|---|---|---|
| Routine | Procedural steps, sequences | “To deploy, first run tests, then build, then push” |
| Factual | Discrete true/false assertions | “The HTTP status code for ‘not found’ is 404” |
| Conceptual | Relationships between ideas | “Ontologies are to machines what schemas are to humans” |
Factual atoms are the easiest to extract and verify. Conceptual atoms are the hardest — they encode relationships, not just facts.4 And routine atoms encode sequence, which means order matters.
Questions are atoms too
A question is the inverse of a claim. Where a claim asserts, a question probes. “What is the capital of France?” is the atomic question corresponding to the claim “Paris is the capital of France.” Good knowledge systems store both: the assertion and the question it answers.5
The computational parallel: semantic triples
Computer science arrived at the same atom independently. A semantic triple is a three-part structure: Subject — Predicate — Object.6
```mermaid
graph LR
    S[Subject<br/>Paris] -->|Predicate<br/>is capital of| O[Object<br/>France]
    style S fill:#4a9ede,color:#fff
    style O fill:#5cb85c,color:#fff
```
| Human term | CS term | Example |
|---|---|---|
| Claim | Triple | “Paris is the capital of France” |
| Subject | Subject node | Paris |
| Relationship | Predicate | is capital of |
| Referent | Object node | France |
The parallel is not a coincidence. Both disciplines are solving the same problem: how do you decompose knowledge into its smallest meaningful unit while preserving the relationships that give it meaning?
The atom principle
The smallest useful unit of knowledge is not a word, a sentence, or a paragraph. It is a claim — a single assertion that can be true or false, and that encodes a relationship between two things. In human terms, this is a proposition. In machine terms, this is a triple. They are the same idea in different notation.2
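To make the equivalence concrete, here is a minimal sketch in Python. The `Triple` class and its two methods are invented for illustration: the same three-part structure renders as a claim and, inverted, as the atomic question it answers.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str

    def as_claim(self) -> str:
        # The assertion: a single statement that can be true or false.
        return f"{self.subject} {self.predicate} {self.obj}"

    def as_question(self) -> str:
        # The inverse: the same atom with its subject unknown.
        return f"What {self.predicate} {self.obj}?"

t = Triple("Paris", "is the capital of", "France")
t.as_claim()     # "Paris is the capital of France"
t.as_question()  # "What is the capital of France?"
```

Storing both renderings from one structure is what lets a knowledge system answer the question and verify the claim with the same record.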
Part 3 — How humans organise knowledge
Humans do not store knowledge as a flat list of claims. They organise it into frameworks — mental structures that group related knowledge, fill in gaps, and guide expectations. Cognitive psychology calls these schemas.7
A schema is a mental model made explicit.7 When a doctor hears “37.8C, cough, fatigue,” they do not process each fact independently. They activate a schema — “upper respiratory infection” — that connects these symptoms, predicts what else they might find, and suggests what to do next. The schema is not the facts. It is the structure that gives the facts meaning.
```mermaid
graph TD
    S[Schema: Upper Respiratory Infection] --> S1[Expected symptoms]
    S --> S2[Typical duration]
    S --> S3[Treatment protocol]
    S --> S4[Red flags]
    S1 --> F1[Fever]
    S1 --> F2[Cough]
    S1 --> F3[Fatigue]
    S4 --> F4[High fever >39C]
    S4 --> F5[Difficulty breathing]
    style S fill:#4a9ede,color:#fff
    style S4 fill:#e74c3c,color:#fff
```
Everyone operates on schemas, whether they know it or not.8 When you walk into a restaurant, you do not reason from first principles about what to do. You activate your “restaurant schema” — sit down, read menu, order, eat, pay. The schema handles the routine so your conscious mind can focus on exceptions.
Knowledge is constructed, not transmitted
Constructivism — the dominant theory in learning science — holds that knowledge cannot be transferred from one mind to another like a file. It must be built by the learner through interaction with new information, existing schemas, and experience.9
This has a radical implication: two people can read the same text and construct different knowledge from it, because they bring different schemas. The text is information. The knowledge is what each reader builds.
```mermaid
graph LR
    INFO[Same Information] --> L1[Learner A<br/>Schema X] --> K1[Knowledge A]
    INFO --> L2[Learner B<br/>Schema Y] --> K2[Knowledge B]
    style INFO fill:#e8b84b,color:#fff
    style K1 fill:#5cb85c,color:#fff
    style K2 fill:#5cb85c,color:#fff
```
Levels of cognitive processing
Bloom’s taxonomy describes six levels of cognitive processing, from shallow to deep:9
| Level | Verb | What it means |
|---|---|---|
| Remember | Recall | Retrieve a fact from memory |
| Understand | Explain | Restate in your own words |
| Apply | Use | Apply knowledge to a new situation |
| Analyse | Distinguish | Break apart and examine relationships |
| Evaluate | Judge | Assess quality, validity, or fitness |
| Create | Design | Produce something new from existing knowledge |
Each level requires the levels below it. You cannot analyse something you do not understand, and you cannot understand something you do not remember. This hierarchy maps directly to knowledge system design: a system that only retrieves facts (Remember) is fundamentally different from one that supports reasoning across relationships (Analyse) or synthesis (Create).
The brain as a knowledge graph
The parallel runs deeper than analogy. The brain stores knowledge as an associative network: concepts linked by weighted connections that strengthen with use, which is structurally the same arrangement as a computational knowledge graph.10 Some argue the relationship is bidirectional: building an explicit knowledge graph of what you know mirrors, and may accelerate, the formation of the cognitive schemas themselves.11
Part 4 — How machines represent knowledge
Machines face the same fundamental problem as humans: how do you represent knowledge so it can be stored, retrieved, and reasoned over? The solutions, developed over decades of knowledge engineering, are strikingly parallel to how humans do it.12
Ontologies
An ontology is a formal specification of a conceptualisation — a structured vocabulary that defines the types of things that exist in a domain and the relationships between them.13
Where a human schema says “I know that diseases have symptoms, treatments, and risk factors,” an ontology says the same thing in machine-readable form: Disease hasSymptom Symptom, Disease hasTreatment Treatment, Disease hasRiskFactor RiskFactor.
```mermaid
graph TD
    D[Disease] -->|hasSymptom| SY[Symptom]
    D -->|hasTreatment| T[Treatment]
    D -->|hasRiskFactor| R[Risk Factor]
    D -->|subclassOf| MC[Medical Condition]
    SY -->|hasSeverity| SEV[Severity Level]
    style D fill:#4a9ede,color:#fff
    style MC fill:#9b59b6,color:#fff
```
An ontology is not a database. A database stores instances (“Patient John has Condition X”). An ontology defines the structure of what can be stored (“Patients can have Conditions, Conditions have Symptoms”).13 It is the schema that makes the data interpretable.
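The distinction can be sketched in a few lines. This is an illustrative toy, not a real ontology language: the type-level patterns define which instance triples are even admissible, independent of whether they are stored. All names below are invented.

```python
# Type level (the "ontology"): which (subject type, predicate, object type)
# patterns are legal in this domain.
ONTOLOGY = {
    ("Disease", "hasSymptom", "Symptom"),
    ("Disease", "hasTreatment", "Treatment"),
    ("Patient", "hasCondition", "Disease"),
}

# Instance level (the "database"): what kind of thing each individual is.
TYPE_OF = {"Flu": "Disease", "Fever": "Symptom",
           "Rest": "Treatment", "John": "Patient"}

def conforms(subject: str, predicate: str, obj: str) -> bool:
    """An instance triple is valid only if its type-level pattern exists."""
    return (TYPE_OF.get(subject), predicate, TYPE_OF.get(obj)) in ONTOLOGY

conforms("Flu", "hasSymptom", "Fever")     # True: Disease-hasSymptom-Symptom
conforms("Fever", "hasTreatment", "Rest")  # False: symptoms have no treatments
```

The ontology never stores the fact that the flu causes fever; it only makes that fact expressible, which is exactly the schema/data split the paragraph describes.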
Knowledge graphs
A knowledge graph is an ontology populated with instances — nodes (things) connected by typed edges (relationships).14 If the ontology says “Cities can be capitals of Countries,” the knowledge graph says “Paris is the capital of France.”
```mermaid
graph LR
    P[Paris] -->|is capital of| F[France]
    P -->|located in| EU[Europe]
    F -->|member of| EUN[European Union]
    F -->|has language| FR[French]
    P -->|has population| POP[2.1M]
    style P fill:#4a9ede,color:#fff
    style F fill:#5cb85c,color:#fff
```
Knowledge graphs enable multi-hop reasoning: given “Paris is the capital of France” and “France is a member of the EU,” the system can infer “Paris is a capital city within the EU” — even though that fact was never explicitly stored.
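A minimal sketch of that inference over a plain list of triples (the helper functions are invented for illustration; a production graph store would expose this through a query language such as SPARQL or Cypher):

```python
TRIPLES = [
    ("Paris", "is capital of", "France"),
    ("France", "is member of", "EU"),
    ("Berlin", "is capital of", "Germany"),   # Germany's EU membership is
]                                             # deliberately missing here

def objects(subject: str, predicate: str) -> list[str]:
    """One hop: everything `subject` is connected to via `predicate`."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]

def capitals_in(union: str) -> list[str]:
    """Two hops: city -(is capital of)-> country -(is member of)-> union."""
    return [s for s, p, country in TRIPLES
            if p == "is capital of" and union in objects(country, "is member of")]

capitals_in("EU")  # ["Paris"] - a fact that was never stored as a triple
```

Berlin drops out not because the inference is wrong but because the supporting edge is absent, which illustrates how the graph both enables and constrains reasoning.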
Taxonomies
Taxonomies are hierarchical classification systems — a specific type of ontological structure where each level gets more specific.15 They answer the question “what kind of thing is this?”
```mermaid
graph TD
    ROOT[Knowledge Representation] --> FORMAL[Formal Methods]
    ROOT --> INFORMAL[Informal Methods]
    FORMAL --> ONT[Ontologies]
    FORMAL --> KG[Knowledge Graphs]
    FORMAL --> TAX[Taxonomies]
    INFORMAL --> FOLK[Folksonomies]
    INFORMAL --> TAG[Tag Systems]
    style ROOT fill:#4a9ede,color:#fff
    style FORMAL fill:#5cb85c,color:#fff
    style INFORMAL fill:#e8b84b,color:#fff
```
Semantic triples — the RDF model
The Resource Description Framework (RDF) is the W3C standard for encoding semantic triples. Every statement is a triple: Subject — Predicate — Object. An entire knowledge graph is just a collection of triples.6
| Triple | Subject | Predicate | Object |
|---|---|---|---|
| 1 | Paris | is capital of | France |
| 2 | France | has population | 67M |
| 3 | France | is member of | EU |
| 4 | EU | founded in | 1993 |
From four triples, you can traverse a graph, answer questions, and draw inferences. This is the power of atomic representation: each triple is independently verifiable, updatable, and composable.
The parallel
Schemas are to humans what ontologies are to machines. Mental models are to cognition what knowledge graphs are to computation. The structures are different in implementation but identical in purpose: organise knowledge into a navigable network of typed relationships so it can be retrieved, reasoned over, and applied.
Part 5 — The granularity problem
You now understand that knowledge has atoms (claims, triples) and structures (schemas, ontologies, graphs). The critical design question is: at what level of decomposition do you store and retrieve knowledge?
This is the knowledge-granularity problem, and it has no universal answer. Too fine and you lose context. Too coarse and you lose precision. The right granularity depends on what you are trying to do with the knowledge.
Chunking strategies for RAG
In RAG systems, documents must be split into chunks for embedding and retrieval. The chunking strategy directly determines what the system can and cannot find.16
```mermaid
graph TD
    DOC[Full Document] --> FIX[Fixed-Size Chunks<br/>Split every N tokens]
    DOC --> REC[Recursive Chunks<br/>Split on structure: headings, paragraphs]
    DOC --> SEM[Semantic Chunks<br/>Split on meaning shifts]
    DOC --> AGT[Agentic Chunks<br/>LLM decides boundaries]
    FIX --> PROS1[Simple, predictable]
    FIX --> CONS1[Breaks mid-sentence]
    REC --> PROS2[Respects document structure]
    REC --> CONS2[Structure may not match meaning]
    SEM --> PROS3[Preserves semantic coherence]
    SEM --> CONS3[Expensive, model-dependent]
    AGT --> PROS4[Best boundaries]
    AGT --> CONS4[Slowest, most expensive]
    style DOC fill:#4a9ede,color:#fff
    style CONS1 fill:#e74c3c,color:#fff
    style CONS2 fill:#e74c3c,color:#fff
    style CONS3 fill:#e74c3c,color:#fff
    style CONS4 fill:#e74c3c,color:#fff
    style PROS1 fill:#5cb85c,color:#fff
    style PROS2 fill:#5cb85c,color:#fff
    style PROS3 fill:#5cb85c,color:#fff
    style PROS4 fill:#5cb85c,color:#fff
```
Each strategy sits on a spectrum from simple and fast to intelligent and expensive:17
| Strategy | How it works | Preserves meaning? | Cost |
|---|---|---|---|
| Fixed-size | Split every 512 tokens | Often not — cuts mid-thought | Lowest |
| Recursive | Split on headers, paragraphs, sentences | Partially — respects syntax, not semantics | Low |
| Semantic | Detect meaning shifts via embeddings | Usually — groups coherent ideas | Medium |
| Agentic | LLM identifies natural boundaries | Best — understands content | Highest |
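The two cheapest rows of the table can be sketched directly. Word counts stand in for tokens here, and the split sizes are invented for the demo; real systems use a tokenizer and larger windows.

```python
def fixed_size_chunks(text: str, size: int = 8) -> list[str]:
    """Fixed-size: split every `size` words, regardless of meaning."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def recursive_chunks(text: str, max_words: int = 8) -> list[str]:
    """Recursive: keep whole paragraphs if small, else fall back to sentences."""
    chunks = []
    for para in text.split("\n\n"):
        if len(para.split()) <= max_words:
            chunks.append(para.strip())
        else:
            chunks.extend(s.strip() + "." for s in para.split(".") if s.strip())
    return chunks

doc = "Paris is the capital of France. France is a member of the EU."

fixed_size_chunks(doc)
# ['Paris is the capital of France. France is', 'a member of the EU.']
#   -> the first chunk cuts mid-thought, exactly the failure the table warns about

recursive_chunks(doc)
# ['Paris is the capital of France.', 'France is a member of the EU.']
#   -> sentence boundaries survive
```

Semantic and agentic chunking replace the syntactic split rule with an embedding model or an LLM call, which is why they climb the cost column.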
The Aalto thesis finding
Research from Aalto University demonstrated that chunk size directly and measurably affects retrieval accuracy.16 Smaller chunks improve precision (finding exactly the right fact) but reduce recall (missing the surrounding context that makes the fact interpretable). Larger chunks improve recall but reduce precision (retrieving too much irrelevant material alongside the target).
There is no magic number. But the research converges on a principle: the optimal chunk should contain exactly one coherent claim or closely related cluster of claims, plus enough context to interpret them.16
Metadata as epistemological context
Recent work argues that metadata — source, date, author, confidence, domain — is not administrative overhead but epistemological infrastructure.18 A chunk without metadata is a claim without provenance. You cannot evaluate its reliability, currency, or scope. Adding structured metadata to chunks is the equivalent of adding context to data in the DIKW hierarchy: it transforms raw text into interpretable knowledge.
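A sketch of what that looks like in practice. The field names and the confidence threshold below are illustrative choices, not a standard metadata schema: the point is that a retrieval layer can refuse to use claims it cannot evaluate.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    source: str        # provenance: where the claim comes from
    author: str
    date: str          # currency: when it was last verified
    confidence: float  # reliability: how much the system trusts it

def admissible(chunk: Chunk, min_confidence: float = 0.8) -> bool:
    """Retrieval-time filter: reject claims without evaluable provenance."""
    return bool(chunk.source) and chunk.confidence >= min_confidence

cited = Chunk(text="The HTTP status code for 'not found' is 404",
              source="RFC 9110", author="IETF", date="2022-06",
              confidence=0.99)
admissible(cited)   # True

orphan = Chunk(text="404 means not found", source="", author="",
               date="", confidence=0.99)
admissible(orphan)  # False: a claim without provenance
```

The filter is the DIKW move in miniature: the same text is admitted or rejected depending on the context attached to it, not on its content.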
Semantic chunking combined with metadata enrichment has been shown to significantly improve RAG accuracy over naive fixed-size approaches.19
The pedagogical parallel
The granularity problem in AI mirrors the atomisation problem in education.3 A teacher who breaks a lesson into atoms too small loses the narrative that connects them. A teacher who keeps the lesson as a monolithic block loses the ability to assess and remediate individual misunderstandings.
The principle is the same in both domains: the atom must be the smallest unit that can stand alone as a claim.2 Smaller than that, meaning disintegrates. Larger than that, you cannot isolate what is true from what is false, what is understood from what is not.
The granularity principle
Every knowledge system — human or machine — must answer the question: what is my atom? The answer determines everything downstream: how knowledge is stored, retrieved, evaluated, and composed. Get the atom wrong and no amount of downstream engineering can compensate.
Part 6 — Meaning reconstruction in probabilistic models
Humans reconstruct meaning from schemas. They encounter new information, match it against existing frameworks, and integrate or reject it. This process is active, structured, and grounded in experience.7
Large language models do something that looks similar but works differently. They reconstruct meaning from statistical patterns learned during training. They do not have schemas. They have what might be called statistical shadows of schemas — learned co-occurrence patterns that approximate the structure of knowledge without explicitly representing it.
Embeddings as compressed meaning
Embeddings are vector representations of text — lists of numbers that encode “meaning” as a position in high-dimensional space.20 Two pieces of text with similar meanings will have similar vectors, enabling the system to find conceptually related content even when the exact words differ.
```mermaid
graph LR
    T1[Paris is the capital of France] -->|embed| V1[Vector: 0.82, -0.31, ...]
    T2[France's capital city is Paris] -->|embed| V2[Vector: 0.81, -0.30, ...]
    T3[The weather in Tokyo is mild] -->|embed| V3[Vector: -0.45, 0.72, ...]
    V1 -.->|high similarity| V2
    V1 -.->|low similarity| V3
    style V1 fill:#5cb85c,color:#fff
    style V2 fill:#5cb85c,color:#fff
    style V3 fill:#e74c3c,color:#fff
```
Vector similarity functions as “conceptual proximity.” But it is proximity in a statistical space, not a logical one. Two statements can be semantically similar (close in vector space) while being logically contradictory. “Vaccines cause autism” is close to “Vaccines prevent disease” in embedding space because they share the same vocabulary and topic — but they make opposite claims.
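The proximity measure itself is simple arithmetic. A sketch with hand-made three-dimensional vectors mirroring the diagram above (real embeddings have hundreds or thousands of dimensions and come from a trained model; these numbers are invented for the illustration):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: the angle between two vectors, ignoring length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

paris_v1 = [0.82, -0.31, 0.45]   # "Paris is the capital of France"
paris_v2 = [0.81, -0.30, 0.44]   # "France's capital city is Paris"
tokyo_v  = [-0.45, 0.72, 0.10]   # "The weather in Tokyo is mild"

cosine(paris_v1, paris_v2)  # > 0.99: near-identical direction
cosine(paris_v1, tokyo_v)   # negative: unrelated topic
```

Note what the function cannot see: nothing in it checks truth or logical consistency, only direction in the space, which is exactly why contradictory claims about the same topic can score as near neighbours.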
Why ontology-grounded systems hallucinate less
Hallucination occurs when a model generates plausible-sounding but factually incorrect output. It happens because the model is reconstructing from statistical patterns, not from verified knowledge structures.
Ontology-grounded systems — those that combine LLMs with explicit knowledge graphs — show dramatically lower hallucination rates. A clinical question-answering system using ontology-grounded GraphRAG achieved approximately 1.7% hallucination with 98% accuracy, compared to significantly higher error rates in baseline LLMs.21
```mermaid
graph TD
    Q[Question] --> LLM_ONLY[LLM Only<br/>Statistical reconstruction]
    Q --> GROUNDED[Ontology-Grounded<br/>Graph + LLM]
    LLM_ONLY --> H[Higher hallucination<br/>No structural constraints]
    GROUNDED --> R[Lower hallucination<br/>Answers constrained by graph]
    style H fill:#e74c3c,color:#fff
    style R fill:#5cb85c,color:#fff
```
The reason is structural. An ontology constrains what the model can say. If the graph does not contain a relationship between Drug A and Condition B, the system will not fabricate one. The ontology acts as a structural guardrail — not eliminating errors, but dramatically reducing the space in which errors can occur.21
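A sketch of the guardrail idea, with invented triples and names: before a generated answer is returned, every relationship it asserts is checked against the graph, and anything absent is flagged rather than trusted.

```python
# The graph defines the space of assertable relationships.
GRAPH = {
    ("DrugA", "treats", "Hypertension"),
    ("DrugA", "interactsWith", "DrugB"),
}

Triple = tuple[str, str, str]

def ungrounded(claims: list[Triple]) -> list[Triple]:
    """Return every generated claim whose triple is absent from the graph."""
    return [c for c in claims if c not in GRAPH]

generated = [
    ("DrugA", "treats", "Hypertension"),  # grounded: present in the graph
    ("DrugA", "treats", "Diabetes"),      # fabricated: no such edge
]
ungrounded(generated)  # [("DrugA", "treats", "Diabetes")]
```

The check does not make the model smarter; it shrinks the space in which a fluent but unsupported claim can reach the user unflagged.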
The gap: LLMs have no schema
This is the core insight of this article. Humans reconstruct meaning through schemas — structured, updatable, falsifiable frameworks built through experience. LLMs reconstruct meaning through statistical correlation — patterns learned from training data that approximate schemas without being schemas.
The implications for system design are profound:
| Property | Human schema | LLM statistical pattern |
|---|---|---|
| Structure | Explicit, navigable, typed relationships | Implicit, distributed across weights |
| Updatable | Yes — schemas evolve with experience | Only through retraining or fine-tuning |
| Falsifiable | Yes — a schema can be wrong and corrected | No — patterns are statistical, not logical |
| Explainable | Partially — you can articulate your mental model | No — the model cannot explain its representations |
| Grounded | In experience and evidence | In training data (which may be wrong) |
When you pair an LLM with an explicit knowledge graph, you are giving it something it lacks: a schema. The graph provides the structure. The LLM provides the fluency. Together, they approximate what a human expert does: reason within a structured framework while communicating naturally.
The design implication
Do not ask “how do I make my LLM know more?” Ask “how do I give my LLM a schema?” The answer is almost always: build a knowledge graph, ground retrieval in an ontology, and let the model reason within explicit structural constraints.
Part 7 — Design implications
Everything in this article converges on a single insight: knowledge architecture is the highest-leverage investment you can make in any system that processes, stores, or retrieves knowledge — whether that system is a human learning environment or a machine intelligence pipeline.
The five questions every knowledge architect must answer:
```mermaid
graph TD
    Q1[1. What are my atoms?] --> Q2[2. What are my relations?]
    Q2 --> Q3[3. What granularity preserves meaning?]
    Q3 --> Q4[4. How do I ground retrieval?]
    Q4 --> Q5[5. How do I validate reconstruction?]
    Q1 -.-> A1[Claims, propositions, triples]
    Q2 -.-> A2[Ontology, typed edges, predicates]
    Q3 -.-> A3[Chunking strategy, atom boundaries]
    Q4 -.-> A4[Graph traversal, metadata, provenance]
    Q5 -.-> A5[Evaluation against source, human review]
    style Q1 fill:#4a9ede,color:#fff
    style Q2 fill:#4a9ede,color:#fff
    style Q3 fill:#4a9ede,color:#fff
    style Q4 fill:#4a9ede,color:#fff
    style Q5 fill:#4a9ede,color:#fff
```
Question 1: What are my atoms?
Decide what the smallest unit of knowledge is in your domain. For a legal system, it might be a statutory clause. For a medical system, a clinical assertion. For a learning system, a claim that can be true or false. Everything downstream — storage, retrieval, evaluation — depends on getting this right.
Question 2: What are my relations?
Define the types of relationships that connect your atoms. “Requires,” “contradicts,” “supports,” “is-a,” “part-of” — these are your predicates. Together with your atoms, they form the ontology. Without explicit relations, your knowledge is a bag of facts, not a graph of understanding.
Question 3: What granularity preserves meaning?
Choose a chunking or decomposition strategy that respects your atom boundaries. Test it empirically: retrieve chunks, read them in isolation, and ask “does this chunk contain enough context to be meaningful on its own?” If not, your granularity is too fine. If it contains multiple unrelated claims, it is too coarse.
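One crude, illustrative heuristic for the “read it in isolation” test: a chunk that opens with an unresolved pronoun probably lacks the context to stand alone. The pronoun list is invented, and a real pipeline would use human review or an LLM judge; this only shows the shape of the check.

```python
# Words that, at the start of a chunk, usually refer to something
# outside the chunk - a sign the granularity is too fine.
DANGLING = {"it", "this", "that", "they", "these", "those", "he", "she"}

def likely_self_contained(chunk: str) -> bool:
    """Flag chunks whose first word points at missing context."""
    first = chunk.strip().split()[0].lower().strip(".,")
    return first not in DANGLING

likely_self_contained("It is slightly elevated.")             # False
likely_self_contained("A temperature of 37.8C is elevated.")  # True
```

A heuristic like this catches only the most obvious failures; the empirical retrieve-and-read test in the paragraph above remains the real arbiter.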
Question 4: How do I ground retrieval?
Ensure that every piece of retrieved knowledge carries provenance: where it came from, when it was created, who authored it, how confident the system is in it. Metadata is not overhead — it is the epistemological infrastructure that makes knowledge trustworthy.18
Question 5: How do I validate reconstruction?
When a system reconstructs an answer from retrieved knowledge, how do you verify that the reconstruction is faithful to the source? This is where hallucination becomes a design problem, not a model problem. Ontology-grounded retrieval, citation of sources, and human-in-the-loop review are all validation mechanisms.21
Applying the five questions
Consider designing a knowledge base for a customer support system.
- Atoms: Each product feature, each known issue, each resolution step is a claim.
- Relations: “Product X has Feature Y,” “Issue A is resolved by Step B,” “Feature Y requires Configuration Z.”
- Granularity: Each chunk should contain one issue-resolution pair with enough context to act on.
- Grounding: Every resolution carries a source (documentation version, last verified date, success rate).
- Validation: Generated answers are checked against the knowledge graph — if the answer references a relationship that does not exist in the graph, it is flagged.
What you now understand
Mental models you have gained
- The DIKW hierarchy — data, information, knowledge, and wisdom are distinct layers; each transformation adds context, structure, or judgement
- Claims as atoms — the smallest useful unit of knowledge is a proposition that can be true or false, paralleling the semantic triple in computer science
- Schema theory — humans organise knowledge into structured mental frameworks that guide perception, expectation, and reasoning
- Constructivism — knowledge is built by the learner through interaction, not transmitted like a file
- Ontologies and knowledge graphs — machines represent knowledge through formal type systems (ontologies) populated with instances (graphs), mirroring human schemas
- The granularity problem — decomposition that is too fine loses context; too coarse loses precision; the atom must preserve the claim
- Embeddings as compressed meaning — vector representations capture statistical proximity, not logical truth
- Ontology-grounded systems — explicit knowledge structures constrain LLM output and dramatically reduce hallucination
- The five questions — atoms, relations, granularity, grounding, and validation form the design checklist for any knowledge architecture
Check your understanding
Test yourself before moving on
- Explain the difference between information and knowledge using the DIKW hierarchy. Give an example where the same data becomes different knowledge depending on the schema of the person interpreting it.
- Describe what makes a claim the “atom” of knowledge, and why a sentence or a paragraph is not always an atom. How does this map to the concept of a semantic triple?
- Distinguish between how a human reconstructs meaning from a schema and how an LLM reconstructs meaning from statistical patterns. What can the human do that the LLM cannot?
- Interpret this scenario: a RAG system retrieves relevant chunks but consistently produces inaccurate answers. Using the concepts from this article, diagnose at least two possible causes and the design principle you would apply to fix each.
- Design the knowledge architecture for a domain of your choice. Answer the five questions: what are your atoms, what are your relations, what granularity preserves meaning, how do you ground retrieval, and how do you validate reconstruction?
Where to go next
I want to understand how to build agentic systems on top of this
You now understand the knowledge layer — the foundation. The next step is understanding how AI systems are structured to use that knowledge: routing, orchestration, pipelines, and human-in-the-loop design.
Read agentic-design for the architectural patterns.
Best for: People ready to design systems that reason over structured knowledge.
I want to go deeper on knowledge graphs and ontologies
The concept cards for knowledge-graphs, ontology, semantic-triples, and taxonomies provide detailed explanations, diagrams, and examples for each representation method.
Best for: People who want to understand the formal machinery of knowledge representation.
I want to understand the software fundamentals underneath
Knowledge systems run on software. If you have not read it yet, from-zero-to-building covers the base layer: frontend, backend, APIs, databases, and the document chain.
Best for: People who want to understand the full stack from infrastructure to intelligence.
I want to explore chunking and RAG in depth
The concept cards for rag, embeddings, knowledge-granularity, and structured-data-vs-prose go deeper on the practical design decisions for retrieval systems.
Best for: People building or improving RAG pipelines who want the theoretical grounding for their design choices.
Sources
Further reading
Resources
- Separate Claims from Evidence (How to Think) — Why distinguishing what is asserted from what supports it is fundamental to knowledge quality
- Clinical Chunking Evaluation (PubMed) — Empirical evaluation of chunking strategies in clinical NLP, showing domain-specific impacts on accuracy
- Conceptual Foundations Across Disciplines (Preprints.org) — Cross-disciplinary analysis of how knowledge representation concepts appear in education, philosophy, and computer science
- What Is a Knowledge Representation? (AAAI) — The seminal 1993 paper defining five roles of knowledge representation, still the best introduction to the field
- Ontology-Grounded GraphRAG (PubMed) — How ontology-grounded knowledge graphs reduce hallucination in clinical question-answering systems
- Your Brain Already Has a Knowledge Graph (Polymathik) — The parallel between neural associative networks and computational knowledge graphs
Footnotes
1. Liew, A. (2013). Data, Information, Knowledge, Wisdom (DIKW): A Semiotic Theoretical and Empirical Exploration of the Hierarchy and its Quality Dimension. ResearchGate. The foundational academic treatment of the DIKW hierarchy and what differentiates each layer.
2. Boulton, C. (2024). The Smallest Useful Unit. How to Think. Defines the claim as the atom of knowledge — the smallest unit that can be true or false.
3. Boulton, C. (2024). What is Atomisation? Unstoppable Learning. Introduces three types of knowledge atoms: routine, factual, and conceptual.
4. Boulton, C. (2024). Factual Atomisation: Teaching Facts That Stick. Unstoppable Learning. Detailed treatment of factual atomisation and why conceptual atoms are harder to extract.
5. Boulton, C. (2024). Questions Are Atomic Too. How to Think. Argues that questions are the inverse of claims and equally fundamental to knowledge architecture.
6. W3C. (2014). RDF 1.1 Concepts and Abstract Syntax. W3C Recommendation. The standard defining semantic triples as Subject-Predicate-Object structures.
7. Boulton, C. (2024). A Schema Is a Mental Model Made Explicit. How to Think. Connects schema theory to knowledge representation and learning.
8. Boulton, C. (2024). Everyone Operates on Schemas. How to Think. Argues that schemas are universal cognitive infrastructure, not optional tools.
9. Constructivism and Bloom’s taxonomy are foundational concepts in learning science, synthesised here from Piaget, Vygotsky, and Anderson and Krathwohl’s revised taxonomy (2001).
10. Polymathik. (2024). Your Brain Already Has a Knowledge Graph. Medium. Draws the parallel between neural associative networks and computational knowledge graphs.
11. Boulton, C. (2024). Building Your Graph Is Building Your Mind. How to Think. Argues that constructing an explicit knowledge graph mirrors and accelerates cognitive schema formation.
12. Davis, R., Shrobe, H., and Szolovits, P. (1993). What Is a Knowledge Representation? AI Magazine / AAAI. The seminal paper on knowledge representation as a field, defining five roles of a KR.
13. Wikipedia. (2024). Ontology (information science). Wikipedia. Overview of ontologies as formal specifications of conceptualisations, including history and applications.
14. Knowledge Systems Authority. (2024). Knowledge Representation Methods. KSA. Survey of knowledge representation approaches including graphs, frames, and semantic networks.
15. Knowledge Systems Authority. (2024). Knowledge Ontologies and Taxonomies. KSA. Distinguishes ontologies from taxonomies and explains when each is appropriate.
16. Aalto University. (2024). Chunking Strategies and Their Impact on RAG Accuracy. Master’s thesis. Empirical study demonstrating that chunk size directly and measurably affects retrieval accuracy.
17. Glukhov, A. (2024). Chunking Strategies in RAG. Comparison of fixed, recursive, semantic, and agentic chunking strategies with tradeoff analysis.
18. Fehlau, M. (2025). The Theoretical Foundations of Metadata in Knowledge Management. Argues that metadata is epistemological infrastructure, not administrative overhead.
19. Shaik, M. (2024). Beyond Fixed Chunks: How Semantic Chunking and Metadata Enrichment Transform RAG Accuracy. Medium. Demonstrates improvements from combining semantic chunking with metadata enrichment.
20. Embeddings as vector representations of meaning are a foundational concept in modern NLP, originating from Mikolov et al. (2013) Word2Vec and extended through transformer architectures.
21. PubMed. (2025). Ontology-Grounded Knowledge Graphs for Mitigating Hallucinations in LLMs for Clinical QA. Demonstrates that ontology-grounded GraphRAG achieved approximately 1.7% hallucination with 98% accuracy vs baseline LLMs.