What Gives Knowledge Meaning — From Human Cognition to Machine Representation

You know that knowledge matters. You have heard about knowledge graphs, RAG, and embeddings. But you have not stopped to ask the deeper question: what is knowledge, what makes it meaningful, and why does the answer determine whether your AI system is reliable or dangerous? This article builds that foundation.


Who this is for

You have worked with AI systems or knowledge management tools. You understand that “garbage in, garbage out” applies to knowledge, but you want to understand why — what gives knowledge its structure, how humans and machines represent it differently, and what breaks when the representation is wrong.

You may have read agentic-design and understand how AI systems are structured. Now you want to understand the layer underneath: the knowledge itself.

What this article is NOT

This is not a tutorial on building a knowledge graph or a RAG pipeline. This is the epistemological foundation — the thinking that must happen before you choose a tool, a chunk size, or a schema. The implementation details come later.


Part 1 — What is knowledge?

Most people use “data,” “information,” and “knowledge” interchangeably. They are not the same thing. The distinction matters because every design decision in a knowledge system — what to store, how to structure it, how to retrieve it — depends on which layer you are working with.

The DIKW hierarchy is the standard framework for thinking about this.1

graph TD
    D[Data - raw symbols without context] -->|add context| I[Information - data with meaning]
    I -->|add experience| K[Knowledge - information you can act on]
    K -->|add judgement| W[Wisdom - knowing when and why to act]

    style D fill:#e8b84b,color:#fff
    style I fill:#4a9ede,color:#fff
    style K fill:#5cb85c,color:#fff
    style W fill:#9b59b6,color:#fff

| Layer | Example | What transforms it |
| --- | --- | --- |
| Data | “37.8” | Nothing — it is a raw number |
| Information | “The patient’s temperature is 37.8C” | Context: who, what, when |
| Knowledge | “A temperature of 37.8C is slightly elevated and may indicate early infection” | Experience, pattern recognition, domain expertise |
| Wisdom | “Given this patient’s history and current symptoms, we should monitor but not yet treat” | Judgement, ethics, situational awareness |

Each layer adds something the previous layer lacks. Data becomes information when you add context — who measured it, when, what it refers to. Information becomes knowledge when you add structure and relationships — connecting it to other things you know, recognising patterns, drawing inferences. Knowledge becomes wisdom when you add judgement — knowing what to do with what you know, and equally, what not to do.1
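The layering can be sketched in code. This is a minimal illustration, not a real clinical rule: the function names and the normal range are invented for the example.

```python
# DIKW sketch: data gains context to become information; a domain rule
# (standing in for expertise) turns information into an actionable claim.
# All names and thresholds here are illustrative.

def to_information(value, unit, subject, measured_at):
    """Data -> information: attach context (who, what, when)."""
    return {"value": value, "unit": unit,
            "subject": subject, "measured_at": measured_at}

def to_knowledge(info, normal_range=(36.1, 37.2)):
    """Information -> knowledge: interpret against a known pattern."""
    low, high = normal_range
    if info["value"] > high:
        return f"{info['subject']}'s temperature of {info['value']}{info['unit']} is elevated"
    return f"{info['subject']}'s temperature of {info['value']}{info['unit']} is within normal range"

reading = 37.8  # data: a raw number, meaningless on its own
info = to_information(reading, "C", "the patient", "2024-05-01T09:00")
print(to_knowledge(info))  # knowledge: an assertion you can act on
```

The wisdom layer is deliberately absent: judgement about whether to act does not reduce to a threshold check.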

Why this matters for AI systems

Most AI systems operate at the information layer. They retrieve text, summarise it, and return it. A well-designed knowledge system operates at the knowledge layer — it understands relationships between pieces of information and can reason across them. The gap between these two layers is the gap between a search engine and an expert.


Part 2 — The atom of knowledge

If knowledge is built from smaller pieces, what is the smallest piece that still carries meaning?

The answer is the claim — also called a proposition. A claim is the smallest unit of knowledge that can be true or false. It is a single assertion about the world.2

  • “Paris is the capital of France” — claim.
  • “Vaccination reduces disease incidence” — claim.
  • “Paris” — not a claim. It is a label.
  • “Vaccines are good” — not a claim. It is a value judgement without a falsifiable assertion.

This matters because knowledge systems that cannot identify their atoms cannot reason reliably. If you store a paragraph as a single unit, you cannot ask “is this true?” — because the paragraph contains multiple claims, some of which may be true and others false.

Three types of atomisation

Boulton identifies three distinct types of knowledge atoms, each requiring a different approach:3

| Type | What it contains | Example |
| --- | --- | --- |
| Routine | Procedural steps, sequences | “To deploy, first run tests, then build, then push” |
| Factual | Discrete true/false assertions | “The HTTP status code for ‘not found’ is 404” |
| Conceptual | Relationships between ideas | “Ontologies are to machines what schemas are to humans” |

Factual atoms are the easiest to extract and verify. Conceptual atoms are the hardest — they encode relationships, not just facts.4 And routine atoms encode sequence, which means order matters.

Questions are atoms too

A question is the inverse of a claim. Where a claim asserts, a question probes. “What is the capital of France?” is the atomic question corresponding to the claim “Paris is the capital of France.” Good knowledge systems store both: the assertion and the question it answers.5

The computational parallel: semantic triples

Computer science arrived at the same atom independently. A semantic triple is a three-part structure: Subject — Predicate — Object.6

graph LR
    S[Subject<br/>Paris] -->|Predicate<br/>is capital of| O[Object<br/>France]

    style S fill:#4a9ede,color:#fff
    style O fill:#5cb85c,color:#fff

| Human term | CS term | Example |
| --- | --- | --- |
| Claim | Triple | “Paris is the capital of France” |
| Subject | Subject node | Paris |
| Relationship | Predicate | is capital of |
| Referent | Object node | France |

The parallel is not a coincidence. Both disciplines are solving the same problem: how do you decompose knowledge into its smallest meaningful unit while preserving the relationships that give it meaning?
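The claim–triple correspondence can be made concrete with a few lines of code. This is a sketch with invented predicate names, not an RDF implementation:

```python
# Claims stored as (subject, predicate, object) triples -- the same atom
# in human and machine terms. Predicate names are illustrative.

triples = {
    ("Paris", "is_capital_of", "France"),
    ("Vaccination", "reduces", "disease incidence"),
}

def holds(subject, predicate, obj):
    """A claim is the unit that can be checked: true iff the triple is stored."""
    return (subject, predicate, obj) in triples

print(holds("Paris", "is_capital_of", "France"))   # a verifiable claim
print(holds("Paris", "is_capital_of", "Germany"))  # atoms are falsifiable
```

Note what the representation buys you: each triple can be checked, updated, or removed independently, which is exactly what a paragraph-sized unit cannot offer.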

The atom principle

The smallest useful unit of knowledge is not a word, a sentence, or a paragraph. It is a claim — a single assertion that can be true or false, and that encodes a relationship between two things. In human terms, this is a proposition. In machine terms, this is a triple. They are the same idea in different notation.2


Part 3 — How humans organise knowledge

Humans do not store knowledge as a flat list of claims. They organise it into frameworks — mental structures that group related knowledge, fill in gaps, and guide expectations. Cognitive psychology calls these schemas.7

A schema is a mental model made explicit.7 When a doctor hears “37.8C, cough, fatigue,” they do not process each fact independently. They activate a schema — “upper respiratory infection” — that connects these symptoms, predicts what else they might find, and suggests what to do next. The schema is not the facts. It is the structure that gives the facts meaning.

graph TD
    S[Schema: Upper Respiratory Infection] --> S1[Expected symptoms]
    S --> S2[Typical duration]
    S --> S3[Treatment protocol]
    S --> S4[Red flags]

    S1 --> F1[Fever]
    S1 --> F2[Cough]
    S1 --> F3[Fatigue]
    S4 --> F4[High fever >39C]
    S4 --> F5[Difficulty breathing]

    style S fill:#4a9ede,color:#fff
    style S4 fill:#e74c3c,color:#fff

Everyone operates on schemas, whether they know it or not.8 When you walk into a restaurant, you do not reason from first principles about what to do. You activate your “restaurant schema” — sit down, read menu, order, eat, pay. The schema handles the routine so your conscious mind can focus on exceptions.

Knowledge is constructed, not transmitted

Constructivism — the dominant theory in learning science — holds that knowledge cannot be transferred from one mind to another like a file. It must be built by the learner through interaction with new information, existing schemas, and experience.9

This has a radical implication: two people can read the same text and construct different knowledge from it, because they bring different schemas. The text is information. The knowledge is what each reader builds.

graph LR
    INFO[Same Information] --> L1[Learner A<br/>Schema X] --> K1[Knowledge A]
    INFO --> L2[Learner B<br/>Schema Y] --> K2[Knowledge B]

    style INFO fill:#e8b84b,color:#fff
    style K1 fill:#5cb85c,color:#fff
    style K2 fill:#5cb85c,color:#fff

Levels of cognitive processing

Bloom’s taxonomy describes six levels of cognitive processing, from shallow to deep:9

| Level | Verb | What it means |
| --- | --- | --- |
| Remember | Recall | Retrieve a fact from memory |
| Understand | Explain | Restate in your own words |
| Apply | Use | Apply knowledge to a new situation |
| Analyse | Distinguish | Break apart and examine relationships |
| Evaluate | Judge | Assess quality, validity, or fitness |
| Create | Design | Produce something new from existing knowledge |

Each level requires the levels below it. You cannot analyse something you do not understand, and you cannot understand something you do not remember. This hierarchy maps directly to knowledge system design: a system that only retrieves facts (Remember) is fundamentally different from one that supports reasoning across relationships (Analyse) or synthesis (Create).

The brain as a knowledge graph

Your brain already has a knowledge graph.10 Schemas are its ontologies. Associations are its edges. When you learn something new, you are not adding a node to a flat list — you are wiring it into an existing network of relationships. Building your graph is building your mind.11


Part 4 — How machines represent knowledge

Machines face the same fundamental problem as humans: how do you represent knowledge so it can be stored, retrieved, and reasoned over? The solutions, developed over decades of knowledge engineering, are strikingly parallel to how humans do it.12

Ontologies

An ontology is a formal specification of a conceptualisation — a structured vocabulary that defines the types of things that exist in a domain and the relationships between them.13

Where a human schema says “I know that diseases have symptoms, treatments, and risk factors,” an ontology says the same thing in machine-readable form: Disease hasSymptom Symptom, Disease hasTreatment Treatment, Disease hasRiskFactor RiskFactor.

graph TD
    D[Disease] -->|hasSymptom| SY[Symptom]
    D -->|hasTreatment| T[Treatment]
    D -->|hasRiskFactor| R[Risk Factor]
    D -->|subclassOf| MC[Medical Condition]
    SY -->|hasSeverity| SEV[Severity Level]

    style D fill:#4a9ede,color:#fff
    style MC fill:#9b59b6,color:#fff

An ontology is not a database. A database stores instances (“Patient John has Condition X”). An ontology defines the structure of what can be stored (“Patients can have Conditions, Conditions have Symptoms”).13 It is the schema that makes the data interpretable.
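The distinction can be sketched in code: the ontology licenses which shapes of triples may exist, and instance data is validated against it. Class names, predicates, and instances below are illustrative.

```python
# Sketch: an ontology as (subject_type, predicate, object_type) constraints,
# with instance triples checked against it. All names are illustrative.

ontology = {
    ("Disease", "hasSymptom", "Symptom"),
    ("Disease", "hasTreatment", "Treatment"),
}

types = {  # instance -> type assignments
    "Influenza": "Disease",
    "Fever": "Symptom",
    "Rest": "Treatment",
}

def conforms(subject, predicate, obj):
    """An instance triple is storable only if the ontology licenses its shape."""
    return (types.get(subject), predicate, types.get(obj)) in ontology

print(conforms("Influenza", "hasSymptom", "Fever"))    # schema allows it
print(conforms("Fever", "hasTreatment", "Influenza"))  # wrong types: rejected
```

The database holds the instances; the ontology is the gatekeeper that decides what an instance is even allowed to say.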

Knowledge graphs

A knowledge graph is an ontology populated with instances — nodes (things) connected by typed edges (relationships).14 If the ontology says “Cities can be capitals of Countries,” the knowledge graph says “Paris is the capital of France.”

graph LR
    P[Paris] -->|is capital of| F[France]
    P -->|located in| EU[Europe]
    F -->|member of| EUN[European Union]
    F -->|has language| FR[French]
    P -->|has population| POP[2.1M]

    style P fill:#4a9ede,color:#fff
    style F fill:#5cb85c,color:#fff

Knowledge graphs enable multi-hop reasoning: given “Paris is the capital of France” and “France is a member of the EU,” the system can infer “Paris is a capital city within the EU” — even though that fact was never explicitly stored.
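Multi-hop inference of this kind is just composition over stored triples. A toy sketch, with invented predicate names:

```python
# Multi-hop reasoning sketch: compose two stored triples to derive a fact
# that was never stored explicitly. Predicates are illustrative.

triples = [
    ("Paris", "is_capital_of", "France"),
    ("France", "member_of", "EU"),
]

def objects(subject, predicate):
    """All objects reachable from `subject` via `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]

def capitals_within(union):
    """Derive: X is a capital within `union` if X is capital of Y and Y is in `union`."""
    derived = []
    for s, p, country in triples:
        if p == "is_capital_of" and union in objects(country, "member_of"):
            derived.append(s)
    return derived

print(capitals_within("EU"))  # inferred from two hops, never stored directly
```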

Taxonomies

Taxonomies are hierarchical classification systems — a specific type of ontological structure where each level gets more specific.15 They answer the question “what kind of thing is this?”

graph TD
    ROOT[Knowledge Representation] --> FORMAL[Formal Methods]
    ROOT --> INFORMAL[Informal Methods]
    FORMAL --> ONT[Ontologies]
    FORMAL --> KG[Knowledge Graphs]
    FORMAL --> TAX[Taxonomies]
    INFORMAL --> FOLK[Folksonomies]
    INFORMAL --> TAG[Tag Systems]

    style ROOT fill:#4a9ede,color:#fff
    style FORMAL fill:#5cb85c,color:#fff
    style INFORMAL fill:#e8b84b,color:#fff

Semantic triples — the RDF model

The Resource Description Framework (RDF) is the W3C standard for encoding semantic triples. Every statement is a triple: Subject — Predicate — Object. An entire knowledge graph is just a collection of triples.6

| Triple | Subject | Predicate | Object |
| --- | --- | --- | --- |
| 1 | Paris | is capital of | France |
| 2 | France | has population | 67M |
| 3 | France | is member of | EU |
| 4 | EU | founded in | 1993 |

From four triples, you can traverse a graph, answer questions, and draw inferences. This is the power of atomic representation: each triple is independently verifiable, updatable, and composable.

The parallel

Schemas are to humans what ontologies are to machines. Mental models are to cognition what knowledge graphs are to computation. The structures are different in implementation but identical in purpose: organise knowledge into a navigable network of typed relationships so it can be retrieved, reasoned over, and applied.


Part 5 — The granularity problem

You now understand that knowledge has atoms (claims, triples) and structures (schemas, ontologies, graphs). The critical design question is: at what level of decomposition do you store and retrieve knowledge?

This is the knowledge granularity problem, and it has no universal answer. Too fine and you lose context. Too coarse and you lose precision. The right granularity depends on what you are trying to do with the knowledge.

Chunking strategies for RAG

In RAG systems, documents must be split into chunks for embedding and retrieval. The chunking strategy directly determines what the system can and cannot find.16

graph TD
    DOC[Full Document] --> FIX[Fixed-Size Chunks<br/>Split every N tokens]
    DOC --> REC[Recursive Chunks<br/>Split on structure: headings, paragraphs]
    DOC --> SEM[Semantic Chunks<br/>Split on meaning shifts]
    DOC --> AGT[Agentic Chunks<br/>LLM decides boundaries]

    FIX --> PROS1[Simple, predictable]
    FIX --> CONS1[Breaks mid-sentence]

    REC --> PROS2[Respects document structure]
    REC --> CONS2[Structure may not match meaning]

    SEM --> PROS3[Preserves semantic coherence]
    SEM --> CONS3[Expensive, model-dependent]

    AGT --> PROS4[Best boundaries]
    AGT --> CONS4[Slowest, most expensive]

    style DOC fill:#4a9ede,color:#fff
    style CONS1 fill:#e74c3c,color:#fff
    style CONS2 fill:#e74c3c,color:#fff
    style CONS3 fill:#e74c3c,color:#fff
    style CONS4 fill:#e74c3c,color:#fff
    style PROS1 fill:#5cb85c,color:#fff
    style PROS2 fill:#5cb85c,color:#fff
    style PROS3 fill:#5cb85c,color:#fff
    style PROS4 fill:#5cb85c,color:#fff

Each strategy sits on a spectrum from simple and fast to intelligent and expensive:17

| Strategy | How it works | Preserves meaning? | Cost |
| --- | --- | --- | --- |
| Fixed-size | Split every 512 tokens | Often not — cuts mid-thought | Lowest |
| Recursive | Split on headers, paragraphs, sentences | Partially — respects syntax, not semantics | Low |
| Semantic | Detect meaning shifts via embeddings | Usually — groups coherent ideas | Medium |
| Agentic | LLM identifies natural boundaries | Best — understands content | Highest |
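The two cheapest strategies can be sketched in a few lines each. This is illustrative only, splitting on characters and paragraph breaks rather than tokens and full document structure:

```python
# Sketch of the two cheapest strategies: fixed-size splitting, which ignores
# meaning, vs recursive splitting on structural boundaries (here: paragraphs).

def fixed_size_chunks(text, size=40):
    """Split every `size` characters regardless of meaning."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def recursive_chunks(text):
    """Split on paragraph breaks; a real splitter would fall back to sentences."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]

doc = "Paris is the capital of France.\n\nFrance is a member of the EU."
print(fixed_size_chunks(doc))   # boundaries cut through sentences
print(recursive_chunks(doc))    # boundaries follow document structure
```

Running this makes the tradeoff visible: the fixed-size split slices one claim across two chunks, while the recursive split keeps each claim whole.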

The Aalto thesis finding

Research from Aalto University demonstrated that chunk size directly and measurably affects retrieval accuracy.16 Smaller chunks improve precision (finding exactly the right fact) but reduce recall (missing the surrounding context that makes the fact interpretable). Larger chunks improve recall but reduce precision (retrieving too much irrelevant material alongside the target).

There is no magic number. But the research converges on a principle: the optimal chunk should contain exactly one coherent claim or closely related cluster of claims, plus enough context to interpret them.16

Metadata as epistemological context

Recent work argues that metadata — source, date, author, confidence, domain — is not administrative overhead but epistemological infrastructure.18 A chunk without metadata is a claim without provenance. You cannot evaluate its reliability, currency, or scope. Adding structured metadata to chunks is the equivalent of adding context to data in the DIKW hierarchy: it transforms raw text into interpretable knowledge.

Semantic chunking combined with metadata enrichment has been shown to significantly improve RAG accuracy over naive fixed-size approaches.19
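What a metadata-carrying chunk looks like can be sketched directly. The field names and confidence scheme below are invented for illustration, not taken from any particular framework:

```python
# Sketch: a chunk carrying epistemological metadata (provenance, currency,
# confidence) alongside its text. Field names are illustrative.

from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    source: str        # where the claim came from
    date: str          # when it was recorded
    author: str        # who asserted it
    confidence: float  # how much the system trusts it

chunk = Chunk(
    text="A temperature of 37.8C is slightly elevated.",
    source="clinical-handbook-v3",
    date="2024-01-15",
    author="editorial team",
    confidence=0.9,
)

# Retrieval can now filter on epistemological context, not just similarity:
trusted = [c for c in [chunk] if c.confidence >= 0.8]
print(len(trusted))
```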

The pedagogical parallel

The granularity problem in AI mirrors the atomisation problem in education.3 A teacher who breaks a lesson into atoms too small loses the narrative that connects them. A teacher who keeps the lesson as a monolithic block loses the ability to assess and remediate individual misunderstandings.

The principle is the same in both domains: the atom must be the smallest unit that can stand alone as a claim.2 Smaller than that, meaning disintegrates. Larger than that, you cannot isolate what is true from what is false, what is understood from what is not.

The granularity principle

Every knowledge system — human or machine — must answer the question: what is my atom? The answer determines everything downstream: how knowledge is stored, retrieved, evaluated, and composed. Get the atom wrong and no amount of downstream engineering can compensate.


Part 6 — Meaning reconstruction in probabilistic models

Humans reconstruct meaning from schemas. They encounter new information, match it against existing frameworks, and integrate or reject it. This process is active, structured, and grounded in experience.7

Large language models do something that looks similar but works differently. They reconstruct meaning from statistical patterns learned during training. They do not have schemas. They have what might be called statistical shadows of schemas — learned co-occurrence patterns that approximate the structure of knowledge without explicitly representing it.

Embeddings as compressed meaning

Embeddings are vector representations of text — lists of numbers that encode “meaning” as a position in high-dimensional space.20 Two pieces of text with similar meanings will have similar vectors, enabling the system to find conceptually related content even when the exact words differ.

graph LR
    T1[Paris is the capital of France] -->|embed| V1[Vector: 0.82, -0.31, ...]
    T2[France's capital city is Paris] -->|embed| V2[Vector: 0.81, -0.30, ...]
    T3[The weather in Tokyo is mild] -->|embed| V3[Vector: -0.45, 0.72, ...]

    V1 -.->|high similarity| V2
    V1 -.->|low similarity| V3

    style V1 fill:#5cb85c,color:#fff
    style V2 fill:#5cb85c,color:#fff
    style V3 fill:#e74c3c,color:#fff

Vector similarity functions as “conceptual proximity.” But it is proximity in a statistical space, not a logical one. Two statements can be semantically similar (close in vector space) while being logically contradictory. “Vaccines cause autism” is close to “Vaccines prevent disease” in embedding space because they share the same vocabulary and topic — but they make opposite claims.
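The point can be demonstrated with cosine similarity over toy vectors. The numbers below are made up to mimic the situation, not real embeddings:

```python
# Sketch with toy vectors: cosine similarity measures statistical proximity,
# not logical agreement. These are illustrative numbers, not real embeddings.

import math

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

v_claim         = [0.82, -0.31, 0.10]  # "Vaccines prevent disease"
v_contradiction = [0.80, -0.29, 0.12]  # "Vaccines cause autism": same topic words
v_unrelated     = [-0.45, 0.72, 0.05]  # "The weather in Tokyo is mild"

# The contradictory claim sits closer than the unrelated one:
print(cosine(v_claim, v_contradiction) > cosine(v_claim, v_unrelated))
```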

Why ontology-grounded systems hallucinate less

Hallucination occurs when a model generates plausible-sounding but factually incorrect output. It happens because the model is reconstructing from statistical patterns, not from verified knowledge structures.

Ontology-grounded systems — those that combine LLMs with explicit knowledge graphs — show dramatically lower hallucination rates. A clinical question-answering system using ontology-grounded GraphRAG achieved approximately 1.7% hallucination with 98% accuracy, compared to significantly higher error rates in baseline LLMs.21

graph TD
    Q[Question] --> LLM_ONLY[LLM Only<br/>Statistical reconstruction]
    Q --> GROUNDED[Ontology-Grounded<br/>Graph + LLM]

    LLM_ONLY --> H[Higher hallucination<br/>No structural constraints]
    GROUNDED --> R[Lower hallucination<br/>Answers constrained by graph]

    style H fill:#e74c3c,color:#fff
    style R fill:#5cb85c,color:#fff

The reason is structural. An ontology constrains what the model can say. If the graph does not contain a relationship between Drug A and Condition B, the system will not fabricate one. The ontology acts as a structural guardrail — not eliminating errors, but dramatically reducing the space in which errors can occur.21

The gap: LLMs have no schema

This is the core insight of this article. Humans reconstruct meaning through schemas — structured, updatable, falsifiable frameworks built through experience. LLMs reconstruct meaning through statistical correlation — patterns learned from training data that approximate schemas without being schemas.

The implications for system design are profound:

| Property | Human schema | LLM statistical pattern |
| --- | --- | --- |
| Structure | Explicit, navigable, typed relationships | Implicit, distributed across weights |
| Updatable | Yes — schemas evolve with experience | Only through retraining or fine-tuning |
| Falsifiable | Yes — a schema can be wrong and corrected | No — patterns are statistical, not logical |
| Explainable | Partially — you can articulate your mental model | No — the model cannot explain its representations |
| Grounded | In experience and evidence | In training data (which may be wrong) |

When you pair an LLM with an explicit knowledge graph, you are giving it something it lacks: a schema. The graph provides the structure. The LLM provides the fluency. Together, they approximate what a human expert does: reason within a structured framework while communicating naturally.

The design implication

Do not ask “how do I make my LLM know more?” Ask “how do I give my LLM a schema?” The answer is almost always: build a knowledge graph, ground retrieval in an ontology, and let the model reason within explicit structural constraints.


Part 7 — Design implications

Everything in this article converges on a single insight: knowledge architecture is the highest-leverage investment you can make in any system that processes, stores, or retrieves knowledge — whether that system is a human learning environment or a machine intelligence pipeline.

The five questions every knowledge architect must answer:

graph TD
    Q1[1. What are my atoms?] --> Q2[2. What are my relations?]
    Q2 --> Q3[3. What granularity preserves meaning?]
    Q3 --> Q4[4. How do I ground retrieval?]
    Q4 --> Q5[5. How do I validate reconstruction?]

    Q1 -.-> A1[Claims, propositions, triples]
    Q2 -.-> A2[Ontology, typed edges, predicates]
    Q3 -.-> A3[Chunking strategy, atom boundaries]
    Q4 -.-> A4[Graph traversal, metadata, provenance]
    Q5 -.-> A5[Evaluation against source, human review]

    style Q1 fill:#4a9ede,color:#fff
    style Q2 fill:#4a9ede,color:#fff
    style Q3 fill:#4a9ede,color:#fff
    style Q4 fill:#4a9ede,color:#fff
    style Q5 fill:#4a9ede,color:#fff

Question 1: What are my atoms?

Decide what the smallest unit of knowledge is in your domain. For a legal system, it might be a statutory clause. For a medical system, a clinical assertion. For a learning system, a claim that can be true or false. Everything downstream — storage, retrieval, evaluation — depends on getting this right.

Question 2: What are my relations?

Define the types of relationships that connect your atoms. “Requires,” “contradicts,” “supports,” “is-a,” “part-of” — these are your predicates. Together with your atoms, they form the ontology. Without explicit relations, your knowledge is a bag of facts, not a graph of understanding.

Question 3: What granularity preserves meaning?

Choose a chunking or decomposition strategy that respects your atom boundaries. Test it empirically: retrieve chunks, read them in isolation, and ask “does this chunk contain enough context to be meaningful on its own?” If not, your granularity is too fine. If it contains multiple unrelated claims, it is too coarse.

Question 4: How do I ground retrieval?

Ensure that every piece of retrieved knowledge carries provenance: where it came from, when it was created, who authored it, how confident the system is in it. Metadata is not overhead — it is the epistemological infrastructure that makes knowledge trustworthy.18

Question 5: How do I validate reconstruction?

When a system reconstructs an answer from retrieved knowledge, how do you verify that the reconstruction is faithful to the source? This is where hallucination becomes a design problem, not a model problem. Ontology-grounded retrieval, citation of sources, and human-in-the-loop review are all validation mechanisms.21

Applying the five questions

Consider designing a knowledge base for a customer support system.

  1. Atoms: Each product feature, each known issue, each resolution step is a claim.
  2. Relations: “Product X has Feature Y,” “Issue A is resolved by Step B,” “Feature Y requires Configuration Z.”
  3. Granularity: Each chunk should contain one issue-resolution pair with enough context to act on.
  4. Grounding: Every resolution carries a source (documentation version, last verified date, success rate).
  5. Validation: Generated answers are checked against the knowledge graph — if the answer references a relationship that does not exist in the graph, it is flagged.
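Step 5 can be sketched as a grounding check: any relationship an answer relies on must exist in the graph. The graph contents and predicate names below are illustrative.

```python
# Sketch of step 5: flag a generated answer if it references a relationship
# absent from the knowledge graph. All names are illustrative.

graph = {
    ("Product X", "has_feature", "Feature Y"),
    ("Issue A", "resolved_by", "Step B"),
}

def validate(answer_triples):
    """Return the triples an answer relies on that the graph cannot back."""
    return [t for t in answer_triples if t not in graph]

good_answer = [("Issue A", "resolved_by", "Step B")]
bad_answer  = [("Issue A", "resolved_by", "Step C")]  # fabricated resolution

print(validate(good_answer))  # nothing flagged: fully grounded
print(validate(bad_answer))   # the ungrounded triple, flagged for review
```

In a real pipeline the answer triples would themselves be extracted from the generated text, which is a harder problem; the check itself stays this simple.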

What you now understand

Mental models you have gained

  • The DIKW hierarchy — data, information, knowledge, and wisdom are distinct layers; each transformation adds context, structure, or judgement
  • Claims as atoms — the smallest useful unit of knowledge is a proposition that can be true or false, paralleling the semantic triple in computer science
  • Schema theory — humans organise knowledge into structured mental frameworks that guide perception, expectation, and reasoning
  • Constructivism — knowledge is built by the learner through interaction, not transmitted like a file
  • Ontologies and knowledge graphs — machines represent knowledge through formal type systems (ontologies) populated with instances (graphs), mirroring human schemas
  • The granularity problem — decomposition that is too fine loses context; too coarse loses precision; the atom must preserve the claim
  • Embeddings as compressed meaning — vector representations capture statistical proximity, not logical truth
  • Ontology-grounded systems — explicit knowledge structures constrain LLM output and dramatically reduce hallucination
  • The five questions — atoms, relations, granularity, grounding, and validation form the design checklist for any knowledge architecture



Where to go next

I want to understand how to build agentic systems on top of this

You now understand the knowledge layer — the foundation. The next step is understanding how AI systems are structured to use that knowledge: routing, orchestration, pipelines, and human-in-the-loop design.

Read agentic-design for the architectural patterns.

Best for: People ready to design systems that reason over structured knowledge.

I want to go deeper on knowledge graphs and ontologies

The concept cards for knowledge-graphs, ontology, semantic-triples, and taxonomies provide detailed explanations, diagrams, and examples for each representation method.

Best for: People who want to understand the formal machinery of knowledge representation.

I want to understand the software fundamentals underneath

Knowledge systems run on software. If you have not read it yet, from-zero-to-building covers the base layer: frontend, backend, APIs, databases, and the document chain.

Best for: People who want to understand the full stack from infrastructure to intelligence.

I want to explore chunking and RAG in depth

The concept cards for rag, embeddings, knowledge-granularity, and structured-data-vs-prose go deeper on the practical design decisions for retrieval systems.

Best for: People building or improving RAG pipelines who want the theoretical grounding for their design choices.



Footnotes

  1. Liew, A. (2013). Data, Information, Knowledge, Wisdom (DIKW): A Semiotic Theoretical and Empirical Exploration of the Hierarchy and its Quality Dimension. ResearchGate. The foundational academic treatment of the DIKW hierarchy and what differentiates each layer.

  2. Boulton, C. (2024). The Smallest Useful Unit. How to Think. Defines the claim as the atom of knowledge — the smallest unit that can be true or false.

  3. Boulton, C. (2024). What is Atomisation?. Unstoppable Learning. Introduces three types of knowledge atoms: routine, factual, and conceptual.

  4. Boulton, C. (2024). Factual Atomisation: Teaching Facts That Stick. Unstoppable Learning. Detailed treatment of factual atomisation and why conceptual atoms are harder to extract.

  5. Boulton, C. (2024). Questions Are Atomic Too. How to Think. Argues that questions are the inverse of claims and equally fundamental to knowledge architecture.

  6. W3C. (2014). RDF 1.1 Concepts and Abstract Syntax. W3C Recommendation. The standard defining semantic triples as Subject-Predicate-Object structures.

  7. Boulton, C. (2024). A Schema Is a Mental Model Made Explicit. How to Think. Connects schema theory to knowledge representation and learning.

  8. Boulton, C. (2024). Everyone Operates on Schemas. How to Think. Argues that schemas are universal cognitive infrastructure, not optional tools.

  9. Constructivism and Bloom’s taxonomy are foundational concepts in learning science, synthesised here from Piaget, Vygotsky, and Anderson and Krathwohl’s revised taxonomy (2001).

  10. Polymathik. (2024). Your Brain Already Has a Knowledge Graph. Medium. Draws the parallel between neural associative networks and computational knowledge graphs.

  11. Boulton, C. (2024). Building Your Graph Is Building Your Mind. How to Think. Argues that constructing an explicit knowledge graph mirrors and accelerates cognitive schema formation.

  12. Davis, R., Shrobe, H., and Szolovits, P. (1993). What Is a Knowledge Representation?. AI Magazine / AAAI. The seminal paper on knowledge representation as a field, defining five roles of a KR.

  13. Wikipedia. (2024). Ontology (information science). Wikipedia. Overview of ontologies as formal specifications of conceptualisations, including history and applications.

  14. Knowledge Systems Authority. (2024). Knowledge Representation Methods. KSA. Survey of knowledge representation approaches including graphs, frames, and semantic networks.

  15. Knowledge Systems Authority. (2024). Knowledge Ontologies and Taxonomies. KSA. Distinguishes ontologies from taxonomies and explains when each is appropriate.

  16. Aalto University. (2024). Chunking Strategies and Their Impact on RAG Accuracy. Master’s thesis. Empirical study demonstrating that chunk size directly and measurably affects retrieval accuracy.

  17. Glukhov, A. (2024). Chunking Strategies in RAG. Comparison of fixed, recursive, semantic, and agentic chunking strategies with tradeoff analysis.

  18. Fehlau, M. (2025). The Theoretical Foundations of Metadata in Knowledge Management. Argues that metadata is epistemological infrastructure, not administrative overhead.

  19. Shaik, M. (2024). Beyond Fixed Chunks: How Semantic Chunking and Metadata Enrichment Transform RAG Accuracy. Medium. Demonstrates improvements from combining semantic chunking with metadata enrichment.

  20. Embeddings as vector representations of meaning are a foundational concept in modern NLP, originating from Mikolov et al. (2013) Word2Vec and extended through transformer architectures.

  21. PubMed. (2025). Ontology-Grounded Knowledge Graphs for Mitigating Hallucinations in LLMs for Clinical QA. Demonstrates that ontology-grounded GraphRAG achieved approximately 1.7% hallucination with 98% accuracy vs baseline LLMs.