Cognitivism
The learning paradigm that treats the mind as an information processor — studying how knowledge is encoded into memory, organised internally, and retrieved when needed.
What is it?
Cognitivism emerged in the 1950s and 1960s as a direct rebellion against behaviorism. Where behaviorists refused to study the mind (calling it an unobservable “black box”), cognitivists argued that the black box was exactly where learning happened. Observable behaviour is just the output — to understand learning, you need to understand the internal processes that produce it.1
The cognitive revolution was sparked by researchers who drew an analogy between the human mind and a computer. Just as a computer takes in data, processes it through algorithms, and stores the results, the mind takes in sensory information, processes it through cognitive operations, and stores it in memory. This analogy, while imperfect, proved enormously productive. It gave researchers a vocabulary and a set of models for studying mental processes that had previously been considered beyond the reach of science.2
Three foundational contributions defined cognitivism. George Miller (1956) demonstrated the limits of working memory: he estimated that we can hold roughly seven (plus or minus two) items in mind at once, a figure that later research revised down to about four. Richard Atkinson and Richard Shiffrin (1968) proposed the multi-store model of memory, describing how information flows from sensory memory to short-term (working) memory to long-term memory. Allan Paivio (1971) developed dual coding theory, which holds that information encoded through both verbal and visual channels is remembered better than information encoded through only one.3,4,5
Cognitivism provides the theoretical foundation for many of the study strategies that actually work: spaced-repetition, retrieval-practice, chunking, elaborative rehearsal, and concept mapping. If behaviorism tells you that practice works, cognitivism tells you why — and, more importantly, which kinds of practice work best for which kinds of learning.1
In plain terms
Cognitivism treats your mind like a kitchen. Information comes in as raw ingredients (sensory input). Your working memory is the countertop — small, limited, and cluttered easily. Long-term memory is the pantry — vast, well-organised (ideally), but requiring effort to retrieve what you need. Learning is the process of moving information from the countertop to the pantry in a way that lets you find it again later.
At a glance
The multi-store memory model

    graph LR
        S[Sensory Memory] -->|attention| WM[Working Memory]
        WM -->|encoding| LTM[Long-Term Memory]
        LTM -->|retrieval| WM
        WM -->|decay or displacement| F[Forgotten]
        S -->|not attended| F
        style WM fill:#4a9ede,color:#fff

Key: Information enters through the senses, is filtered by attention into working memory (the bottleneck), and is encoded into long-term memory through rehearsal and elaboration. Retrieval brings stored information back into working memory for use. Information that is not attended to or not encoded is forgotten.
How does it work?
The mind as information processor
The central metaphor of cognitivism is the information-processing model: the mind receives input, transforms it through a series of stages, and produces output. This is not a claim that the brain is literally a computer — it is a useful analogy that enables precise questions about mental processes.2
The key stages are:
- Perception — sensory organs detect information from the environment
- Attention — a filter selects which information to process further (most is discarded)
- Encoding — selected information is transformed into a mental representation
- Storage — the representation is maintained in memory
- Retrieval — the stored representation is located and brought back into conscious awareness
Each stage is a potential bottleneck. Understanding where information is lost — and why — is the core contribution of cognitivism to learning science.1
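The stages above can be sketched as a small pipeline that records where each piece of information drops out. This is a toy illustration rather than a cognitive model: the `Stimulus` fields, the `process` and `retrieve` functions, and the rule that unattended or unelaborated items are lost are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Stimulus:
    content: str
    attended: bool    # assumed flag: did the attention filter select it?
    elaborated: bool  # assumed flag: was it linked to prior knowledge?


def process(stimuli):
    """Return (long_term_store, losses); losses maps content to the stage where it was lost."""
    long_term_store = {}
    losses = {}
    for s in stimuli:
        # Perception: every stimulus in the list has been detected by the senses.
        if not s.attended:
            losses[s.content] = "attention"
            continue
        if not s.elaborated:
            losses[s.content] = "encoding"
            continue
        # Storage: keep the item under a lower-cased retrieval cue.
        long_term_store[s.content.lower()] = s.content
    return long_term_store, losses


def retrieve(store, cue):
    """Retrieval: return the stored item for a cue, or None if nothing matches."""
    return store.get(cue.lower())


stimuli = [
    Stimulus("Lecture slide title", attended=True, elaborated=True),
    Stimulus("Air-conditioning hum", attended=False, elaborated=False),
    Stimulus("Formula read once", attended=True, elaborated=False),
]
store, losses = process(stimuli)
print(losses)
print(retrieve(store, "lecture slide title"))
```

Only the slide title survives to storage; the hum is lost at the attention stage and the formula at encoding, which is the kind of diagnosis the stage model is meant to support.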
Think of it like...
The information-processing model works like a mail sorting facility. Letters arrive in huge volumes (sensory memory). Most are discarded as junk (inattention). The ones that make it to the sorting desk (working memory) can only be handled a few at a time. If properly filed, they end up in the correct drawer (long-term memory) where they can be retrieved later. But if the filing system is poor, the letter is effectively lost — even though it’s still in the building somewhere.
Working memory: the bottleneck
Working memory is where conscious thinking happens — and it is severely limited. George Miller’s landmark 1956 paper, “The Magical Number Seven, Plus or Minus Two,” established that people can hold approximately 7 (plus or minus 2) items in short-term memory at once.3
More recent research by Nelson Cowan has revised this estimate downward. The true capacity of working memory, when chunking and rehearsal strategies are controlled for, appears to be closer to 4 plus or minus 1 items.6 This means your conscious mind can juggle roughly four things at a time — and every new item risks displacing an existing one.
This limit has profound implications for learning:
- Cognitive overload occurs when the amount of information exceeds working memory capacity. When this happens, nothing is processed effectively — not even the important parts.7
- Chunking is the strategy of grouping individual items into meaningful units. The phone number 0-2-1-4-5-5-8-9-2-1 is 10 items. Chunked as 021-455-8921, it’s 3 items. Experts in any domain are expert chunkers — they see patterns where novices see individual elements.3
- Automation frees capacity. When a skill becomes automatic through practice (a behaviorist mechanism), it no longer occupies working memory, leaving capacity for higher-level thinking. This is why you can drive and hold a conversation simultaneously, but a learner driver cannot.7
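The chunking and displacement ideas in the list above can be tried out in a few lines of Python. This is only a sketch: the four-slot capacity follows Cowan's estimate quoted earlier, while modelling working memory as a first-in, first-out buffer and the helper names `chunk` and `hold_in_working_memory` are assumptions made for the illustration.

```python
from collections import deque

WORKING_MEMORY_SLOTS = 4  # assumed capacity, following Cowan's estimate


def chunk(digits, sizes):
    """Split a digit string into consecutive chunks of the given sizes."""
    if sum(sizes) != len(digits):
        raise ValueError("chunk sizes must cover the digit string exactly")
    chunks, start = [], 0
    for size in sizes:
        chunks.append(digits[start:start + size])
        start += size
    return chunks


def hold_in_working_memory(items, slots=WORKING_MEMORY_SLOTS):
    """Load items one at a time; once the slots are full, the oldest item is displaced."""
    memory = deque(maxlen=slots)
    for item in items:
        memory.append(item)
    return list(memory)


number = "0214558921"
unchunked = hold_in_working_memory(list(number))
chunked = hold_in_working_memory(chunk(number, [3, 3, 4]))
print(unchunked)  # ['8', '9', '2', '1']
print(chunked)    # ['021', '455', '8921']
```

Held digit by digit, the first six digits are displaced before the number is complete. Held as three chunks, the whole number fits with a slot to spare.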
Example: why programming is hard to learn
Consider a beginner learning to write a function in Python. They must simultaneously hold in working memory: the syntax for defining a function, the parameter names, the logic of what the function should do, the variable types, how indentation works, and where to put the return statement. That’s at least 6 items — near or beyond working memory capacity.
An experienced programmer has chunked most of these into a single pattern: “define a function.” This leaves working memory free to focus on the logic. The beginner’s struggle is not a failure of intelligence — it’s a failure of chunking. They haven’t yet compressed enough sub-skills into automatic chunks.
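To make the load concrete, here is a tiny function with the six items from the example marked in comments. The function (`average`) is a hypothetical example chosen for this card; the type-hint syntax assumes Python 3.9 or later.

```python
# 1. Syntax for defining a function: `def name(...):`
# 2. Parameter name: `values`
# 3. Variable types: a list of floats goes in, a float comes out
def average(values: list[float]) -> float:
    # 4. Indentation: everything indented under `def` belongs to the function
    if not values:
        return 0.0  # 5. Logic: decide what an empty list should give
    total = sum(values)
    # 6. The return statement comes last, once the result is ready
    return total / len(values)


print(average([2.0, 4.0, 9.0]))  # prints 5.0
```

An experienced programmer reads the first four comments as a single familiar pattern and spends attention only on items 5 and 6.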
The Atkinson-Shiffrin multi-store model
In 1968, Richard Atkinson and Richard Shiffrin proposed the most influential model of memory architecture: the multi-store model (also called the modal model). It describes three distinct memory stores:4
Sensory memory holds raw sensory input for a very brief duration — roughly 0.5 seconds for visual information (iconic memory) and 3-4 seconds for auditory information (echoic memory). Most of this information decays without ever being noticed. Only information that receives attention passes to the next store.4
Short-term memory (working memory) holds attended information for approximately 15-30 seconds without rehearsal. Its capacity is limited: about 7 plus or minus 2 items on Miller’s estimate, or closer to 4 on Cowan’s revision. Information is maintained here through rehearsal, that is, by actively repeating or manipulating it. If not rehearsed, it decays or is displaced by new incoming information.4
Long-term memory has effectively unlimited capacity and duration. Information is transferred from working memory to long-term memory through encoding — a process that is strengthened by elaboration (connecting new information to existing knowledge), organisation (structuring information meaningfully), and retrieval practice (actively recalling information rather than passively re-reading it).4
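The interaction between duration, rehearsal, and transfer can be sketched with a small simulation. Everything here is an illustrative assumption: the class name `MultiStoreMemory`, the four-item capacity, the 30-second cut-off (the upper end of the range quoted above), and the simplification that any item still in short-term memory can be encoded.

```python
STM_CAPACITY = 4     # items (assumed, after Cowan's estimate)
STM_DURATION = 30.0  # seconds an item survives without rehearsal (assumed)


class MultiStoreMemory:
    def __init__(self):
        self.short_term = {}    # item -> time of last rehearsal, in seconds
        self.long_term = set()

    def _decay(self, now):
        """Drop items that have gone unrehearsed for longer than STM_DURATION."""
        expired = [i for i, t in self.short_term.items() if now - t > STM_DURATION]
        for item in expired:
            del self.short_term[item]

    def attend(self, item, now):
        """Attention moves an item from sensory input into short-term memory."""
        self._decay(now)
        if item not in self.short_term and len(self.short_term) >= STM_CAPACITY:
            oldest = min(self.short_term, key=self.short_term.get)
            del self.short_term[oldest]  # displacement by new input
        self.short_term[item] = now

    def rehearse(self, item, now):
        """Rehearsal refreshes an item that is still held; returns False if it is gone."""
        self._decay(now)
        if item in self.short_term:
            self.short_term[item] = now
            return True
        return False

    def encode(self, item, now):
        """Transfer an item to long-term memory if it is still in short-term memory."""
        self._decay(now)
        if item in self.short_term:
            self.long_term.add(item)
            return True
        return False


memory = MultiStoreMemory()
memory.attend("phone number", now=0)
memory.attend("street name", now=5)
memory.rehearse("phone number", now=25)
print(memory.encode("phone number", now=50))  # True: rehearsed 25 s earlier
print(memory.encode("street name", now=50))   # False: 45 s without rehearsal
```

The rehearsed item is still available to encode at 50 seconds, while the unrehearsed one has already decayed.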
Think of it like...
Sensory memory is like glancing at a crowded room — you see everything for a split second but retain almost nothing. Working memory is like your hands — you can hold a few things at once, but you have to put something down to pick something new up. Long-term memory is like your house — it can hold everything you own, but finding a specific item depends on how well you’ve organised it.
Concept to explore
See cognitive-load-theory for how John Sweller built on working memory limits to develop a theory of instructional design — how to present information so that it doesn’t overwhelm the learner’s working memory.
Encoding strategies: how information gets into long-term memory
Not all study strategies are equally effective at encoding information into long-term memory. Cognitivism explains why:1
Rote rehearsal (repeating information over and over) is the weakest encoding strategy. It keeps information in working memory but does little to transfer it to long-term memory. This is why re-reading notes is one of the least effective study methods.1
Elaborative rehearsal connects new information to existing knowledge. “Mitochondria are the powerhouse of the cell” is rote. “Mitochondria convert the energy in glucose into ATP, which is the energy currency cells use, like a power plant that converts fuel into electricity” is elaborative. The connections create multiple retrieval paths.2
Organisation structures information into meaningful groups. A list of 20 random words is hard to remember. The same 20 words sorted into categories (animals, foods, colours, tools) is much easier — because the category structure provides a framework for storage and retrieval.2
Retrieval practice — actively recalling information from memory rather than passively re-reading it — is the most powerful encoding strategy. Each successful retrieval strengthens the memory trace and creates new pathways to the information. This is why flashcard testing works better than flashcard reading, and why practice exams work better than re-reading notes.8
Concept to explore
See retrieval-practice for how the testing effect works and why active recall is more effective than passive review for long-term retention.
Dual coding theory (Paivio)
Allan Paivio proposed in 1971 that the mind processes information through two distinct but connected channels: a verbal channel (words, language, speech) and a non-verbal channel (images, spatial information, visual patterns). Information that is encoded through both channels simultaneously is remembered significantly better than information encoded through only one.5
This explains why:
- Diagrams with labels are more memorable than either diagrams or text alone
- Concrete words (“dog,” “table”) are easier to remember than abstract words (“justice,” “entropy”) — because concrete words activate both verbal and visual representations
- Drawing a concept while learning it produces better retention than writing about it
Think of it like...
Dual coding is like saving a file in two locations. If you save a document only on your laptop, you lose it when the laptop fails. If you also save it to the cloud, you have a backup. Similarly, encoding information both verbally and visually gives your brain two routes to retrieve it — if one fails, the other may succeed.
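The "two routes" idea can be made concrete with a back-of-the-envelope probability sketch. It assumes, purely for illustration, that the verbal and visual routes succeed independently of each other; neither that assumption nor the example probabilities come from Paivio's work.

```python
def recall_probability(p_verbal: float, p_visual: float = 0.0) -> float:
    """Chance that at least one route retrieves the item (independent routes assumed)."""
    return 1.0 - (1.0 - p_verbal) * (1.0 - p_visual)


print(recall_probability(0.5))       # 0.5  (verbal route only)
print(recall_probability(0.5, 0.5))  # 0.75 (verbal and visual routes)
```

Under these assumptions a second route never hurts and usually helps, which matches the direction of the dual coding claim even though the real size of the effect is an empirical question.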
Spaced repetition: why timing matters
Cognitivism also explains why the timing of practice matters as much as the amount. Hermann Ebbinghaus (1885) demonstrated the forgetting curve: newly learned information decays rapidly at first, then more slowly over time. In his experiments, which used lists of nonsense syllables, roughly 70% of newly memorised material was forgotten within 24 hours without review.8
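The shape of the forgetting curve is often approximated by an exponential decay, R(t) = exp(-t / S), where S is a "stability" parameter. This is a common simplification rather than Ebbinghaus's own fitted equation, and the value of S below is an assumption, chosen so that about 30% is retained after 24 hours.

```python
import math


def retention(hours_elapsed: float, stability_hours: float) -> float:
    """Fraction retained after a delay, under the assumed exponential model."""
    return math.exp(-hours_elapsed / stability_hours)


def stability_for(target_retention: float, at_hours: float) -> float:
    """Stability S that yields the target retention at the given delay."""
    return -at_hours / math.log(target_retention)


# Assumed calibration point: 30% retained after 24 hours without review.
S = stability_for(0.30, 24.0)
for hours in (1, 9, 24, 48):
    print(f"{hours:>2} h: {retention(hours, S):.0%} retained")
```

A larger S gives a flatter curve, which is one simple way to picture a memory becoming more durable.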
Spaced repetition exploits a counterintuitive finding: reviewing information just as you’re about to forget it produces stronger encoding than reviewing it while it’s still fresh. Each successful retrieval at the edge of forgetting resets and flattens the forgetting curve, making the memory more durable. This is the theoretical basis for spaced repetition software like Anki.8
From a cognitivist perspective, spaced-repetition works because it combines two powerful mechanisms: retrieval practice (the act of recalling) and desirable difficulty (the effort required when information is partially forgotten). The difficulty signals to the encoding system that this information is important and worth strengthening.8
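A minimal sketch of the scheduling idea: each successful recall lengthens the next interval, and a failed recall resets it. The doubling rule, the one-day starting interval, and the `Card` class are assumptions made for illustration; real tools such as Anki use more elaborate algorithms.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Card:
    prompt: str
    interval_days: int = 1  # assumed starting interval


def review(card: Card, recalled: bool, today: date) -> date:
    """Update the card's interval after a review and return the next due date."""
    if recalled:
        card.interval_days *= 2  # success: push the next review further out
    else:
        card.interval_days = 1   # lapse: start again with a short interval
    return today + timedelta(days=card.interval_days)


card = Card("Capacity of working memory?")
day = date(2025, 1, 1)
schedule = []
for outcome in (True, True, False, True):
    day = review(card, outcome, day)  # the next review happens on the due date
    schedule.append((day.isoformat(), card.interval_days))
print(schedule)
```

With two successes, one lapse, and another success, the reviews fall on 3, 7, 8, and 10 January: the gaps grow while recall succeeds and shrink after a lapse.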
Concept to explore
See spaced-repetition for the practical mechanics of spaced repetition systems — algorithms, scheduling, and how to use them effectively.
Strengths and limitations
Cognitivism excels at:
- Explaining why certain study strategies work (retrieval practice, spaced repetition, dual coding, chunking)
- Designing information to fit within working memory limits (cognitive-load-theory)
- Predicting when learners will become overloaded and how to prevent it
- Providing precise models of memory that can be tested and refined
Cognitivism struggles with:
- Context and meaning. Cognitivism treats information as data to be processed, but constructivism argues that meaning is not inherent in information — it is constructed by the learner based on prior knowledge and experience.1
- Social and cultural factors. The information-processing model is individualistic. It doesn’t account for how social interaction, culture, and community shape learning.9
- Motivation and emotion. Cognitivism models the mechanics of memory but says little about why someone would want to learn in the first place, or how emotional states affect processing.1
- Creativity and insight. Like behaviorism, cognitivism has difficulty explaining moments of sudden understanding or creative leaps that don’t fit the sequential processing model.9
Key distinction
Behaviorism asks: “Did the learner perform the correct behaviour?” Cognitivism asks: “Did the learner encode, store, and retrieve the information?” Neither asks the constructivist question: “Did the learner build understanding?”
Why do we use it?
Key reasons
1. Designing study strategies that actually work. Cognitivism is the paradigm behind evidence-based learning techniques: spaced repetition, retrieval practice, interleaving, dual coding, and chunking. Without the cognitivist model of memory, these strategies would be unexplained observations rather than principles you can apply deliberately.8
2. Understanding and preventing cognitive overload. If you’ve ever been overwhelmed by a lecture, a textbook page, or a user interface, you’ve experienced what cognitivism predicts: working memory has been exceeded. Cognitive-load-theory provides specific design principles for managing this bottleneck.7
3. Explaining why some knowledge sticks and some doesn’t. Cognitivism reveals that the problem is usually not exposure (you saw the information) but encoding (you didn’t process it deeply enough) or retrieval (you stored it but can’t find it). This shifts the focus from “study more” to “study differently.”1
When do we use it?
- When choosing study strategies and want to know which ones are supported by evidence (retrieval practice, spaced repetition, dual coding)
- When designing educational materials and need to manage information density to avoid overloading working memory
- When explaining to a learner why they forgot something they studied — and what to do differently
- When building knowledge systems and need to structure information in ways that support human encoding and retrieval (e.g., chunking, hierarchical organisation)
- When deciding whether to use text, diagrams, or both to present information (dual coding)
- When implementing spaced repetition software and need to understand the theoretical basis for scheduling algorithms
Rule of thumb
If the question is “How do I remember this?” or “Why did I forget this?”, cognitivism has the answer. If the question is “How do I understand this?”, you also need constructivism.
How can I think about it?
The library analogy
Your mind is like a library. Sensory memory is the loading dock where deliveries arrive constantly — most are never unpacked. Working memory is the reading desk: you can have 4 or 5 books open at once, but adding a sixth means closing one of the others. Long-term memory is the stacks — millions of books stored on shelves.
The quality of your learning depends on three things: (1) which deliveries you choose to unpack (attention), (2) how well you catalogue each book (encoding), and (3) how well-organised the shelves are (schema organisation). A library with millions of books but no catalogue is useless — you can’t find anything. A library with a brilliant catalogue but only five books is too limited. Effective learning requires both capacity and organisation.
- Loading dock = sensory memory
- Reading desk = working memory (limited surface area)
- Book stacks = long-term memory
- Catalogue system = schemas and retrieval cues
- Librarian who helps you find books = retrieval practice
The photography analogy
Learning through a cognitivist lens is like photography. Your camera sensor (sensory memory) captures everything in the field of view, but you choose what to focus on (attention). The image is temporarily held in the camera buffer (working memory) while you decide whether to keep it. If you save it, it goes to the memory card (long-term memory).
But saving the photo is not enough — you also need to find it later. A photographer who dumps thousands of unsorted photos into a single folder will never find the one they need. A photographer who tags, organises, and catalogues their photos can retrieve any image in seconds. Encoding strategies (elaboration, organisation, dual coding) are the tagging and sorting of your mental photo library. Retrieval practice is the act of searching your library, which — paradoxically — makes future searches faster.
- Camera sensor = sensory memory
- Focusing = attention
- Camera buffer = working memory
- Memory card = long-term memory
- Tagging and sorting = encoding strategies
- Searching your photo library = retrieval practice
Concepts to explore next
| Concept | What it covers | Status |
|---|---|---|
| cognitive-load-theory | How working memory limits constrain instructional design — intrinsic, extraneous, and germane load | stub |
| retrieval-practice | Why actively recalling information strengthens memory more than re-reading | stub |
| spaced-repetition | The scheduling of review at optimal intervals to maximise long-term retention | stub |
| schema-theory | How knowledge is organised into mental frameworks that guide encoding and retrieval | complete |
| knowledge-granularity | How to decompose knowledge into appropriately sized units for both human learning and machine retrieval | complete |
Some cards don't exist yet
A broken link is a placeholder for future learning, not an error.
Check your understanding
Test yourself
- Explain why cognitivism emerged as a response to behaviorism. What specific limitation of behaviorism did cognitivists address?
- Name the three memory stores in the Atkinson-Shiffrin model and describe the capacity and duration of each.
- Distinguish between rote rehearsal and elaborative rehearsal. Why does the second produce more durable learning?
- Interpret this scenario: a medical student studies anatomy by re-reading the textbook five times. They feel confident but perform poorly on the exam. Using cognitivism, diagnose the problem and prescribe a better study strategy.
- Connect cognitivism to schema-theory. How do schemas function as an organisational structure within long-term memory, and why does this matter for encoding new information?
Where this concept fits
Position in the knowledge graph
    graph TD
        LP[Learning Paradigms] --> B[Behaviorism]
        LP --> C[Cognitivism]
        LP --> CO[Constructivism]
        LP --> CN[Connectivism]
        C --> CLT[Cognitive Load Theory]
        style C fill:#4a9ede,color:#fff

Related concepts:
- schema-theory — schemas are the organisational structure within long-term memory that cognitivism relies on for encoding and retrieval
- retrieval-practice — the most powerful encoding strategy explained by cognitivism, where the act of recalling information strengthens memory traces
- spaced-repetition — the application of cognitivist forgetting-curve research to optimise the timing of review
- knowledge-granularity — cognitivism’s working memory limits directly inform how finely knowledge should be chunked for effective processing
Sources
Further reading
Resources
- Multi-Store Memory Model: Atkinson and Shiffrin (Simply Psychology) — Clear visual explanation of the three-store memory model
- Information Processing Theory (Research.com) — Comprehensive overview of the information-processing approach with practical implications
- Cognitive Load Theory (MindTools) — Practical guide to managing working memory demands in learning and presentation design
- Dual Coding: A Teacher’s Guide (Structural Learning) — How to apply Paivio’s dual coding theory in practice
- George Miller’s Magical Number in Retrospect (Cowan, 2015) — How Miller’s original estimate has been revised and what it means for modern cognitive science
Footnotes
1. Structural Learning. (2022). Learning Theories: Behaviourism, Cognitivism, Constructivism. Structural Learning.
2. Research.com. (2025). What is Information Processing Theory? Stages, Models & Limitations. Research.com.
3. Miller, G. A. (1956). The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information. Psychological Review, 63(2), 81-97.
4. Simply Psychology. (2022). Multi-Store Memory Model: Atkinson and Shiffrin. Simply Psychology.
5. Paivio, A. (1971). Imagery and Verbal Processes. Holt, Rinehart, and Winston.
6. Cowan, N. (2015). George Miller’s Magical Number of Immediate Memory in Retrospect. Psychological Review, 122(3), 536-541.
7. MindTools. (2024). Cognitive Load Theory. MindTools.
8. Structural Learning. (2023). Information Processing Theory: A Teacher’s Guide to Memory. Structural Learning.
9. BCL Training. (2025). Comparing Behaviorism, Cognitivism, and Constructivism. BCL Training.
