Topological Sort
An algorithm that arranges nodes in a directed graph into a linear sequence where every node appears after all of its dependencies.
What is it?
A topological sort takes a directed acyclic graph (DAG) --- a set of nodes-and-edges where edges have direction and no cycles exist --- and produces a linear ordering where every node comes after everything it depends on.1
The problem it solves is universal: given a set of tasks with dependencies, in what order should you do them so that you never start a task before its prerequisites are finished? This is the same question whether you are compiling software, scheduling university courses, planning a construction project, or building a learning path through a knowledge graph.2
The key constraint is that the graph must be acyclic --- there can be no circular dependencies. If A depends on B and B depends on A, no valid ordering exists. Detecting and breaking such cycles is an essential part of the process.3
Topological sort is not one specific algorithm but a class of solutions. The two most common are Kahn’s algorithm (which works by repeatedly removing nodes with no remaining dependencies) and DFS-based post-order (which works by exploring paths to their end and then recording nodes in reverse). Both produce a valid ordering, though the specific order may differ when multiple valid orderings exist.1
In plain terms
A topological sort is like figuring out the right order to get dressed in the morning. You cannot put on shoes before trousers, and you cannot put on trousers before underwear. Given all those “must come before” rules, a topological sort produces a sequence that respects every one of them.
At a glance
From dependency graph to sorted order (click to expand)
graph LR A[Underwear] --> B[Trousers] B --> C[Shoes] B --> D[Belt] E[Shirt] --> D E --> F[Jacket] D --> FOne valid topological order: Underwear ⇒ Shirt ⇒ Trousers ⇒ Belt ⇒ Shoes ⇒ Jacket
Key: Arrows mean “must come before.” The sorted order respects every arrow: no item appears before something it depends on. Notice that Shirt and Underwear have no dependency between them, so either can come first --- topological sort allows multiple valid orderings when items are independent.
How does it work?
1. The dependency graph
The input is a directed acyclic graph (DAG). Each node is a task, concept, or item. Each directed edge means “this must come before that.” The edge from Underwear to Trousers means “Underwear must be done before Trousers.”1
The graph must be acyclic --- if you can follow edges from a node and eventually return to the same node, that is a cycle, and no topological ordering is possible. For example: if Course A requires Course B, and Course B requires Course A, no student can ever begin either course.
Think of it like...
A recipe where step 3 says “add the mixture from step 2” and step 2 says “chop the ingredients from step 1.” Each step points forward. If step 3 said “go back to step 1,” you would be stuck in an infinite loop --- that is a cycle.
2. Kahn’s algorithm (remove the ready items)
Kahn’s algorithm is the most intuitive approach:2
- Find all nodes with no incoming edges (no dependencies). These are “ready” --- nothing needs to happen before them
- Remove one of these nodes from the graph and add it to the output list
- Remove all edges leaving that node (its dependents now have one fewer dependency to wait for)
- Repeat: check for new nodes with no incoming edges, remove them, and continue
- When the graph is empty, the output list is a valid topological order
If the graph is not empty but no nodes have zero incoming edges, a cycle exists and the sort is impossible.
Example: sorting a course schedule (click to expand)
Consider four university courses with prerequisites:
Course Prerequisites Calculus I None Linear Algebra None Calculus II Calculus I Multivariable Calculus Calculus II, Linear Algebra graph LR C1[Calculus I] --> C2[Calculus II] C2 --> MV[Multivariable Calculus] LA[Linear Algebra] --> MVApplying Kahn’s algorithm:
- Nodes with no incoming edges: Calculus I, Linear Algebra (both ready)
- Remove Calculus I ⇒ output: [Calculus I]. Now Calculus II has no incoming edges
- Remove Linear Algebra ⇒ output: [Calculus I, Linear Algebra]
- Remove Calculus II ⇒ output: [Calculus I, Linear Algebra, Calculus II]
- Remove Multivariable Calculus ⇒ output: [Calculus I, Linear Algebra, Calculus II, Multivariable Calculus]
An equally valid order: [Linear Algebra, Calculus I, Calculus II, Multivariable Calculus]. Both respect all prerequisites.
3. DFS-based approach (explore to the end, then record)
The alternative approach uses depth-first search (DFS):3
- Pick any unvisited node and follow its edges as deep as possible
- When you reach a node with no unvisited dependents, add it to the front of the output list (or record it in reverse)
- Backtrack and continue exploring unvisited paths
- Repeat until all nodes are visited
The result is the same: a valid topological order. DFS-based sorting is often preferred in implementations because it naturally integrates with cycle detection --- if the DFS encounters a node it is currently processing (not just one it has finished), a cycle exists.3
Think of it like...
Imagine untangling a chain of paperclips. You grab one end and pull it out as far as it goes. The last clip in the chain (the one with nothing after it) gets placed first in your sorted line. Then you backtrack and handle the remaining branches.
4. Cycle detection
A cycle means a circular dependency: A needs B, B needs C, C needs A. No valid ordering exists because there is no starting point.1
Both algorithms detect cycles as a side effect:
- Kahn’s: if the graph is not empty at the end but no node has zero incoming edges, a cycle remains
- DFS: if you encounter a node that is currently in the recursion stack (grey node), you have found a back edge --- a cycle
When a cycle is detected, the system must either report an error or break the cycle by removing one of the offending edges.
Key distinction
A tree (like a taxonomy) is always acyclic --- you can always topologically sort it. A general directed graph might contain cycles, which must be detected and resolved before sorting is possible.
Why do we use it?
Key reasons
1. Dependency resolution. Any system where tasks have prerequisites needs topological sort. Package managers (npm, pip, apt) use it to install dependencies in the right order. Build systems (Make, Gradle) use it to compile source files before the files that depend on them.2
2. Learning path generation. In a knowledge graph with prerequisite edges, topological sort produces a valid learning sequence --- ensuring a student encounters every concept only after mastering its prerequisites.4
3. Scheduling. Project management tools use topological sort to determine the critical path --- the longest chain of dependent tasks that determines the minimum project duration.1
4. Cycle detection. Running a topological sort is the standard way to check whether a directed graph contains cycles. If the sort fails, cycles exist --- and the algorithm can identify where they are.
When do we use it?
- When installing software packages that depend on other packages
- When compiling code where source files import other source files
- When generating a learning path from a graph of concepts with prerequisites
- When scheduling tasks in a project where some tasks must finish before others can start
- When validating a dependency graph to ensure it contains no circular references
Rule of thumb
If your problem involves “do things in the right order, given a set of must-come-before rules,” you are looking at a topological sort problem.
How can I think about it?
The getting-dressed analogy
Getting dressed in the morning is a topological sort.
- Each garment is a node
- Each must-come-before rule is a directed edge (underwear before trousers, trousers before belt)
- Some garments are independent --- socks and shirt have no dependency between them, so either can come first
- The sorted order is a valid sequence for getting dressed
- A cycle would be like saying “you must wear your jacket before your shirt, and your shirt before your jacket” --- impossible, so you would have to break one of the rules
- Multiple valid orderings exist: you could put on socks before or after your shirt, as long as both come before shoes
The construction project
Building a house is a topological sort of construction tasks.
- Foundation must come before walls
- Walls must come before roof
- Roof must come before interior painting
- Electrical wiring must come before drywall, which must come before painting
- Plumbing can happen in parallel with electrical (no dependency between them)
- The project manager runs a topological sort to produce a valid construction schedule
- If someone accidentally creates a circular dependency (“painting must finish before we pour the foundation”), the sort fails and flags the error
- The critical path (longest chain of dependencies) determines the minimum project duration
Concepts to explore next
| Concept | What it covers | Status |
|---|---|---|
| taxonomies | Hierarchical classification --- always a DAG, always sortable | complete |
| knowledge-graphs | The broader graph structure that topological sort operates on | complete |
| llm-pipelines | How topological sort orders the stages of an LLM processing pipeline | stub |
Some cards don't exist yet
A broken link is a placeholder for future learning, not an error.
Check your understanding
Test yourself (click to expand)
- Explain what a topological sort does, using the getting-dressed analogy.
- Name the two main algorithms for topological sorting and describe the key idea behind each.
- Distinguish between a DAG and a graph with cycles. Why can only DAGs be topologically sorted?
- Interpret this scenario: a package manager tries to install packages A, B, and C, but reports a “circular dependency” error. What does this mean, and what would you look for to fix it?
- Connect topological sort to learning systems: if a knowledge graph has prerequisite edges between concepts, how would topological sort help generate a study plan?
Where this concept fits
Position in the knowledge graph
graph TD KG[Knowledge Graphs] --> NE[Nodes and Edges] KG --> TAX[Taxonomies] KG --> TS[Topological Sort] TS -.->|related| TAX TS -.->|related| LP[LLM Pipelines] style TS fill:#4a9ede,color:#fffRelated concepts:
- taxonomies --- a taxonomy is always a DAG, so it can always be topologically sorted; the sorted order represents a path from general to specific
- knowledge-graphs --- topological sort is one of the key algorithms for traversing and making use of a knowledge graph’s dependency structure
- llm-pipelines --- multi-step AI pipelines often have stage dependencies that are resolved by topological sorting
Sources
Further reading
Resources
- Topological Sorting (GeeksforGeeks) --- Comprehensive reference with pseudocode for both Kahn’s and DFS approaches
- Topological Sort: Ordering Tasks with Dependencies (The Coder Cafe) --- Beginner-friendly walkthrough with real-world examples and clear diagrams
- Topological Sort Explained from First Principles (Codeintuition) --- Builds intuition from scratch, covering both algorithms and cycle detection
- Topological Sorting in DAGs (upGrad) --- Practical guide with applications in scheduling, build systems, and data processing
- Topological Sort Explained! Kahn’s Algorithm for Beginners (YouTube) --- Visual video walkthrough ideal for learners who prefer watching over reading
Footnotes
-
GeeksforGeeks. (2024). Topological Sorting. GeeksforGeeks. ↩ ↩2 ↩3 ↩4 ↩5
-
Harsanyi, T. (2024). Topological Sort: Ordering Tasks with Dependencies. The Coder Cafe. ↩ ↩2 ↩3
-
Codeintuition. (2026). Topological Sort Explained from First Principles. Codeintuition. ↩ ↩2 ↩3
-
Kumar, A. (2025). Topological Sorting Explained: A Step-by-Step Guide for Dependency Resolution. Medium. ↩
