Structured Output

Constraining a language model to produce responses in a specific, machine-readable format — such as JSON with defined fields and types — rather than free-form text.


What is it?

When you ask a language model a question, it normally replies in natural language — sentences and paragraphs, like a person writing an email. That works well for conversation, but it creates a serious problem for automation. If the next step in your pipeline is a piece of software that needs to read the model’s answer, free-form text is unreliable. A program cannot easily extract the “price” from a sentence that says “The price is around $42, give or take” — it needs {"price": 42}.

Structured output is the practice of constraining an LLM so that its response conforms to a predefined schema — a blueprint that specifies exactly which fields must be present, what types they must be, and what values are allowed.1 Instead of asking the model to “describe the event”, you tell it to return a JSON object with title (string), date (string), location (string), and attendee_count (integer). The model’s response is then guaranteed (or at least strongly encouraged) to match that shape.

This matters because LLM pipelines — the parent concept llm-pipelines — pass data between stages. If stage 1 produces free-form text where stage 2 expects structured data, the pipeline breaks. Structured output is the mechanism that makes inter-stage handoffs reliable.2 It is also what enables tool-use: when a model decides to call a function, it must produce a structured object (the function name and arguments) that a program can execute — not a prose description of what it wants to do.

The prerequisite concept json explains the data format most commonly used for structured output. Understanding key-value pairs, objects, arrays, and nesting is essential before working with structured output, because JSON Schema — the validation language used to define the expected shape — builds directly on those primitives.
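For instance, the event shape described above could be written in JSON Schema using exactly those primitives — an object with typed properties and a list of required keys (a sketch; field names match the example used later in this card):

```json
{
  "type": "object",
  "properties": {
    "title": {"type": "string"},
    "date": {"type": "string"},
    "location": {"type": "string"},
    "attendee_count": {"type": "integer"}
  },
  "required": ["title", "date", "location", "attendee_count"]
}
```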

In plain terms

Structured output is like giving someone a form to fill in instead of asking them to write a letter. The form has labelled boxes (fields) with specific formats (date, number, yes/no). You get back exactly the information you need, in exactly the shape you need it — no extra prose, no missing fields, no surprises.


At a glance


How does it work?

Structured output is achieved through a combination of techniques, ranging from simple prompt instructions to deep integration with the model’s generation process. Each approach offers a different trade-off between simplicity and reliability.

1. Prompt-based enforcement

The simplest approach is to instruct the model in the prompt: “Return your answer as a JSON object with these fields.” This works surprisingly often, but it offers no guarantee. The model might add explanatory text before the JSON, use slightly different field names, or produce invalid syntax.3

For example:

Prompt: "Extract the event details. Return ONLY valid JSON with
these fields: title (string), date (string), location (string),
attendee_count (integer)."

Model response (hoping for the best):
{
  "title": "PyCon 2026",
  "date": "May 14-22",
  "location": "Pittsburgh",
  "attendee_count": 3500
}

Prompt-based enforcement is useful for prototyping and simple tasks, but it is insufficient for production systems where downstream code depends on the output shape being exact every time.

Think of it like...

Asking someone “please write your answer on the form” without actually giving them the form. Polite people will try, but you have no guarantee they will use the right format or include all the fields.

2. JSON Mode and response format constraints

Major model providers now offer a dedicated JSON mode that constrains the model’s output to valid JSON at the token-generation level. OpenAI’s response_format: { type: "json_object" } and Anthropic’s tool-use-based structured output both use this approach.4 5

JSON mode guarantees syntactically valid JSON, but it does not guarantee the JSON matches your specific schema. The model might return {"event_name": "PyCon"} when you expected {"title": "PyCon"}. This is where JSON Schema enforcement goes further.
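The gap between "valid JSON" and "my schema" is easy to check programmatically. A minimal sketch (assuming you only care about top-level key names; real validation would also check types):

```python
import json

EXPECTED_FIELDS = {"title", "date", "location", "attendee_count"}

def matches_schema(raw: str) -> bool:
    """JSON mode guarantees `raw` parses; it does NOT guarantee the keys we asked for."""
    return set(json.loads(raw)) == EXPECTED_FIELDS

# Syntactically valid JSON, but the model renamed "title" to "event_name"
print(matches_schema('{"event_name": "PyCon", "date": "May", "location": "PGH", "attendee_count": 3500}'))  # False
print(matches_schema('{"title": "PyCon", "date": "May", "location": "PGH", "attendee_count": 3500}'))       # True
```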

3. JSON Schema enforcement

The most reliable approach combines JSON mode with a specific JSON Schema that defines the exact fields, types, and constraints. OpenAI’s Structured Outputs feature (introduced in 2024) accepts a JSON Schema or Pydantic model and uses constrained decoding to guarantee the output matches the schema exactly — not just valid JSON, but valid according to your schema.4

For example, using Pydantic with OpenAI:

from openai import OpenAI
from pydantic import BaseModel

client = OpenAI()

class Event(BaseModel):
    title: str
    date: str
    location: str
    attendee_count: int
    is_virtual: bool

# The model MUST return an object matching this schema
response = client.chat.completions.parse(
    model="gpt-4o",
    response_format=Event,
    messages=[...]
)

This is constrained decoding: at each token-generation step, the model is only allowed to produce tokens that keep the output conforming to the schema. It cannot deviate, add extra fields, or use wrong types.1
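Under the hood, the SDK translates the Pydantic model into a JSON Schema, which is then compiled into the decoding constraint. You can inspect that schema yourself — a sketch assuming Pydantic v2, where `model_json_schema()` is the standard export method:

```python
import json
from pydantic import BaseModel

class Event(BaseModel):
    title: str
    date: str
    location: str
    attendee_count: int
    is_virtual: bool

# The JSON Schema the API enforces token-by-token during generation
schema = Event.model_json_schema()
print(json.dumps(schema, indent=2))

# Every field is required, and each property carries an exact type
assert set(schema["required"]) == {"title", "date", "location", "attendee_count", "is_virtual"}
assert schema["properties"]["attendee_count"]["type"] == "integer"
```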

Think of it like...

A web form with input validation. The date field only accepts dates. The number field only accepts numbers. The user cannot submit the form until all required fields are filled correctly. The validation is enforced by the form itself, not by hoping the user follows instructions.

4. Validation and retry loops

Even with schema enforcement, production systems add a validation layer after generation. This catches edge cases: a field that is syntactically valid but semantically wrong (e.g., attendee_count: -5), or a model that produces a refusal instead of data.3

The pattern is straightforward:

  1. Generate the response with schema constraints
  2. Validate the output against the schema (and business rules)
  3. If validation fails, retry with error feedback appended to the prompt
  4. After N retries, return a structured error or escalate
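The four steps above can be sketched as a small loop. Here `call_model` is a hypothetical stand-in for your LLM client, and `validate` combines a schema check with a business rule (the negative-count example from earlier):

```python
import json

MAX_RETRIES = 3

def validate(data: dict) -> list[str]:
    """Schema plus business rules; returns a list of problems (empty = valid)."""
    errors = []
    if not isinstance(data.get("attendee_count"), int):
        errors.append("attendee_count must be an integer")
    elif data["attendee_count"] < 0:
        errors.append("attendee_count must be non-negative")
    return errors

def generate_with_retries(call_model, prompt: str) -> dict:
    """call_model(prompt) -> str is a placeholder for a real LLM call."""
    for _ in range(MAX_RETRIES):
        raw = call_model(prompt)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as e:
            # Step 3: feed the error back and try again
            prompt += f"\nPrevious attempt was invalid JSON ({e}). Try again."
            continue
        errors = validate(data)
        if not errors:
            return data  # Step 2 passed
        prompt += "\nPrevious attempt failed validation: " + "; ".join(errors)
    # Step 4: give up with a structured error rather than raising free text
    return {"error": f"validation failed after {MAX_RETRIES} attempts"}

# Simulate a model that fails a business rule once, then succeeds
responses = iter(['{"attendee_count": -5}', '{"attendee_count": 3500}'])
print(generate_with_retries(lambda p: next(responses), "Extract the event."))
```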

This validate-retry loop is a specific instance of the evaluator-optimizer pattern described in llm-pipelines.

Concept to explore

See guardrails for the broader framework of constraints that keep LLM systems safe and reliable — structured output is one guardrail among many.

5. The flexibility-reliability trade-off

Structured output introduces a fundamental tension: the more tightly you constrain the model’s output, the more reliable it becomes for automation — but the less room it has for nuance, explanation, or unexpected but useful information.2

| Approach | Reliability | Flexibility | Best for |
| --- | --- | --- | --- |
| Free text | Low | High | Conversation, creative tasks |
| Prompt-based JSON | Medium | Medium | Prototyping, simple extraction |
| JSON Mode | High | Medium | Guaranteed valid JSON, flexible schema |
| Schema enforcement | Very high | Low | Production pipelines, tool calls |

The right choice depends on what consumes the output. If a human reads it, flexibility matters. If a program reads it, reliability wins.


Why do we use it?

Key reasons

1. Automation reliability. Downstream systems — APIs, databases, other pipeline stages — need predictable data shapes. Structured output eliminates the fragile parsing layer between the LLM and the rest of the system.1

2. Reduced hallucination in tool-call pipelines. When a model must produce a specific function name and typed arguments, it is less likely to fabricate information than when writing free prose. The schema acts as a constraint that narrows the space of possible outputs.2

3. Validation becomes possible. You cannot meaningfully validate free text against a specification. With structured output, you can check every field for type, range, format, and presence — catching errors before they propagate downstream.3

4. Interoperability. Structured output in standard formats (JSON, XML) can be consumed by any programming language or system. This makes LLMs composable with existing software infrastructure, not siloed text generators.1


When do we use it?

  • When the LLM’s output will be consumed by code rather than read by a human
  • When building multi-stage pipelines where one stage’s output is the next stage’s input
  • When the model needs to call tools or functions (tool use requires structured arguments)
  • When extracting specific data points from unstructured text (entity extraction, classification)
  • When multiple models or systems need to exchange information in a common format
  • When you need to validate, test, or audit the model’s outputs programmatically

Rule of thumb

If the next consumer of the model’s output is a program (not a person), use structured output. If it is a person, free text is usually better.


How can I think about it?

The order form analogy

Structured output is like ordering from a restaurant using an order form instead of telling the waiter what you want in a conversation.

  • The order form is the schema — it defines what information is needed (dish, quantity, special requests, table number)
  • Each field has a type: dish is selected from a menu (enum), quantity is a number, special requests is free text
  • The kitchen (downstream system) can process the form directly — no waiter needs to interpret your casual conversation and translate it into kitchen instructions
  • If a required field is blank, the form is rejected before it reaches the kitchen (validation)
  • The form constrains your order: you cannot order a dish that is not on the menu, and you cannot write a poem in the quantity field

The trade-off: you lose the ability to say “something like yesterday’s special, but spicier” — the form does not have a field for that. Structured output sacrifices conversational flexibility for processing reliability.

The airport customs declaration analogy

Structured output is like a customs declaration form at an airport.

  • Every traveller (query) gets the same form (schema) with the same fields
  • The form specifies exact formats: passport number (alphanumeric, fixed length), date (DD/MM/YYYY), value of goods (number in local currency)
  • Customs officers (downstream systems) can process thousands of forms efficiently because every one has the same shape
  • A form with a missing passport number is rejected at the counter (validation), not discovered later when it causes a problem in the database
  • Without the form, each traveller would write a letter describing their trip — some would include the needed information, some would not, and processing would be slow and error-prone

The customs form exists because the system handling the data needs predictability at scale — exactly the same reason LLM pipelines use structured output.


Concepts to explore next

| Concept | What it covers | Status |
| --- | --- | --- |
| json | The data format most commonly used for structured output | complete |
| tool-use | How structured output enables models to call functions and interact with external systems | complete |
| guardrails | The broader framework of constraints on LLM behaviour, including output validation | complete |
| structured-data-vs-prose | When structured formats are better than free text, and vice versa | stub |
| machine-readable-formats | The family of formats (JSON, XML, YAML) that machines can parse | stub |

Some cards don't exist yet

A broken link is a placeholder for future learning, not an error.


Check your understanding


Where this concept fits

Position in the knowledge graph

graph TD
    LP[LLM Pipelines] --> PC[Prompt Chaining]
    LP --> PR[Prompt Routing]
    LP --> CC[Context Cascading]
    LP --> RAG[RAG]
    LP --> SO[Structured Output]
    JSON[JSON] -.->|prerequisite| SO
    SO -.->|related| TU[Tool Use]
    SO -.->|related| GR[Guardrails]
    style SO fill:#4a9ede,color:#fff

Related concepts:

  • machine-readable-formats — structured output produces data in machine-readable formats; JSON is the most common choice
  • tool-use — tool calling depends on structured output to format function names and arguments as parseable objects
  • structured-data-vs-prose — structured output is one answer to the broader question of when data should be structured rather than free-form
  • guardrails — output schema enforcement is one type of guardrail that constrains LLM behaviour for reliability

Sources


Further reading

Resources

Footnotes

  1. Qasim. (2026). How to Enforce JSON Schema in LLM Outputs with Python. how2.

  2. Schluntz, E. and Zhang, B. (2024). Building Effective Agents. Anthropic.

  3. Reintech. (2026). How to Implement LLM Output Validation and Schema Enforcement. Reintech.

  4. OpenAI. (2024). Structured Outputs. OpenAI.

  5. Anthropic. (2025). Introducing Advanced Tool Use on the Claude Developer Platform. Anthropic.