AI & Automation 14 min

Structured outputs from AI: Pydantic, Instructor and JSON Schema in production

How to stop parsing strings from GPT and start getting data ready to use in code — JSON Schema, Pydantic and the Instructor library.

Tuesday morning, production deployment. The model returns: `{"name": "Jan Kowalski", "age": "thirty-two", "tags": "python, django"}`. Your code expected `age` as int, `tags` as list — and it throws an exception. The model "tried its best", but it couldn't know that the list is `["python", "django"]`, not a string. This isn't an edge case — it's the daily reality when an LLM and code communicate through a string.

Three Approaches — and Why the First Two Fail

Most teams go through the same phases. Phase 1 — "tell GPT to return JSON" — works for a week, then the model adds a markdown fence or a comment and `json.loads` blows up. Phase 2 — JSON mode (`response_format={"type": "json_object"}`) — stable JSON, but without a schema the model decides the field shapes itself. Phase 3 — Structured Outputs with JSON Schema or Instructor — you get exactly what you described, validated either at the API level or in code.
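To make the Phase 1 failure concrete, here is a small sketch using only the standard library. The `raw_reply` string is a made-up example of a model response wrapped in a markdown fence, and `strip_fence` is a hypothetical helper showing the kind of workaround teams end up writing.

```python
import json

FENCE = "`" * 3  # built programmatically to avoid a literal fence in this listing

# Hypothetical model reply: valid JSON, but wrapped in a markdown fence.
raw_reply = f'{FENCE}json\n{{"name": "Jan Kowalski", "age": 32}}\n{FENCE}'


def parse_naively(text: str):
    """Return the parsed object, or None if json.loads rejects the text."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


def strip_fence(text: str) -> str:
    """Drop one leading and one trailing fence line, if both are present."""
    lines = text.strip().splitlines()
    if len(lines) >= 2 and lines[0].startswith(FENCE) and lines[-1].strip() == FENCE:
        return "\n".join(lines[1:-1])
    return text


naive = parse_naively(raw_reply)                 # the fence breaks the parser
patched = parse_naively(strip_fence(raw_reply))  # the workaround rescues this one case
```

The workaround handles exactly one failure shape. Comments inside the JSON, trailing prose, or a second fence each need another patch, which is why this phase rarely survives contact with production traffic.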

/// EVOLUTION OF STRUCTURED OUTPUTS

3 approaches: from chaos to guaranteed structure

| | 01 Prompt JSON | 02 JSON Mode | 03 Structured Outputs |
|---|---|---|---|
| How | "Return the answer as JSON" | `response_format: json_object` | JSON Schema + Instructor |
| Stability | random | stable | guaranteed |
| Validation | none | none | automatic |
| Schema | none | model decides | enforced |
| Result | `json.loads()` blows up | A field may be int or string | Type-safe Pydantic object |
| Parse success | ~60% | ~95% | 100% |

JSON Schema and strict mode — API-side validation

OpenAI Structured Outputs (available from GPT-4o) enforce the schema during decoding: at each step the model can only emit tokens that keep the output valid against the defined structure. Setting `strict: true` in a `response_format` of type `json_schema` means that a complete response parses and matches the schema; the remaining failure cases are a refusal or a response cut off by the token limit. Requirements: every object needs `additionalProperties: false`, and every field must be listed in `required`. You express optionality via `anyOf` with `{"type": "null"}`.

json_schema_strict.py
```python
from openai import OpenAI
import json

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Strict mode: every object sets additionalProperties to False
# and lists all of its properties in "required".
SCHEMA = {
    "name": "order_extraction",
    "strict": True,
    "schema": {
        "type": "object",
        "properties": {
            "customer_name": {"type": "string"},
            "order_id": {"type": "string"},
            "items": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "product": {"type": "string"},
                        "quantity": {"type": "integer"},
                        "price_pln": {"type": "number"}
                    },
                    "required": ["product", "quantity", "price_pln"],
                    "additionalProperties": False
                }
            },
            "total_pln": {"type": "number"}
        },
        "required": ["customer_name", "order_id", "items", "total_pln"],
        "additionalProperties": False
    }
}

resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Extract: Jan Kowalski, ORD-001234, 3x coffee 12.99 PLN, 1x tea 8.50 PLN"}],
    response_format={"type": "json_schema", "json_schema": SCHEMA}
)

# The content is a JSON string that conforms to SCHEMA
order = json.loads(resp.choices[0].message.content)
print(order["total_pln"])
```

Barring a refusal or a response truncated by the token limit, the result is valid JSON matching the schema, so `json.loads` does not raise. The downside is verbosity: for complex objects, JSON Schema quickly becomes unreadable and hard to maintain.
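The two strict-mode rules are easy to miss in a hand-written schema. The following is a minimal sketch of a pre-flight check using only the standard library. `strict_violations` is a hypothetical helper, not part of the OpenAI SDK, and it covers only the two rules named in this section (`additionalProperties: false` and every property listed in `required`).

```python
def strict_violations(schema: dict, path: str = "$") -> list[str]:
    """Collect violations of the two strict-mode rules in a JSON Schema dict."""
    problems: list[str] = []
    if schema.get("type") == "object":
        props = schema.get("properties", {})
        if schema.get("additionalProperties") is not False:
            problems.append(f"{path}: additionalProperties must be false")
        missing = sorted(set(props) - set(schema.get("required", [])))
        if missing:
            problems.append(f"{path}: not in required: {missing}")
        for name, sub in props.items():
            problems.extend(strict_violations(sub, f"{path}.{name}"))
    elif schema.get("type") == "array" and isinstance(schema.get("items"), dict):
        problems.extend(strict_violations(schema["items"], f"{path}[]"))
    # Optional fields are written as anyOf [..., {"type": "null"}]; check each branch.
    for i, sub in enumerate(schema.get("anyOf", [])):
        problems.extend(strict_violations(sub, f"{path}.anyOf[{i}]"))
    return problems


# An optional "notes" field: still listed in required, but allowed to be null.
good = {
    "type": "object",
    "properties": {
        "order_id": {"type": "string"},
        "notes": {"anyOf": [{"type": "string"}, {"type": "null"}]},
    },
    "required": ["order_id", "notes"],
    "additionalProperties": False,
}

# The same shape with both rules broken.
bad = {
    "type": "object",
    "properties": {"order_id": {"type": "string"}, "notes": {"type": "string"}},
    "required": ["order_id"],
}

good_problems = strict_violations(good)
bad_problems = strict_violations(bad)
```

Running a check like this in a unit test catches schema drift before the API rejects the request. The sketch does not handle schemas that use `$ref` or a list-valued `type`.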

Pydantic as the schema description layer

Instead of writing JSON Schema by hand, describe the structure as a Pydantic class. `Model.model_json_schema()` generates the schema automatically from type hints and validators. The key: `Field(description=...)` — the LLM reads field descriptions and fills data far more accurately when it knows what you expect. `field_validator` lets you add business rules that JSON Schema can't express — sum validation, ID format, conditional rules.

pydantic_model.py
```python
import re
from typing import Optional

from pydantic import BaseModel, Field, ValidationInfo, field_validator


class OrderItem(BaseModel):
    product: str = Field(description="Product name exactly as written in the text")
    quantity: int = Field(ge=1, description="Number of units, min 1")
    price_pln: float = Field(gt=0, description="Unit price in PLN")


class Order(BaseModel):
    customer_name: str = Field(description="Customer's first and last name")
    order_id: str = Field(description="Order ID in format ORD-XXXXXX")
    items: list[OrderItem] = Field(description="List of all order items")
    total_pln: float = Field(description="Sum of all items in PLN")
    notes: Optional[str] = Field(default=None, description="Notes if provided, otherwise null")

    @field_validator("order_id")
    @classmethod
    def validate_order_id(cls, v: str) -> str:
        # "ORD-" followed by exactly six digits
        if not re.match(r"ORD-\d{6}$", v):
            raise ValueError(f"order_id must be ORD-XXXXXX, got: {v}")
        return v

    @field_validator("total_pln")
    @classmethod
    def validate_total(cls, v: float, info: ValidationInfo) -> float:
        # info.data holds only fields that have already validated; "items" is
        # declared before total_pln, so it is present unless it failed.
        if "items" in info.data:
            expected = sum(i.price_pln * i.quantity for i in info.data["items"])
            if abs(v - expected) > 0.01:
                raise ValueError(f"total_pln {v} != sum of items {expected:.2f}")
        return v
```

Rules like these, along with date ranges and other cross-field checks, live in plain Python rather than in the schema. When one fails, the validation error carries a concrete message that you can pass back to the model on the next retry.
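As a rough illustration of both halves of this section, the sketch below generates a schema from a class and turns a validation failure into retry feedback. It assumes Pydantic v2; the `Invoice` model, the `bad_reply` string, and the feedback format are made up for the example.

```python
from pydantic import BaseModel, Field, ValidationError, field_validator


class Invoice(BaseModel):  # hypothetical model, for illustration only
    invoice_id: str = Field(description="Invoice ID in format INV-XXXX")
    total_pln: float = Field(gt=0, description="Gross total in PLN")

    @field_validator("invoice_id")
    @classmethod
    def check_id(cls, v: str) -> str:
        if not v.startswith("INV-"):
            raise ValueError(f"invoice_id must start with INV-, got: {v}")
        return v


# The schema is derived from the type hints and Field() arguments.
schema = Invoice.model_json_schema()

# A made-up model reply that breaks both the ID rule and the gt=0 bound.
bad_reply = '{"invoice_id": "2024/17", "total_pln": -5}'


def describe(err: dict) -> str:
    """Format one Pydantic error as 'field.path: message'."""
    location = ".".join(str(part) for part in err["loc"])
    return f"{location}: {err['msg']}"


try:
    Invoice.model_validate_json(bad_reply)
    feedback = ""
except ValidationError as exc:
    # One line per failing field: text you could send back to the model.
    feedback = "\n".join(describe(err) for err in exc.errors())
```

Here `schema["properties"]` carries the field descriptions the model will read, and `feedback` holds one line per failed field, including the custom message raised in `check_id`.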

Instructor — 3 lines of code instead of your own parser

Instructor wraps the OpenAI client (and 10+ other providers) and turns the response directly into a validated Pydantic object. You don't need `json.loads`, `model.model_validate` or manual retry — the library handles it for you with 3 retries by default, sending the validation error message back to the model as context.

instructor_basic.py
```python
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import Literal

# Wrap the OpenAI client so that create() accepts response_model
client = instructor.from_openai(OpenAI())


class ProductReview(BaseModel):
    sentiment: Literal["positive", "negative", "neutral"]
    score: int = Field(ge=1, le=5, description="Rating 1–5")
    key_issues: list[str] = Field(description="List of main problems or strengths, max 5 points")
    would_recommend: bool
    summary: str = Field(max_length=200, description="One-sentence summary")


review = client.chat.completions.create(
    model="gpt-4o",
    response_model=ProductReview,
    messages=[
        {"role": "user", "content": "Analyse: 'Product arrived damaged, support didn't pick up for 3 days, eventually got a refund but wasted my time. Never again.'"}
    ]
)

# review is a validated ProductReview instance, not a raw string
print(review.sentiment)
print(review.score)
print(review.key_issues)
```

`response_model=ProductReview` is all you need — Instructor generates the JSON Schema from the class, calls the API, parses the response, validates with Pydantic, and on failure automatically retries with the error appended to the conversation context.
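To make the retry behaviour less magical, here is a library-free sketch of the same loop. It is not Instructor's actual implementation: `fake_llm`, `validate_review`, and `extract_with_retry` are hypothetical stand-ins, and the "model" is a stub that corrects itself once it sees an error message in the conversation.

```python
import json


def validate_review(data: dict) -> dict:
    """Stand-in for Pydantic validation: raise ValueError with a clear message."""
    score = data.get("score")
    if not isinstance(score, int) or not 1 <= score <= 5:
        raise ValueError(f"score must be an integer from 1 to 5, got: {score!r}")
    return data


def fake_llm(messages: list[dict]) -> str:
    """Stub model: answers badly first, then fixes the score once it sees feedback."""
    saw_feedback = any("Validation error" in m["content"] for m in messages)
    return json.dumps({"score": 1 if saw_feedback else "one"})


def extract_with_retry(llm, messages: list[dict], max_retries: int = 3) -> tuple[dict, int]:
    """Call llm, validate the reply, and on failure append the error and try again."""
    history = list(messages)
    last_error = None
    for attempt in range(1, max_retries + 1):
        reply = llm(history)
        try:
            return validate_review(json.loads(reply)), attempt
        except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
            last_error = exc
            # Show the model its own answer and the reason it was rejected.
            history.append({"role": "assistant", "content": reply})
            history.append({"role": "user", "content": f"Validation error: {exc}. Please fix."})
    raise RuntimeError(f"gave up after {max_retries} attempts: {last_error}")


review, attempts = extract_with_retry(
    fake_llm, [{"role": "user", "content": "Analyse: 'Never again.'"}]
)
```

In this run the stub fails once and succeeds on the second attempt. The important design point is that the error text goes back into the conversation, so the quality of your validator messages directly affects how often a retry succeeds.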

/// INSTRUCTOR: VALIDATION PIPELINE

From a Pydantic class to a validated object

1. Pydantic Model: a class with field descriptions
2. `instructor.from_openai()`: wraps the client
3. LLM call: `response_model=Model`
4. JSON parse: automatic
5. Pydantic validate: `field_validator()`

Automatic retry (3× by default): when Pydantic validation fails, Instructor appends the error message to the model's context and repeats the call. The model "sees" its own error and corrects the data.

Providers: 10+ (OAI, Anthropic…). Lines of boilerplate: 0.

Patterns: extraction, classification, normalisation

Three main use cases differ in their approach to the schema. Extraction (pulling data from text) — use `Optional` for fields that may not appear; never force fields the model can't fill. Classification — use `Literal` or `Enum` instead of `str`, the model will only choose from allowed values. Normalisation — describe the exact output format with an example in `description` and use `field_validator` to verify.

| Pattern | Field type | Key trick | Pitfall |
|---|---|---|---|
| Extraction | `Optional[str]` | null when field absent from text | Forcing fields that aren't there |
| Classification | `Literal["a","b","c"]` | Enum instead of str | Too many classes (>10): quality drops |
| Date normalisation | `str` + validator | Format example in description | Timezones: always use UTC |
| List of items | `list[Model]` | "Extract ALL" in description | Duplicates: deduplicate in validator |
| Nested objects | BaseModel in BaseModel | Flat schema is faster and more accurate | Depth >3: hallucinations |

instructor_patterns.py
```python
from enum import Enum
from typing import Optional, Literal
from pydantic import BaseModel, Field, field_validator
from datetime import datetime
import instructor
from openai import OpenAI


class Priority(str, Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"


class TicketExtraction(BaseModel):
    title: str = Field(max_length=100, description="Short ticket title")
    priority: Priority = Field(description="Priority based on urgency and business impact")
    category: Literal["bug", "feature", "question", "billing"]
    affected_users: Optional[int] = Field(default=None, ge=1, description="Number of affected users if stated, otherwise null")
    reported_at: Optional[str] = Field(default=None, description="Date in ISO 8601 format e.g. 2026-06-05T10:30:00Z, null if unknown")
    is_regression: bool = Field(description="True if it worked before")

    @field_validator("reported_at")
    @classmethod
    def validate_date(cls, v: Optional[str]) -> Optional[str]:
        if v is None:
            return v
        try:
            # Rewrite a trailing "Z" as an explicit UTC offset before parsing
            datetime.fromisoformat(v.replace("Z", "+00:00"))
        except ValueError:
            raise ValueError(f"reported_at must be ISO 8601, got: {v}")
        return v


client = instructor.from_openai(OpenAI())

ticket = client.chat.completions.create(
    model="gpt-4o",
    response_model=TicketExtraction,
    messages=[{"role": "user", "content": "URGENT: login stopped working at 10:30, around 500 users can't log in, it worked before"}]
)

print(ticket.priority)
print(ticket.affected_users)
```
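The `validate_date` validator in this listing only checks that the string parses; it does not address the timezone pitfall from the table. A minimal standard-library sketch of normalising to UTC follows. `to_utc_iso` is a hypothetical helper, and treating timestamps without an offset as UTC is an assumption you may prefer to replace with a rejection.

```python
from datetime import datetime, timezone


def to_utc_iso(value: str) -> str:
    """Parse an ISO 8601 timestamp and return it in UTC with a trailing Z."""
    # fromisoformat() only accepts a literal "Z" suffix from Python 3.11 on,
    # so it is rewritten as an explicit offset first.
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are taken to be UTC already.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# Three spellings of the same instant collapse to one canonical string.
same_instant = [
    to_utc_iso("2026-06-05T10:30:00Z"),
    to_utc_iso("2026-06-05T12:30:00+02:00"),
    to_utc_iso("2026-06-05T10:30:00"),
]
```

Returning the normalised string from a `field_validator` instead of the raw input would give downstream code a single canonical format to compare and sort.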

Instructor works with multiple providers — `instructor.from_anthropic()`, `instructor.from_gemini()`, `instructor.from_mistral()` — same Pydantic code, different client.

When structured output fails — 4 scenarios

Even with Instructor you hit walls. Here are the four main ones and how to get out.

1. The model can't fill a required field. Symptom: retry loop, model hallucinates a value just to put "something". Fix: change the field to `Optional` and add `description="null if unknown"` — let the model admit missing information.

2. The schema is too complex. Symptom: the model fills a field with a random value instead of null. Fix: simplify to a flat structure. If you need complexity, split into two calls — first extracts flat data, second classifies or normalises.

3. Business validation fails after 3 retries. Symptom: `InstructorRetryException`. Fix: catch the exception and log the model's last attempt — often the rule is too restrictive or the prompt doesn't contain the information the validator expects. Loosen the validator or enrich the prompt with an example of a correct response.

4. The list has too few elements. Symptom: `items` has 2 instead of 5 entries. Fix: add `"Extract ALL items — don't skip any"` to the `description`. Instructor also supports `Iterable[Model]` as `response_model`, which returns multiple objects from a single call; combined with `stream=True`, they arrive incrementally as the model produces them.

---

I build data extraction and classification systems for companies — from simple pipelines to complex multi-step architectures with business validation and monitoring. Get in touch — I start with an analysis of your input data and schema design.

/// AUTHOR
Paweł Wiszniewski – AI & Web Engineer

Senior Full-Stack Engineer & AI Architect

8+ years building AI systems, automations, and scalable web applications that reduce costs and improve operational efficiency.
