RETURN_TO_BLOG
AI & Automation 16 min

LangGraph — How to Build Production-Ready AI Agent Workflows

LangGraph is a Python library from LangChain that lets you build AI agents as directed graphs with managed state — where each node is a step of logic, and edges (including conditional ones) decide the flow. It is the de facto standard for building production agent workflows in 2026, because it solves the fundamental problem of simple agent loops: no control over state, hard to debug, and no way to resume interrupted tasks. Instead of a while-loop with an LLM inside, you get an explicit, controllable state machine — with the ability to pause, resume, run parallel branches and built-in checkpointing.

Complete guide to LangGraph: how StateGraph works with nodes, edges and conditional routing, the ReAct pattern step by step, human-in-the-loop with interrupt/resume, PostgreSQL persistence with checkpointers, parallel branches with Send API, streaming, and comparison with CrewAI, AutoGen and Pydantic AI.

Imagine an agent that analyses a financial report: first it extracts data, then in parallel checks two databases, then — if the data is contradictory — asks a human for a resolution, and finally generates the result. Doing that in a plain while-loop is a maze of if-else. In LangGraph it is a readable graph with five nodes. This article will walk you through the key concepts, production patterns and a comparison with other frameworks.

What is LangGraph and when to use it

/// LANGGRAPH — STATE GRAPH FLOW (REACT AGENT)

START
Entry point
agent_node
LLM call
should_continue?
Conditional edge
yes (tools)
END
tool_node
Tool execution
END
Final result
↺ tool_node → agent_node (loop)

KEY API

StateGraphState schema (TypedDict / Pydantic)
add_node()Register node — any Python function
add_edge()Unconditional connection A→B
add_conditional_edges()Routing via a router function
compile()Finalize graph, optional checkpointer

LangGraph extends LangChain with a control-flow graph with managed state. Key concepts:

  • StateGraph — the main class; you define the state schema (TypedDict or Pydantic) and add nodes
  • Node — a Python function that receives the state and returns an update
  • Edge — a connection between nodes; can be unconditional or conditional
  • Conditional edge — a routing function that chooses the next node based on state
  • START / END — special framework nodes; every graph begins at START and ends at END

LangGraph is the right choice when: - The agent needs to act multiple times (loops, iterations, retries) - You need human-in-the-loop (approvals, corrections mid-flight) - The workflow has conditional branches or parallel lanes - The task can be interrupted and resumed (long operations, asynchronous) - You want to debug and visualise the flow step by step

When LangGraph is overkill: for a simple agent with a single loop, plain Python with an LLM and tool calling is enough. LangGraph shines with complex, multi-step flows.

First graph in LangGraph — ReAct agent from scratch

The ReAct pattern (Reasoning + Acting) is the most popular starting point: the agent thinks, decides on an action, executes a tool, observes the result and moves forward — or finishes. In LangGraph this maps naturally to a graph.

react_agent.py
from typing import TypedDict, Annotated, Sequencefrom langgraph.graph import StateGraph, START, ENDfrom langgraph.prebuilt import ToolNodefrom langchain_anthropic import ChatAnthropicfrom langchain_core.messages import BaseMessage, HumanMessageimport operatorclass AgentState(TypedDict):    messages: Annotated[Sequence[BaseMessage], operator.add]model = ChatAnthropic(model="claude-haiku-4-5-20251001").bind_tools(tools)def agent_node(state: AgentState) -> dict:    response = model.invoke(state["messages"])    return {"messages": [response]}def should_continue(state: AgentState) -> str:    last = state["messages"][-1]    if last.tool_calls:        return "tools"    return ENDgraph = StateGraph(AgentState)graph.add_node("agent", agent_node)graph.add_node("tools", ToolNode(tools))graph.add_edge(START, "agent")graph.add_conditional_edges("agent", should_continue)graph.add_edge("tools", "agent")app = graph.compile()

This example shows a complete ReAct cycle in 30 lines. The key element is the should_continue function — conditional routing that decides whether to go to the tools node or finish. ToolNode is a prebuilt utility that automatically invokes tools from tool calls in the message.

State management — the foundation of LangGraph

State in LangGraph is a managed structure with update-merge rules. The annotation Annotated[list, operator.add] tells LangGraph to append new items to the existing list on each node update instead of overwriting it.

state_management.py
from typing import TypedDict, Annotatedfrom langgraph.graph import StateGraphimport operatorclass WorkflowState(TypedDict):    messages: Annotated[list, operator.add]    current_step: str    tool_results: Annotated[list, operator.add]    final_answer: str | NoneBest practices for state management:- Keep state flat and simple — deeply nested objects are harder to debug- Use Pydantic instead of TypedDict for validation and default values- Each node returns only the keys it changed — no need to return the whole state- Store in state the context needed for routing decisions, not just output data

Persistence — checkpointing and resumption

This is the feature that separates LangGraph from plain loops. A checkpointer saves a snapshot of state after every step — so you can resume an interrupted session, review step history for debugging, roll back to a previous state, and implement human-in-the-loop with suspension.

checkpointing.py
from langgraph.checkpoint.sqlite import SqliteSaverfrom langgraph.checkpoint.postgres import PostgresSaver# Developmentmemory_saver = SqliteSaver.from_conn_string(":memory:")app = graph.compile(checkpointer=memory_saver)# Productionwith PostgresSaver.from_conn_string("postgresql://user:pass@host/db") as checkpointer:    checkpointer.setup()    app = graph.compile(checkpointer=checkpointer)config = {"configurable": {"thread_id": "session-abc-123"}}result = app.invoke({"messages": [HumanMessage("Analyse Q1 report")]}, config)

thread_id is the session key — the same ID continues the same session, a new ID starts a new independent session. In production always pass thread_id — without it every call is an isolated session with no history.

Human-in-the-loop — suspend and approve

/// LANGGRAPH vs CREWAI vs AUTOGEN vs PYDANTIC AI

LangGraph
PRODUCTION
ParadigmState graph
ControlFull, explicit
Human-in-loopNative (interrupt)
PersistenceNative (PostgreSQL)
Best forComplex workflows, prod
CrewAI
FAST START
ParadigmRole-based agents
ControlLimited
Human-in-loopExternal
PersistenceExternal
Best forPrototypes, role-based
AutoGen
RESEARCH
ParadigmAgent conversations
ControlLimited
Human-in-loopVia config
PersistenceExternal
Best forResearch, prototypes
Pydantic AI
RISING
ParadigmTyped agents
ControlFull, explicit
Human-in-loopNative
PersistenceNative (PostgreSQL)
Best forPydantic stack, Logfire

One of the most powerful patterns: the agent suspends before a sensitive operation and waits for human approval.

human_in_the_loop.py
from langgraph.types import interrupt, Commanddef review_before_send(state: WorkflowState) -> dict:    draft_email = state["draft_email"]    approved = interrupt({        "action": "review_email",        "draft": draft_email,        "message": "Approve or revise the email before sending"    })    if approved["decision"] == "approve":        return {"send_email": True}    return {"send_email": False, "draft_email": approved.get("revised", draft_email)}app = graph.compile(interrupt_before=["review_before_send"])app.invoke(    Command(resume={"decision": "approve"}),    config)

The human-in-the-loop pattern is indispensable in production systems: the agent can prepare an email, a quote or a purchase decision — and stop before the actual action. A human approves, corrects or rejects. Only then does the system proceed.

Parallel branches — Send API and fan-out

LangGraph lets you run nodes in parallel using the Send API. The map-reduce pattern: one node generates a list of items, each is processed in parallel, results are collected.

parallel_branches.py
from langgraph.types import Senddef generate_tasks(state: WorkflowState) -> list:    documents = state["documents"]    return [Send("analyze_doc", {"doc": doc}) for doc in documents]def analyze_doc(state: dict) -> dict:    doc = state["doc"]    analysis = llm.invoke("Analyse: " + doc["content"])    return {"analyses": [{"doc_id": doc["id"], "result": analysis.content}]}graph.add_conditional_edges("prepare_tasks", generate_tasks)

Instead of processing 10 documents sequentially (10× time), LangGraph runs them in parallel — total time equals the slowest single document. Key use cases: batch processing, many API calls at once, parallel analysis of multiple sources.

Streaming — real-time results

LangGraph supports streaming at multiple levels — from tokens to state updates to node-specific events.

streaming.py
async for chunk in app.astream_events(    {"messages": [HumanMessage("Analyse the data")]},    config,    version="v2"):    if chunk["event"] == "on_chat_model_stream":        token = chunk["data"]["chunk"].content        print(token, end="", flush=True)    elif chunk["event"] == "on_chain_end" and chunk["name"] == "agent":        print("\n[Agent finished step]")async for state_update in app.astream(inputs, config):    node_name = list(state_update.keys())[0]    print("Node " + node_name + " finished")

Streaming is a must in UI — the user sees progress instead of waiting tens of seconds for the final result. In practice you stream via WebSocket or Server-Sent Events to the frontend.

Subgraphs and multi-agent systems

LangGraph supports nested graphs — subgraphs as nodes in the main graph. This is the foundation of multi-agent architectures: an orchestrator and specialised agents.

subgraph_multiagent.py
from langgraph.graph import StateGraphfinance_graph = StateGraph(FinanceState)finance_graph.add_node("fetch_data", fetch_financial_data)finance_graph.add_node("analyze", analyze_financials)finance_graph.add_edge(START, "fetch_data")finance_graph.add_edge("fetch_data", "analyze")finance_graph.add_edge("analyze", END)finance_agent = finance_graph.compile()main_graph = StateGraph(OrchestratorState)main_graph.add_node("route_task", route_incoming_task)main_graph.add_node("finance_agent", finance_agent)main_graph.add_node("legal_agent", legal_agent)main_graph.add_node("synthesize", synthesize_results)

Each agent has its own state and internal logic. This approach scales better than one large prompt with all the logic, because each agent is specialised, independently testable and easy to swap out.

LangGraph vs other frameworks

CriterionLangGraphCrewAIAutoGenPydantic AI
ParadigmState graphRole-based agentsAgent conversationsTyped agents
Control flowFull, explicitLimitedLimitedFull, explicit
Human-in-the-loopNative (interrupt)ExternalVia configNative
PersistenceNative (PostgreSQL)ExternalExternalNative (PostgreSQL)
StreamingNative (astream_events)LimitedLimitedNative
DebuggingLangGraph StudioNone built-inNone built-inLogfire
Best forComplex workflows, prodFast start, role-basedResearch, prototypesPydantic stack

LangGraph — choose when you need full control, multi-step workflows with state management, human-in-the-loop and streaming in production. The most mature production option in 2026.

CrewAI — choose for a fast start with the role-based model. Good for simpler pipelines without complex routing.

AutoGen — choose for research and prototyping; the conversational model is intuitive but harder to control in production.

Pydantic AI — choose if you already use Pydantic and want a typed Python API. Growing fast, good integration with Logfire for monitoring.

Production patterns

Reflection (Self-critique) — The agent generates a response, evaluates its own quality, corrects or finishes. Implementation: a "generate" node + a "critique" node + a conditional edge checking the score.

Plan-and-Execute — A planner creates a list of steps, an executor processes them sequentially or in parallel, a synthesiser combines the results. Good for long tasks requiring many tools.

Orchestrator-Workers — An orchestrator assigns tasks, workers execute in parallel, the orchestrator collects and decides on the next step. The fan-out/fan-in equivalent.

Routing with a classifier — The first node classifies the user's intent, a conditional edge routes to a specialised agent. Instead of one large agent handling everything.

Deployment — LangGraph Platform and LangGraph Studio

LangGraph Studio is a visual debugger: you see the graph, trace state step by step, can edit state and resume from any point. Invaluable when debugging complex workflows.

LangGraph Platform is managed hosting with automatic scaling, persistent storage, a REST API for interacting with graphs, webhooks for async tasks and built-in monitoring. For self-hosted: a LangGraph application is a plain Python application — you deploy it like FastAPI with your own PostgreSQL as the checkpointer.

Common mistakes and production deployment checklist

  1. 1.Infinite loop without an exit condition — every conditional edge should have a path to END; add an iteration counter to state
  2. 2.State too large — keep only what you need for decisions; store large data externally, only a reference in state
  3. 3.No thread_id with checkpointing — without it every call is a new session with no history
  4. 4.Synchronous code in an async graph — if you use astream, all nodes must be async
  5. 5.Sensitive data leaking through state — state goes to the checkpointer; do not store tokens or PII in plain text
  6. 6.No error handling for tools — your routing logic must handle the "all tools failed" scenario
  7. 7.Define state as a Pydantic model — validation, default values, documentation
  8. 8.Always add an iteration limit to every loop (an "iteration_count" field in state)
  9. 9.Connect PostgresSaver with a dedicated table — SQLite in dev only
  10. 10.Every production call has a thread_id — session or task identifier
  11. 11.Enable human-in-the-loop on sensitive operations (send, payment, write)
  12. 12.Add tracing (LangSmith or Langfuse) — log every step and see where the agent gets lost

Key takeaways

LangGraph is the de facto standard for AI agent orchestration in Python in 2026. It turns an agent from a plain while-loop into an explicit, controllable state machine — with checkpointing, full step history, built-in human-in-the-loop, parallel branches and streaming. Key concepts: StateGraph, nodes (Python functions), conditional edges (routing), checkpointer (persistence) and Send API (parallelism). In production use PostgresSaver, always pass thread_id, stream via astream_events and add an iteration limit to every loop. LangGraph beats CrewAI and AutoGen when you need full control, complex routing and production reliability.

---

I help companies design and deploy production AI agent workflows on LangGraph — from graph architecture and checkpointer selection, through human-in-the-loop and parallel branches, to deployment, tracing with LangSmith and cost optimisation. Get in touch — I start with a free 30-minute analysis of your use case.

/// AUTHOR
Paweł Wiszniewski – AI & Web Engineer

Paweł Wiszniewski

SEO & GEO Specialist & AI Engineer

SEO/GEO specialist (10 years) and AI engineer (3 years). I build search visibility, AI systems and automations that reduce costs and improve operational efficiency.

Signal received?

Terminate
Silence

Initiate protocol. Establish connection. Let's build something loud.

> WAITING_FOR_INPUT...