What is LangGraph and how does it differ from LangChain?

LangGraph is a separate Python library from the LangChain ecosystem that adds a control-flow graph with managed state. LangChain is a set of abstractions and components (prompts, LLMs, tools, RAG) — LangGraph is the orchestration layer that says: in what order and under what conditions to run those components. You can use LangGraph without LangChain (with the raw Anthropic or OpenAI SDK), but they are most often used together. The key difference: LangChain Expression Language is sequential chains; LangGraph is graphs with loops, branches and persisted state — and that is exactly what allows you to build agents that act over many steps.

Is LangGraph suitable for production?

Yes — and that is its main advantage over other frameworks. LangGraph is used in production by companies such as Replit, Elastic, Klarna and hundreds of others. It supports native checkpointing with PostgreSQL (state survives restarts), token and state-update streaming, human-in-the-loop (interrupt/resume), async code and horizontal scaling (state is in the DB, not in server memory). LangGraph Platform adds a REST API, webhooks and automatic scaling. For complex agent workflows it is the most mature option in 2026.

What is checkpointing in LangGraph and how does it work?

Checkpointing is the mechanism that saves a snapshot of the graph state after every step. LangGraph provides ready-made implementations: SqliteSaver (development), PostgresSaver (production), MemorySaver (tests). State is saved with a key composed of thread_id and position in the graph — the same thread_id in a subsequent call automatically resumes from the point of interruption. This enables: crash recovery (the agent resumes after a server restart), time-travel debugging (roll back to any step), human-in-the-loop with suspension for days or weeks, and analytics that let you replay the full history of every session.

How do I implement human-in-the-loop in LangGraph?

You use a combination of two mechanisms: interrupt() inside a node or interrupt_before=[...] at compile time. A node calls interrupt(payload) — this stops the graph and returns the payload to the calling code; the agent waits until an external system calls it with Command(resume=answer). Compiling with interrupt_before=["node_name"] stops the graph before entering that node. In both cases state is preserved in the checkpointer — you can resume after minutes, hours or days. Key use cases: approving an email before sending, enriching data with human input, deciding on an unexpected edge case.

LangGraph or CrewAI — which should I choose?

It depends on the case and complexity. CrewAI has a lower entry bar — the "agents with roles and goals" model is intuitive and good for a fast start with simple pipelines. LangGraph gives full control: you explicitly define every step, every routing decision and every state-merge rule — but it requires more code and understanding of the graph. In production in 2026 LangGraph has the edge: native human-in-the-loop, better persistent memory, more mature streaming and LangGraph Studio for debugging. Rule of thumb: CrewAI for a prototype and simple workflows; LangGraph when the workflow is complex or you are deploying in an environment that demands high reliability.

How do I debug an agent in LangGraph?

Three main tools: LangGraph Studio, LangSmith and local state inspection. LangGraph Studio is a visual debugger — it shows the graph graphically, lets you trace state step by step, edit state and resume from any point; invaluable for complex workflows. LangSmith automatically logs every call with full tracing of every node, LLM and tool inputs and outputs. Locally: app.get_state(config) returns the current state, app.get_state_history(config) the full step history.

How does LangGraph handle parallelism?

Via the Send API — a mechanism that lets a node return a list of Send objects instead of a state update. Each Send("node_name", data) runs a node instance with the provided data in parallel. This is an implementation of the map-reduce pattern: one node generates a list of tasks (map), each task is processed in parallel, results are collected into state by a reducer (reduce). Example: 10 documents to analyse — without Send, 10 sequential LLM calls; with Send all 10 run at once and total time equals the slowest document. LangGraph automatically manages goroutines and result collection.

RETURN_TO_BLOG

2026-06-21AI & Automation 16 min

LangGraph — How to Build Production-Ready AI Agent Workflows

LangGraph is a Python library from LangChain that lets you build AI agents as directed graphs with managed state — where each node is a step of logic, and edges (including conditional ones) decide the flow. It is the de facto standard for building production agent workflows in 2026, because it solves the fundamental problem of simple agent loops: no control over state, hard to debug, and no way to resume interrupted tasks. Instead of a while-loop with an LLM inside, you get an explicit, controllable state machine — with the ability to pause, resume, run parallel branches and built-in checkpointing.

Complete guide to LangGraph: how StateGraph works with nodes, edges and conditional routing, the ReAct pattern step by step, human-in-the-loop with interrupt/resume, PostgreSQL persistence with checkpointers, parallel branches with Send API, streaming, and comparison with CrewAI, AutoGen and Pydantic AI.

Imagine an agent that analyses a financial report: first it extracts data, then in parallel checks two databases, then — if the data is contradictory — asks a human for a resolution, and finally generates the result. Doing that in a plain while-loop is a maze of if-else. In LangGraph it is a readable graph with five nodes. This article will walk you through the key concepts, production patterns and a comparison with other frameworks.

What is LangGraph and when to use it

/// LANGGRAPH — STATE GRAPH FLOW (REACT AGENT)

START

Entry point

agent_node

LLM call

should_continue?

Conditional edge

yes (tools)

END

tool_node

Tool execution

END

Final result

↺ tool_node → agent_node (loop)

KEY API

StateGraph— State schema (TypedDict / Pydantic)

add_node()— Register node — any Python function

add_edge()— Unconditional connection A→B

add_conditional_edges()— Routing via a router function

compile()— Finalize graph, optional checkpointer

LangGraph extends LangChain with a control-flow graph with managed state. Key concepts:

StateGraph — the main class; you define the state schema (TypedDict or Pydantic) and add nodes
Node — a Python function that receives the state and returns an update
Edge — a connection between nodes; can be unconditional or conditional
Conditional edge — a routing function that chooses the next node based on state
START / END — special framework nodes; every graph begins at START and ends at END

LangGraph is the right choice when: - The agent needs to act multiple times (loops, iterations, retries) - You need human-in-the-loop (approvals, corrections mid-flight) - The workflow has conditional branches or parallel lanes - The task can be interrupted and resumed (long operations, asynchronous) - You want to debug and visualise the flow step by step

When LangGraph is overkill: for a simple agent with a single loop, plain Python with an LLM and tool calling is enough. LangGraph shines with complex, multi-step flows.

First graph in LangGraph — ReAct agent from scratch

The ReAct pattern (Reasoning + Acting) is the most popular starting point: the agent thinks, decides on an action, executes a tool, observes the result and moves forward — or finishes. In LangGraph this maps naturally to a graph.

react_agent.py

from typing import TypedDict, Annotated, Sequencefrom langgraph.graph import StateGraph, START, ENDfrom langgraph.prebuilt import ToolNodefrom langchain_anthropic import ChatAnthropicfrom langchain_core.messages import BaseMessage, HumanMessageimport operatorclass AgentState(TypedDict):    messages: Annotated[Sequence[BaseMessage], operator.add]model = ChatAnthropic(model="claude-haiku-4-5-20251001").bind_tools(tools)def agent_node(state: AgentState) -> dict:    response = model.invoke(state["messages"])    return {"messages": [response]}def should_continue(state: AgentState) -> str:    last = state["messages"][-1]    if last.tool_calls:        return "tools"    return ENDgraph = StateGraph(AgentState)graph.add_node("agent", agent_node)graph.add_node("tools", ToolNode(tools))graph.add_edge(START, "agent")graph.add_conditional_edges("agent", should_continue)graph.add_edge("tools", "agent")app = graph.compile()

This example shows a complete ReAct cycle in 30 lines. The key element is the should_continue function — conditional routing that decides whether to go to the tools node or finish. ToolNode is a prebuilt utility that automatically invokes tools from tool calls in the message.

State management — the foundation of LangGraph

State in LangGraph is a managed structure with update-merge rules. The annotation Annotated[list, operator.add] tells LangGraph to append new items to the existing list on each node update instead of overwriting it.

state_management.py

from typing import TypedDict, Annotatedfrom langgraph.graph import StateGraphimport operatorclass WorkflowState(TypedDict):    messages: Annotated[list, operator.add]    current_step: str    tool_results: Annotated[list, operator.add]    final_answer: str | NoneBest practices for state management:- Keep state flat and simple — deeply nested objects are harder to debug- Use Pydantic instead of TypedDict for validation and default values- Each node returns only the keys it changed — no need to return the whole state- Store in state the context needed for routing decisions, not just output data

Persistence — checkpointing and resumption

This is the feature that separates LangGraph from plain loops. A checkpointer saves a snapshot of state after every step — so you can resume an interrupted session, review step history for debugging, roll back to a previous state, and implement human-in-the-loop with suspension.

checkpointing.py

from langgraph.checkpoint.sqlite import SqliteSaverfrom langgraph.checkpoint.postgres import PostgresSaver# Developmentmemory_saver = SqliteSaver.from_conn_string(":memory:")app = graph.compile(checkpointer=memory_saver)# Productionwith PostgresSaver.from_conn_string("postgresql://user:pass@host/db") as checkpointer:    checkpointer.setup()    app = graph.compile(checkpointer=checkpointer)config = {"configurable": {"thread_id": "session-abc-123"}}result = app.invoke({"messages": [HumanMessage("Analyse Q1 report")]}, config)

thread_id is the session key — the same ID continues the same session, a new ID starts a new independent session. In production always pass thread_id — without it every call is an isolated session with no history.

Human-in-the-loop — suspend and approve

/// LANGGRAPH vs CREWAI vs AUTOGEN vs PYDANTIC AI

LangGraph

PRODUCTION

ParadigmState graph

ControlFull, explicit

Human-in-loopNative (interrupt)

PersistenceNative (PostgreSQL)

Best forComplex workflows, prod

CrewAI

FAST START

ParadigmRole-based agents

ControlLimited

Human-in-loopExternal

PersistenceExternal

Best forPrototypes, role-based

AutoGen

RESEARCH

ParadigmAgent conversations

ControlLimited

Human-in-loopVia config

PersistenceExternal

Best forResearch, prototypes

Pydantic AI

RISING

ParadigmTyped agents

ControlFull, explicit

Human-in-loopNative

PersistenceNative (PostgreSQL)

Best forPydantic stack, Logfire

One of the most powerful patterns: the agent suspends before a sensitive operation and waits for human approval.

human_in_the_loop.py

from langgraph.types import interrupt, Commanddef review_before_send(state: WorkflowState) -> dict:    draft_email = state["draft_email"]    approved = interrupt({        "action": "review_email",        "draft": draft_email,        "message": "Approve or revise the email before sending"    })    if approved["decision"] == "approve":        return {"send_email": True}    return {"send_email": False, "draft_email": approved.get("revised", draft_email)}app = graph.compile(interrupt_before=["review_before_send"])app.invoke(    Command(resume={"decision": "approve"}),    config)

The human-in-the-loop pattern is indispensable in production systems: the agent can prepare an email, a quote or a purchase decision — and stop before the actual action. A human approves, corrects or rejects. Only then does the system proceed.

Parallel branches — Send API and fan-out

LangGraph lets you run nodes in parallel using the Send API. The map-reduce pattern: one node generates a list of items, each is processed in parallel, results are collected.

parallel_branches.py

from langgraph.types import Senddef generate_tasks(state: WorkflowState) -> list:    documents = state["documents"]    return [Send("analyze_doc", {"doc": doc}) for doc in documents]def analyze_doc(state: dict) -> dict:    doc = state["doc"]    analysis = llm.invoke("Analyse: " + doc["content"])    return {"analyses": [{"doc_id": doc["id"], "result": analysis.content}]}graph.add_conditional_edges("prepare_tasks", generate_tasks)

Instead of processing 10 documents sequentially (10× time), LangGraph runs them in parallel — total time equals the slowest single document. Key use cases: batch processing, many API calls at once, parallel analysis of multiple sources.

Streaming — real-time results

LangGraph supports streaming at multiple levels — from tokens to state updates to node-specific events.

streaming.py

async for chunk in app.astream_events(    {"messages": [HumanMessage("Analyse the data")]},    config,    version="v2"):    if chunk["event"] == "on_chat_model_stream":        token = chunk["data"]["chunk"].content        print(token, end="", flush=True)    elif chunk["event"] == "on_chain_end" and chunk["name"] == "agent":        print("\n[Agent finished step]")async for state_update in app.astream(inputs, config):    node_name = list(state_update.keys())[0]    print("Node " + node_name + " finished")

Streaming is a must in UI — the user sees progress instead of waiting tens of seconds for the final result. In practice you stream via WebSocket or Server-Sent Events to the frontend.

Subgraphs and multi-agent systems

LangGraph supports nested graphs — subgraphs as nodes in the main graph. This is the foundation of multi-agent architectures: an orchestrator and specialised agents.

subgraph_multiagent.py

from langgraph.graph import StateGraphfinance_graph = StateGraph(FinanceState)finance_graph.add_node("fetch_data", fetch_financial_data)finance_graph.add_node("analyze", analyze_financials)finance_graph.add_edge(START, "fetch_data")finance_graph.add_edge("fetch_data", "analyze")finance_graph.add_edge("analyze", END)finance_agent = finance_graph.compile()main_graph = StateGraph(OrchestratorState)main_graph.add_node("route_task", route_incoming_task)main_graph.add_node("finance_agent", finance_agent)main_graph.add_node("legal_agent", legal_agent)main_graph.add_node("synthesize", synthesize_results)

Each agent has its own state and internal logic. This approach scales better than one large prompt with all the logic, because each agent is specialised, independently testable and easy to swap out.

LangGraph vs other frameworks

Criterion	LangGraph	CrewAI	AutoGen	Pydantic AI
Paradigm	State graph	Role-based agents	Agent conversations	Typed agents
Control flow	Full, explicit	Limited	Limited	Full, explicit
Human-in-the-loop	Native (interrupt)	External	Via config	Native
Persistence	Native (PostgreSQL)	External	External	Native (PostgreSQL)
Streaming	Native (astream_events)	Limited	Limited	Native
Debugging	LangGraph Studio	None built-in	None built-in	Logfire
Best for	Complex workflows, prod	Fast start, role-based	Research, prototypes	Pydantic stack

LangGraph — choose when you need full control, multi-step workflows with state management, human-in-the-loop and streaming in production. The most mature production option in 2026.

CrewAI — choose for a fast start with the role-based model. Good for simpler pipelines without complex routing.

AutoGen — choose for research and prototyping; the conversational model is intuitive but harder to control in production.

Pydantic AI — choose if you already use Pydantic and want a typed Python API. Growing fast, good integration with Logfire for monitoring.

Production patterns

Reflection (Self-critique) — The agent generates a response, evaluates its own quality, corrects or finishes. Implementation: a "generate" node + a "critique" node + a conditional edge checking the score.

Plan-and-Execute — A planner creates a list of steps, an executor processes them sequentially or in parallel, a synthesiser combines the results. Good for long tasks requiring many tools.

Orchestrator-Workers — An orchestrator assigns tasks, workers execute in parallel, the orchestrator collects and decides on the next step. The fan-out/fan-in equivalent.

Routing with a classifier — The first node classifies the user's intent, a conditional edge routes to a specialised agent. Instead of one large agent handling everything.

Deployment — LangGraph Platform and LangGraph Studio

LangGraph Studio is a visual debugger: you see the graph, trace state step by step, can edit state and resume from any point. Invaluable when debugging complex workflows.

LangGraph Platform is managed hosting with automatic scaling, persistent storage, a REST API for interacting with graphs, webhooks for async tasks and built-in monitoring. For self-hosted: a LangGraph application is a plain Python application — you deploy it like FastAPI with your own PostgreSQL as the checkpointer.

Common mistakes and production deployment checklist

1.Infinite loop without an exit condition — every conditional edge should have a path to END; add an iteration counter to state
2.State too large — keep only what you need for decisions; store large data externally, only a reference in state
3.No thread_id with checkpointing — without it every call is a new session with no history
4.Synchronous code in an async graph — if you use astream, all nodes must be async
5.Sensitive data leaking through state — state goes to the checkpointer; do not store tokens or PII in plain text
6.No error handling for tools — your routing logic must handle the "all tools failed" scenario
7.Define state as a Pydantic model — validation, default values, documentation
8.Always add an iteration limit to every loop (an "iteration_count" field in state)
9.Connect PostgresSaver with a dedicated table — SQLite in dev only
10.Every production call has a thread_id — session or task identifier
11.Enable human-in-the-loop on sensitive operations (send, payment, write)
12.Add tracing (LangSmith or Langfuse) — log every step and see where the agent gets lost

Key takeaways

LangGraph is the de facto standard for AI agent orchestration in Python in 2026. It turns an agent from a plain while-loop into an explicit, controllable state machine — with checkpointing, full step history, built-in human-in-the-loop, parallel branches and streaming. Key concepts: StateGraph, nodes (Python functions), conditional edges (routing), checkpointer (persistence) and Send API (parallelism). In production use PostgresSaver, always pass thread_id, stream via astream_events and add an iteration limit to every loop. LangGraph beats CrewAI and AutoGen when you need full control, complex routing and production reliability.

---

I help companies design and deploy production AI agent workflows on LangGraph — from graph architecture and checkpointer selection, through human-in-the-loop and parallel branches, to deployment, tracing with LangSmith and cost optimisation. Get in touch — I start with a free 30-minute analysis of your use case.

/// RELATED_RECORDS

AI & Automation

How AI Reads Invoices from Email and Enters Them into ERP

AI can automatically read an invoice from an email attachment — PDF, scan, or phone photo — and enter the data directly into an ERP system without any manual retyping. Full automation of cost invoice processing: from the mailbox to accounting.

10 min

AI & Automation

Where to Start with AI Implementation in Your Company

AI implementation starts not with choosing a tool, but with identifying one repetitive process that wastes the most human time. Learn step by step how to select, map, and automate that process.

8 min

AI & Automation

How to Build a Company Internal Knowledge Base with AI (RAG in Practice)

An internal knowledge base built on RAG lets you create your own company chatbot that answers only from your company's documents — not the model's guesses. Safe, up-to-date, precise AI with full control over your data.

11 min

/// AUTHOR

Paweł Wiszniewski

SEO & GEO Specialist & AI Engineer

SEO/GEO specialist (10 years) and AI engineer (3 years). I build search visibility, AI systems and automations that reduce costs and improve operational efficiency.

LinkedIn Facebook

Signal received?

Terminate
Silence

Initiate protocol. Establish connection. Let's build something loud.

> WAITING_FOR_INPUT...

BIAŁYSTOK, PL

+48 732 022 086 pawel.wiszniewski95@gmail.com

What is LangGraph and when to use it

First graph in LangGraph — ReAct agent from scratch

State management — the foundation of LangGraph

Persistence — checkpointing and resumption

Human-in-the-loop — suspend and approve

Parallel branches — Send API and fan-out

Streaming — real-time results

Subgraphs and multi-agent systems

LangGraph vs other frameworks

Production patterns

Deployment — LangGraph Platform and LangGraph Studio

Common mistakes and production deployment checklist

Key takeaways

/// RELATED_RECORDS

How AI Reads Invoices from Email and Enters Them into ERP

Where to Start with AI Implementation in Your Company

How to Build a Company Internal Knowledge Base with AI (RAG in Practice)

Signal received?

TerminateSilence

Terminate
Silence