Case study

Multi-agent AI workflow system

Orchestrated agents with explicit task graphs, shared memory boundaries, and durable execution—LangGraph and CrewAI coordinated behind FastAPI with Redis-backed checkpoints.

Platform engineer — orchestration, state, observability

Orchestration: LangGraph + CrewAI
Durability: Redis checkpoints
Transport: SSE progress stream

Reference architecture

Problem

  • Single-shot prompts could not decompose long operational runbooks into verifiable substeps.
  • Teams needed delegation patterns: researcher, verifier, and executor roles with different tool access.

Architecture

  • LangGraph defines the state machine: nodes are agents or tools; edges encode branching and human approval gates.
  • CrewAI crews encapsulate role prompts and toolkits for bounded subtasks invoked from graph nodes.
  • Redis stores run checkpoints, dedupe keys, and distributed locks for fan-out/fan-in segments.
  • FastAPI exposes run-management endpoints and streams progress events and partial artifacts to clients over SSE.
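
The shape of this architecture can be illustrated with a minimal sketch in plain Python. It does not use the LangGraph, CrewAI, or FastAPI APIs: the node functions, the `approved` flag, and the event names `node_finished`, `run_paused`, and `run_finished` are all hypothetical stand-ins. It shows only the idea of nodes connected by explicit edges, a human approval gate on one edge, and one SSE frame emitted per completed node.

```python
import json
from typing import Callable, Dict, Iterator, Optional

State = Dict[str, object]
Node = Callable[[State], State]
Router = Callable[[State], Optional[str]]  # returns the next node name, or None to stop


def format_sse(event: str, data: dict) -> str:
    """Frame one Server-Sent Events message: 'event:' and 'data:' lines, then a blank line."""
    return f"event: {event}\ndata: {json.dumps(data, sort_keys=True)}\n\n"


# Hypothetical agents; in the real system each would wrap a crew or a tool.
def research(state: State) -> State:
    return {**state, "findings": f"notes on {state['task']}"}


def verify(state: State) -> State:
    return {**state, "verified": "findings" in state}


def execute(state: State) -> State:
    return {**state, "result": "applied"}


NODES: Dict[str, Node] = {"research": research, "verify": verify, "execute": execute}

EDGES: Dict[str, Router] = {
    "research": lambda s: "verify",
    # Human approval gate: continue to execution only once an operator has approved.
    "verify": lambda s: "execute" if s.get("verified") and s.get("approved") else None,
    "execute": lambda s: None,
}


def run(state: State, start: str = "research") -> Iterator[str]:
    """Walk the graph from `start`, yielding one SSE frame per completed node."""
    current: Optional[str] = start
    while current is not None:
        state = NODES[current](state)
        yield format_sse("node_finished", {"node": current})
        current = EDGES[current](state)
    # No result means the walk stopped at the approval gate rather than finishing.
    yield format_sse("run_finished" if "result" in state else "run_paused", {})
```

Calling `list(run({"task": "rotate keys"}))` stops at the gate after the verify node, while passing `"approved": True` in the initial state lets the walk continue through the execute node. Because every transition is an entry in `EDGES`, the full set of possible paths can be read without running anything.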

Challenges

  • Agent memory: separate short-term thread state from durable facts written only after validation.
  • Failure recovery: resume from last good checkpoint without double-applying side effects.
  • Observability: correlate spans across agents with a single run_id propagated on every tool call.
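
The failure-recovery challenge can be sketched as follows, assuming a dedupe key derived from the run ID and step name. The in-memory containers stand in for Redis, and `apply_once`, `run_steps`, and the step names are hypothetical, not the project's actual code. The point is that a resumed run starts from the last saved checkpoint and skips any side effect whose key is already recorded.

```python
from typing import Callable, Dict, List, Optional, Set

# In-memory stand-ins for Redis (assumption for this sketch). A real deployment
# would need atomic operations, for example Redis SET with the NX option.
checkpoints: Dict[str, int] = {}   # run_id -> index of the next step to run
dedupe_keys: Set[str] = set()      # keys of side effects already applied
applied: List[str] = []            # record of side effects, for demonstration


def apply_once(run_id: str, step: str, effect: Callable[[], None]) -> bool:
    """Apply `effect` only if this (run_id, step) pair has not been applied before."""
    key = f"{run_id}:{step}"
    if key in dedupe_keys:
        return False
    effect()
    dedupe_keys.add(key)
    return True


def run_steps(run_id: str, steps: List[str], crash_after: Optional[str] = None) -> None:
    """Run steps from the last checkpoint; optionally simulate a crash after one step."""
    start = checkpoints.get(run_id, 0)
    for index in range(start, len(steps)):
        step = steps[index]
        apply_once(run_id, step, lambda: applied.append(step))
        if step == crash_after:
            # The side effect happened, but the checkpoint was never advanced.
            raise RuntimeError(f"simulated crash after {step}")
        checkpoints[run_id] = index + 1


steps = ["drain", "patch", "restore"]
try:
    run_steps("run-1", steps, crash_after="patch")
except RuntimeError:
    pass

# Resuming re-enters at "patch", but its dedupe key prevents a second application.
run_steps("run-1", steps)
```

After the resumed run, `applied` holds each step exactly once. The sketch records the dedupe key after the effect, so a crash between those two lines could still repeat the effect. Closing that window requires either idempotent effects or an atomic claim of the key before the effect runs.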

Technologies

  • LangGraph
  • CrewAI
  • FastAPI
  • Redis
  • PostgreSQL

Engineering decisions

  • Chose explicit graphs over implicit agent loops to make behavior reviewable by security stakeholders.
  • Enforced tool allowlists per role; execution agents have no blanket internet access.
  • Required structured outputs at boundaries between agents to reduce ambiguous handoffs.
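
A minimal sketch of the last two decisions, using only the Python standard library. The role names follow the researcher, verifier, and executor roles described earlier. The tool names, the `VerifiedStep` fields, and the helper functions are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet

# Hypothetical allowlists: each role may call only the tools named here.
ALLOWLISTS: Dict[str, FrozenSet[str]] = {
    "researcher": frozenset({"search_docs", "read_runbook"}),
    "verifier": frozenset({"read_runbook", "run_check"}),
    "executor": frozenset({"apply_change"}),  # no network-facing tools
}

# Stub tools that return strings so the sketch runs without external services.
TOOLS: Dict[str, Callable[[str], str]] = {
    "search_docs": lambda arg: f"results for {arg}",
    "read_runbook": lambda arg: f"contents of {arg}",
    "run_check": lambda arg: f"check {arg}: ok",
    "apply_change": lambda arg: f"applied {arg}",
}


class ToolNotAllowed(PermissionError):
    """Raised when a role calls a tool outside its allowlist."""


def call_tool(role: str, tool: str, arg: str) -> str:
    """Dispatch a tool call only if the role's allowlist contains the tool."""
    if tool not in ALLOWLISTS.get(role, frozenset()):
        raise ToolNotAllowed(f"{role} may not call {tool}")
    return TOOLS[tool](arg)


@dataclass(frozen=True)
class VerifiedStep:
    """Structured handoff from the verifier to the executor."""
    step_id: str
    command: str
    verified: bool

    def __post_init__(self) -> None:
        # Reject incomplete handoffs at construction time.
        if not self.step_id or not self.command:
            raise ValueError("step_id and command must be non-empty")


def execute(step: VerifiedStep) -> str:
    """Run a step as the executor role, refusing anything not yet verified."""
    if not step.verified:
        raise ValueError(f"step {step.step_id} has not been verified")
    return call_tool("executor", "apply_change", step.command)
```

With these definitions, `execute(VerifiedStep("s1", "restart api", True))` succeeds, while `call_tool("executor", "search_docs", "x")` raises `ToolNotAllowed`. A typed handoff object means the executor never has to parse free-form text from the verifier.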

Outcome

  • Durable multi-step workflows suitable for production change management and document-heavy pipelines.
  • Operators can inspect each node output before downstream agents proceed when policies require it.
