Case study

Enterprise AI knowledge platform

Semantic document retrieval and AI-assisted querying over internal content, built on structure-aware chunking pipelines, vector search, and scalable ingestion with FastAPI and Qdrant.

Systems engineer — retrieval stack, ingestion, evaluation hooks

Retrieval: Dense + metadata filters
Vector store: Qdrant
Serving: FastAPI (async)

Reference architecture

Problem

  • Teams needed answers grounded in internal PDFs, wikis, and tickets without leaking across tenant boundaries.
  • Naive chunking made retrieval brittle: tables were split incorrectly, headers were orphaned from their bodies, and near-duplicate chunks inflated latency.

Architecture

  • Ingestion workers normalize files, extract structure-aware chunks, and attach metadata (source, ACL, section path, hash).
  • Embedding requests are batched through the OpenAI API with retry backoff; writes are idempotent, keyed by content hash.
  • Qdrant stores dense vectors with payload filters for tenant, product line, and sensitivity class.
  • FastAPI exposes query, feedback, and admin reindex endpoints; LangChain composes retrievers, rerankers, and citation formatting.
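
The hash-keyed, backoff-wrapped embedding step can be illustrated with a minimal sketch. This is not the production code: `RateLimited`, `content_hash`, `with_backoff`, and `embed_new_chunks` are hypothetical names, the `store` dict stands in for the vector store, and `embed_batch` stands in for whatever client call actually produces the vectors. The point is only that re-running ingestion over unchanged content performs no new embedding calls.

```python
import hashlib
import random
import time
from typing import Callable, Dict, List, Sequence, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical stand-in for the embedding client's rate-limit error."""


def content_hash(text: str) -> str:
    """Stable key for a chunk: SHA-256 of the whitespace-normalized text."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def with_backoff(
    call: Callable[[], T],
    retries: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on RateLimited with exponential backoff plus jitter."""
    if retries < 1:
        raise ValueError("retries must be >= 1")
    for attempt in range(retries):
        try:
            return call()
        except RateLimited:
            if attempt == retries - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
    raise AssertionError("unreachable")


def embed_new_chunks(
    chunks: Sequence[str],
    store: Dict[str, List[float]],
    embed_batch: Callable[[List[str]], List[List[float]]],
    batch_size: int = 64,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Embed only chunks whose hash is not yet stored; return how many were written."""
    pending: Dict[str, str] = {}
    for text in chunks:
        key = content_hash(text)
        if key not in store and key not in pending:
            pending[key] = text

    keys = list(pending)
    for start in range(0, len(keys), batch_size):
        batch_keys = keys[start:start + batch_size]
        texts = [pending[k] for k in batch_keys]
        vectors = with_backoff(lambda: embed_batch(texts), sleep=sleep)
        for key, vector in zip(batch_keys, vectors):
            store[key] = vector  # same content always maps to the same key
    return len(keys)
```

Because the key is derived from the content rather than from a file path or upload time, a retried or repeated job overwrites nothing and skips everything it has already embedded.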

Challenges

  • Chunking strategy: balance recall on long policy PDFs with precision on short runbooks.
  • Cold-start reindex: backfill millions of tokens without saturating embedding rate limits.
  • Grounding: force answer components to cite chunk IDs; surface "insufficient context" instead of hallucinating.
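
The grounding requirement above can be sketched as a post-generation check. This is an assumption-laden illustration, not the system's actual validator: it assumes citations are written inline as `[chunk_id]` before the sentence's closing punctuation, and `check_grounding` and `INSUFFICIENT` are hypothetical names. Any sentence without a citation, or citing a chunk that was not retrieved, causes the whole answer to be replaced by the fallback.

```python
import re
from typing import Dict, List, Set, Union

INSUFFICIENT = "insufficient context"

# Assumed citation format: a chunk ID in square brackets, e.g. "[c12]".
CITATION = re.compile(r"\[([A-Za-z0-9_\-]+)\]")

Result = Dict[str, Union[str, List[str]]]


def check_grounding(answer: str, retrieved_ids: Set[str]) -> Result:
    """Return the answer with its citations, or the fallback if any sentence is ungrounded."""
    fallback: Result = {"answer": INSUFFICIENT, "citations": []}

    # Naive sentence split on ., !, or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    if not sentences:
        return fallback

    cited: List[str] = []
    for sentence in sentences:
        ids = CITATION.findall(sentence)
        # Reject uncited sentences and citations to chunks that were never retrieved.
        if not ids or any(chunk_id not in retrieved_ids for chunk_id in ids):
            return fallback
        cited.extend(ids)

    return {"answer": answer, "citations": sorted(set(cited))}
```

For example, `check_grounding("Refunds take 5 days [c1]. Escalate to tier 2.", {"c1"})` returns the fallback because the second sentence carries no citation.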

Technologies

  • FastAPI
  • LangChain
  • OpenAI APIs
  • Qdrant
  • PostgreSQL
  • Docker
  • AWS

Engineering decisions

  • Structure-aware chunks with metadata instead of larger naive page-sized chunks; this improved mean reciprocal rank (MRR) on internal eval sets.
  • Server-side ACL filtering in Qdrant payloads rather than post-filtering in Python to keep latency predictable.
  • Logged retrieval traces (query, filters, top-k ids) for offline eval and regression tests on golden questions.
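
As a rough sketch of how these decisions fit together, the snippet below builds a filter in the shape of Qdrant's filter JSON (`must` conditions with `key` and `match`) and computes MRR over logged traces. The payload field names (`tenant_id`, `sensitivity`), the trace layout, and the helper names are assumptions for illustration; the source does not specify them.

```python
from typing import Dict, List, Set


def tenant_filter(tenant_id: str, sensitivity_classes: List[str]) -> Dict:
    """Filter in Qdrant's JSON shape; field names here are hypothetical."""
    return {
        "must": [
            {"key": "tenant_id", "match": {"value": tenant_id}},
            {"key": "sensitivity", "match": {"any": sensitivity_classes}},
        ]
    }


def reciprocal_rank(top_k_ids: List[str], relevant_ids: Set[str]) -> float:
    """1 / rank of the first relevant chunk, or 0.0 if none was retrieved."""
    for rank, chunk_id in enumerate(top_k_ids, start=1):
        if chunk_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(traces: List[Dict], golden: Dict[str, Set[str]]) -> float:
    """Average reciprocal rank over traces whose query has a golden answer set."""
    scores = [
        reciprocal_rank(trace["top_k_ids"], golden[trace["query"]])
        for trace in traces
        if trace["query"] in golden
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Illustrative traces in the logged form (query, filters, top-k ids); data is made up.
acme = tenant_filter("acme", ["public", "internal"])
traces = [
    {"query": "q1", "filters": acme, "top_k_ids": ["c7", "c2", "c9"]},  # hit at rank 2
    {"query": "q2", "filters": acme, "top_k_ids": ["c4", "c1"]},        # hit at rank 1
    {"query": "q3", "filters": acme, "top_k_ids": ["c5", "c6"]},        # no hit
]
golden = {"q1": {"c2"}, "q2": {"c4"}, "q3": {"c8"}}

mrr = mean_reciprocal_rank(traces, golden)  # (1/2 + 1 + 0) / 3 = 0.5
```

Storing the filter alongside the query and top-k IDs lets a regression test replay a golden question under the same tenant scope and compare the resulting MRR against a previous run.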

Outcome

  • Production retrieval path with traceable citations and tenant-safe filtering.
  • Repeatable reindex jobs and versioned embedding models for controlled upgrades.
