Case study

Enterprise AI knowledge platform

Semantic document retrieval, contextual search, and AI-assisted querying over embeddings, built on chunking pipelines, vector search, and scalable ingestion with FastAPI and Qdrant.

Role: Systems engineer (retrieval stack, ingestion, evaluation hooks)

Retrieval: Dense + metadata filters
Vector store: Qdrant
Serving: FastAPI (async)

Reference architecture

Problem

—Teams needed answers grounded in internal PDFs, wikis, and tickets without leaking across tenant boundaries.
—Naive chunking produced brittle retrieval: tables split incorrectly, headers orphaned from bodies, and duplicate near-identical chunks inflated latency.

Architecture

—Ingestion workers normalize files, extract structure-aware chunks, and attach metadata (source, ACL, section path, hash).
—Embedding requests are batched through the OpenAI API with backoff, and writes are idempotent, keyed by content hash.
—Qdrant stores dense vectors with payload filters for tenant, product line, and sensitivity class.
—FastAPI exposes query, feedback, and admin reindex endpoints; LangChain composes retrievers, rerankers, and citation formatting.
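
The ingestion and storage steps above can be sketched in plain Python. This is a minimal illustration, not the production code: the `Chunk` fields, the payload keys (`tenant_id`, `sensitivity`, and so on), the namespace URL, and the choice to fold the tenant into the hash are all assumptions made for the example. The dictionaries follow the JSON shapes Qdrant's REST API accepts for points and filters, and the point ID is a UUID derived from the content hash, so re-ingesting an unchanged chunk overwrites the same point instead of adding a duplicate.

```python
from __future__ import annotations

import hashlib
import uuid
from dataclasses import dataclass

# Hypothetical namespace so derived point IDs stay stable across reindex runs.
CHUNK_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.invalid/kb-chunks")


@dataclass(frozen=True)
class Chunk:
    """One structure-aware chunk plus the metadata attached at ingestion."""
    text: str
    source: str
    section_path: tuple[str, ...]
    tenant_id: str
    sensitivity: str


def content_hash(chunk: Chunk) -> str:
    """SHA-256 over tenant, source, section path, and whitespace-normalized text."""
    normalized = " ".join(chunk.text.split())
    # Assumption: the tenant is part of the key, so identical text owned by
    # two tenants never collapses into a single shared point.
    key = "\x1f".join(
        [chunk.tenant_id, chunk.source, "/".join(chunk.section_path), normalized]
    )
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def point_id(chunk: Chunk) -> str:
    """Deterministic UUID: Qdrant point IDs must be unsigned integers or UUIDs."""
    return str(uuid.uuid5(CHUNK_NAMESPACE, content_hash(chunk)))


def to_point(chunk: Chunk, vector: list[float]) -> dict:
    """Build a point in the id / vector / payload shape used for upserts."""
    return {
        "id": point_id(chunk),
        "vector": vector,
        "payload": {
            "tenant_id": chunk.tenant_id,
            "source": chunk.source,
            "section_path": list(chunk.section_path),
            "sensitivity": chunk.sensitivity,
            "content_hash": content_hash(chunk),
            "text": chunk.text,
        },
    }


def tenant_filter(tenant_id: str, allowed_sensitivity: set[str]) -> dict:
    """Filter applied server-side: exact tenant match plus allowed classes."""
    return {
        "must": [
            {"key": "tenant_id", "match": {"value": tenant_id}},
            {"key": "sensitivity", "match": {"any": sorted(allowed_sensitivity)}},
        ]
    }
```

Because the filter is built from the caller's identity rather than from the query text, a request handler can attach it to every search before the vector store sees the query.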

Challenges

—Chunking strategy: balance recall on long policy PDFs with precision on short runbooks.
—Cold-start reindex: backfill millions of tokens without saturating embedding rate limits.
—Grounding: force answer components to cite chunk IDs; surface "insufficient context" instead of hallucinating.
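
One way to enforce the grounding rule is a post-generation check that rejects any draft whose sentences lack a valid citation. The sketch below is illustrative only: the `[chunk:<id>]` marker format, the `enforce_grounding` name, and the sentence splitter are assumptions, and it expects each citation to appear before the sentence's closing punctuation.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Hypothetical citation marker, e.g. "[chunk:a1]".
CITATION = re.compile(r"\[chunk:([A-Za-z0-9_-]+)\]")
INSUFFICIENT = "insufficient context"


@dataclass
class GroundedAnswer:
    text: str
    cited_ids: list[str]
    grounded: bool


def enforce_grounding(draft: str, retrieved_ids: set[str]) -> GroundedAnswer:
    """Accept a draft only if every sentence cites a chunk that was retrieved."""
    # Naive splitter: break after ., !, or ? followed by whitespace.
    sentences = [
        s.strip() for s in re.split(r"(?<=[.!?])\s+", draft.strip()) if s.strip()
    ]
    if not sentences:
        return GroundedAnswer(INSUFFICIENT, [], False)

    cited: list[str] = []
    for sentence in sentences:
        ids = CITATION.findall(sentence)
        # Reject uncited sentences and citations to chunks never retrieved.
        if not ids or any(chunk_id not in retrieved_ids for chunk_id in ids):
            return GroundedAnswer(INSUFFICIENT, [], False)
        cited.extend(ids)

    # Deduplicate while preserving first-seen order.
    return GroundedAnswer(draft.strip(), list(dict.fromkeys(cited)), True)
```

Failing closed keeps the behavior simple to test: a single unsupported sentence turns the whole response into the "insufficient context" message rather than a partially cited answer.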

Technologies

—FastAPI
—LangChain
—OpenAI APIs
—Qdrant
—PostgreSQL
—Docker
—AWS

Engineering decisions

—Chose structure-aware chunks with metadata over larger, naively split page-sized chunks; this improved MRR on internal eval sets.
—Server-side ACL filtering in Qdrant payloads rather than post-filtering in Python to keep latency predictable.
—Logged retrieval traces (query, filters, top-k ids) for offline eval and regression tests on golden questions.
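
Logged traces make the MRR figure reproducible offline. As a rough sketch, assuming each trace is a dictionary with hypothetical `query` and `top_k_ids` keys and the golden set maps each question to its relevant chunk IDs, mean reciprocal rank can be computed like this:

```python
from __future__ import annotations


def reciprocal_rank(ranked_ids: list[str], relevant_ids: set[str]) -> float:
    """Return 1/rank of the first relevant chunk, or 0.0 if none was retrieved."""
    for rank, chunk_id in enumerate(ranked_ids, start=1):
        if chunk_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(traces: list[dict], golden: dict[str, set[str]]) -> float:
    """Average reciprocal rank over traces whose query is in the golden set."""
    scores = [
        reciprocal_rank(trace["top_k_ids"], golden[trace["query"]])
        for trace in traces
        if trace["query"] in golden
    ]
    return sum(scores) / len(scores) if scores else 0.0
```

A regression test can then compare the score on the golden questions against a stored baseline and fail when it drops by more than a chosen tolerance.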

Outcome

—Production retrieval path with traceable citations and tenant-safe filtering.
—Repeatable reindex jobs and versioned embedding models for controlled upgrades.
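
Versioned embedding models could be handled in several ways. One option, shown here purely as an assumption and not as the approach this project took, is to write each model version into its own Qdrant collection and repoint a stable alias once the reindex finishes. The naming scheme and helper names are hypothetical, and the model name in the usage note is only an example; the returned dictionary follows the request body shape of Qdrant's collection alias update endpoint.

```python
from __future__ import annotations

import re


def collection_name(base: str, model: str, version: int) -> str:
    """Derive a collection name that encodes the embedding model and version."""
    slug = re.sub(r"[^a-z0-9]+", "_", model.lower()).strip("_")
    return f"{base}__{slug}__v{version}"


def alias_swap_actions(alias: str, new_collection: str) -> dict:
    """Request body that moves a stable alias to a freshly built collection."""
    return {
        "actions": [
            {"delete_alias": {"alias_name": alias}},
            {
                "create_alias": {
                    "collection_name": new_collection,
                    "alias_name": alias,
                }
            },
        ]
    }
```

For example, `collection_name("kb", "text-embedding-3-small", 2)` yields `kb__text_embedding_3_small__v2`. Queries keep targeting the alias, so rolling back means swapping the alias to the previous collection.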
