Neural Search Engine
Developed a search system that uses vector embeddings and hybrid retrieval to deliver contextually relevant results from technical documentation.
The Problem
Traditional keyword-based search failed to surface relevant results when developers asked natural language questions about complex codebases. Engineers were spending 20+ minutes finding relevant documentation, leading to duplicated effort and inconsistent implementations.
The Solution
I built a hybrid search engine that combines dense vector embeddings for semantic understanding with sparse BM25 retrieval for keyword precision. The system pre-processes documentation through a chunking pipeline that preserves code context and hierarchical relationships between sections.
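The chunking idea can be sketched in a few lines. This is a minimal illustration, not the production pipeline: it assumes markdown input with triple-backtick code fences, and `chunk_markdown` and its `max_chars` budget are hypothetical names chosen for the example.

```python
import re

def chunk_markdown(text: str, max_chars: int = 1500) -> list[str]:
    """Split markdown into chunks without cutting through fenced code blocks.

    A hypothetical sketch: real pipelines also track heading hierarchy so each
    chunk keeps its section context.
    """
    # Capture fenced code blocks as indivisible parts so a chunk boundary
    # never lands inside a code snippet.
    parts = re.split(r"(```.*?```)", text, flags=re.DOTALL)
    chunks, current = [], ""
    for part in parts:
        if not current or len(current) + len(part) <= max_chars:
            current += part
        else:
            chunks.append(current)
            current = part
    if current:
        chunks.append(current)
    return chunks
```

The key property is that a code fence is treated as atomic: it may push a chunk over the size budget, but it is never split in half, which keeps function bodies intact for embedding.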
Architecture & Stack
The ingestion pipeline runs on Kubernetes, processing documents through a DAG of transformers: markdown parsing, code extraction, hierarchical chunking, and embedding generation via OpenAI. The search API is a FastAPI service that performs hybrid retrieval from Pinecone (dense) and Elasticsearch (sparse), with a re-ranking step using a cross-encoder model. The React frontend features a conversational search interface with streaming results.
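The retrieval flow inside the API can be summarized with a small orchestration function. This is a sketch of the control flow only: `dense_search`, `sparse_search`, and `rerank` are hypothetical stand-ins for the Pinecone query, the Elasticsearch query, and the cross-encoder pass, injected as callables so the example stays self-contained.

```python
from typing import Callable

def hybrid_search(
    query: str,
    dense_search: Callable[[str, int], list[str]],   # stand-in for Pinecone
    sparse_search: Callable[[str, int], list[str]],  # stand-in for Elasticsearch
    rerank: Callable[[str, list[str]], list[str]],   # stand-in for cross-encoder
    top_k: int = 10,
) -> list[str]:
    # Over-retrieve from each backend so the re-ranker has enough candidates.
    dense = dense_search(query, top_k * 3)
    sparse = sparse_search(query, top_k * 3)
    # De-duplicate while preserving order (dict keys keep insertion order),
    # then let the cross-encoder produce the final ordering.
    candidates = list(dict.fromkeys(dense + sparse))
    return rerank(query, candidates)[:top_k]
```

Over-retrieving (here 3x `top_k` from each backend) is a common pattern: the cheap first-stage retrievers cast a wide net, and the expensive cross-encoder only scores the merged candidate pool.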
Challenges & Decisions
Balancing relevance between semantic and keyword results required extensive A/B testing of fusion algorithms. I implemented Reciprocal Rank Fusion with tunable per-retriever weights and built an evaluation framework that scores NDCG against human-labeled query sets. Another challenge was embedding code snippets without losing their context, which I solved with chunking strategies that preserve function boundaries.
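The two pieces described above fit in a few lines each. This is a minimal sketch, assuming doc-id ranked lists as input: `weighted_rrf` is standard Reciprocal Rank Fusion extended with a per-list weight, and `ndcg` computes NDCG@k from graded relevance labels; both function names are chosen for the example.

```python
import math

def weighted_rrf(rankings: list[list[str]], weights: list[float], k: int = 60) -> list[str]:
    """Weighted Reciprocal Rank Fusion: score(d) = sum_i w_i / (k + rank_i(d)).

    rankings: one ranked doc-id list per retriever (best first).
    weights:  one tunable weight per retriever.
    """
    scores: dict[str, float] = {}
    for ranking, w in zip(rankings, weights):
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + w / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

def ndcg(ranked_ids: list[str], relevance: dict[str, int], k: int = 10) -> float:
    """NDCG@k for one query, given doc-id -> graded relevance labels."""
    dcg = sum(relevance.get(d, 0) / math.log2(i + 2)
              for i, d in enumerate(ranked_ids[:k]))
    ideal = sorted(relevance.values(), reverse=True)[:k]
    idcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(ideal))
    return dcg / idcg if idcg else 0.0
```

The constant k (60 is the value from the original RRF paper) damps the influence of top ranks; raising a retriever's weight pulls the fused ordering toward its ranking, which is the knob the A/B tests tuned.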
Outcome & Learnings
Search relevance improved by 65% over the previous keyword system, measured by click-through rates and user satisfaction surveys. Average time-to-answer dropped from 22 minutes to under 3 minutes. The system now indexes 500k+ documentation pages and handles 10k queries per day.
Interested in working together on something like this?
Get in Touch