CASE STUDY 07

Rocky: Domain-Specific RAG Assistant

Python, GPT-4o mini, RAG, ChromaDB, Ollama, Docker, Nomic Embed

Role

ML Engineer

Year

2025

Context

EPFL Rocket Team

01.
The Problem

The EPFL Rocket Team manages a large internal knowledge base of over 2,420 Markdown files covering complex engineering domains such as propulsion and avionics. General-purpose LLMs tend to hallucinate in these specialized fields and have no access to the team's private data, while manual documentation searches are slow.

02.
The Approach

I implemented a modular Advanced RAG architecture designed for traceability and precision. A custom offline indexing pipeline segments the data into 2000-character chunks with overlap. To improve retrieval accuracy, the system uses query expansion to generate semantic variants of each user question; these are embedded with Nomic Embed Text via Ollama and matched against ChromaDB, which provides local vector storage. Retrieved documents pass through an LLM-based reranking step that removes noise before they are fed into OpenAI's GPT-4o mini, a model chosen for its cost-efficiency and 128k-token context window. The entire solution is containerized with Docker and includes automated daily synchronization to keep the index consistent with the team's internal Wiki.
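
The chunking and query-expansion steps of the approach can be sketched in plain Python. This is a minimal illustration, not Rocky's actual code: the function names `chunk_text` and `merge_hits` are hypothetical, the 200-character overlap is an assumed value (only the 2000-character chunk size is stated above), and the embedding, ChromaDB lookup, and LLM reranking calls are deliberately left out.

```python
from typing import Dict, Iterable, List, Tuple


def chunk_text(text: str, chunk_size: int = 2000, overlap: int = 200) -> List[str]:
    """Split text into fixed-size character chunks sharing `overlap` characters.

    chunk_size=2000 matches the case study; overlap=200 is an assumption.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks


def merge_hits(
    hits_per_query: Iterable[List[Tuple[str, float]]]
) -> List[Tuple[str, float]]:
    """Merge (chunk_id, distance) hits returned for several query variants.

    Each chunk is kept once with its smallest distance, then the result is
    sorted so the closest chunks come first (ready for a reranking step).
    """
    best: Dict[str, float] = {}
    for hits in hits_per_query:
        for chunk_id, distance in hits:
            if chunk_id not in best or distance < best[chunk_id]:
                best[chunk_id] = distance
    return sorted(best.items(), key=lambda item: item[1])
```

In this sketch, each expanded query variant would be run against the vector store separately and the per-variant results passed to `merge_hits`, so a chunk matched by several variants appears only once in the candidate list handed to the reranker.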

03.
The Result

Rocky mitigates hallucinations by grounding every response in the team's documentation, with an average response latency of ~7 seconds for complex retrieval tasks. User testing confirmed that the system gives access in seconds to precise technical details that previously took minutes to find, and that it correctly admits ignorance when data is missing rather than fabricating answers.
