
Rocky: Domain-Specific RAG Assistant
Role
ML Engineer
Year
2025
Category
AI / RAG
Context
EPFL Rocket Team
01.
The Problem
The EPFL Rocket Team maintains an internal knowledge base of more than 2,420 Markdown files covering complex engineering domains such as propulsion and avionics. General-purpose LLMs hallucinate in these specialized fields and have no access to the team's private data, while manual documentation searches are slow and inefficient.
02.
The Approach
I implemented a modular Advanced RAG architecture designed for traceability and precision. A custom offline indexing pipeline segments the documentation into 2000-character chunks with overlap. To improve retrieval accuracy, the system uses query expansion to generate semantic variants of each user question; these are embedded with Nomic Embed Text served via Ollama and matched against ChromaDB, which provides local vector storage. Retrieved documents pass through an LLM-based reranking step that removes noise before they are fed into OpenAI's GPT-4o mini, a model chosen for its cost-efficiency and 128k-token context window. The entire solution is containerized with Docker and includes automated daily synchronization so that the index stays consistent with the team's internal Wiki.
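As an illustration of two of the steps described in this section, here is a minimal Python sketch of overlapping character chunking and of merging retrieval hits across expanded query variants. It is not the project's actual code: the 200-character overlap, the function names `chunk_text` and `merge_hits`, and the `(chunk_id, distance)` hit format are assumptions made for the example, and the embedding, ChromaDB, and reranking calls are deliberately left out.

```python
CHUNK_SIZE = 2000     # characters, as stated in the text
CHUNK_OVERLAP = 200   # assumed value; the actual overlap is not specified


def chunk_text(text: str, size: int = CHUNK_SIZE, overlap: int = CHUNK_OVERLAP) -> list[str]:
    """Split text into fixed-size character windows sharing `overlap` characters."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        # Stop once the current window already reaches the end of the text.
        if start + size >= len(text):
            break
    return chunks


def merge_hits(hits_per_query: list[list[tuple[str, float]]], top_k: int = 5) -> list[str]:
    """Deduplicate (chunk_id, distance) hits from several query variants.

    Each chunk keeps its smallest distance across variants; the result is
    the `top_k` chunk ids ordered from closest to farthest.
    """
    best: dict[str, float] = {}
    for hits in hits_per_query:
        for chunk_id, distance in hits:
            if chunk_id not in best or distance < best[chunk_id]:
                best[chunk_id] = distance
    return sorted(best, key=best.get)[:top_k]
```

The overlap means that a sentence cut at a chunk boundary still appears whole in the neighbouring chunk, and merging by best distance keeps a chunk found by several query variants from appearing more than once in the candidates passed to reranking.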
03.
The Result
Rocky mitigates hallucinations by grounding every response in verified documentation, with an average response latency of ~7 seconds for complex retrieval tasks. User testing confirmed that the system surfaces, within seconds, precise technical details that previously took minutes to find, and that it admits ignorance when the data is missing rather than fabricating an answer.
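One common way to obtain this grounded, "admit ignorance" behaviour is to build the prompt so that the model sees only the retrieved passages and is told explicitly what to do when they are insufficient. The Python sketch below shows that idea under stated assumptions: the function name `build_prompt`, the prompt wording, the `(source_path, text)` passage format, and the example file path are all hypothetical and are not taken from Rocky's implementation.

```python
def build_prompt(question: str, passages: list[tuple[str, str]]) -> str:
    """Assemble a grounded prompt from (source_path, text) passages.

    Sources are numbered so the answer can cite them, which is what makes
    a response traceable back to a specific document.
    """
    if passages:
        context = "\n\n".join(
            f"[{i}] {source}\n{text}"
            for i, (source, text) in enumerate(passages, start=1)
        )
    else:
        # Nothing survived retrieval and reranking: say so explicitly.
        context = "(no relevant documentation found)"
    return (
        "Answer using only the context below and cite sources by their [number].\n"
        "If the context does not contain the answer, say that you do not know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

For example, `build_prompt("What is the tank pressure?", [("propulsion/tank.md", "...")])` yields a prompt whose context lists `[1] propulsion/tank.md`, while an empty passage list yields a prompt that steers the model toward answering that it does not know.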



