---
title: Codey Bryant 3.0
emoji: 🤖
colorFrom: blue
colorTo: green
sdk: docker
app_file: app_hf.py
pinned: true
license: mit
---

# 🤖 Codey Bryant 3.0 - SOTA RAG Coding Assistant

## Advanced RAG Architecture

**Codey Bryant 3.0** implements state-of-the-art Retrieval-Augmented Generation (RAG) with four key innovations:

### 1. **HyDE (Hypothetical Document Embeddings)**
Generates hypothetical answers to improve retrieval relevance for vague queries.

### 2. **Query Rewriting**
Transforms casual or vague questions into specific, searchable queries.

### 3. **Multi-Query Retrieval**
Searches multiple query variations to increase recall.

### 4. **Answer-Space Retrieval**
Retrieves from both question AND answer embeddings for better context.

## Technical Stack

- **LLM**: TinyLlama 1.1B (4-bit quantized)
- **Embeddings**: all-MiniLM-L6-v2
- **Retrieval**: FAISS + BM25 hybrid
- **Datasets**: OPC Educational + Evol-Instruct
- **Framework**: Gradio + Hugging Face Transformers

## Performance Features

- Handles vague queries like "it's not working"
- Streaming responses
- Context-aware generation
- Hybrid dense-sparse retrieval
- Persistent artifact storage

## Getting Started

1. Click **"Initialize Assistant"** (required once)
2. Ask Python coding questions
3. Get intelligent, context-aware responses

## Example Queries

- "How to read a CSV file in Python?"
- "Why am I getting 'list index out of range'?"
- "Make this function faster..."
- "Help, my code isn't working!"
- "Best way to sort a dictionary by value?"

## Why This Architecture?

1. **HyDE**: Addresses the "semantic gap" between queries and documents
2. **Query Rewriting**: Improves retrieval for conversational queries
3. **Multi-Query**: Increases recall for complex questions
4. **Answer-Space**: Provides better context for generation

## 📁 Repository Structure
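
## Retrieval Pipeline Sketch

The sections above describe the retrieval pipeline only at a high level; the actual implementation lives in `app_hf.py` and may differ. The snippet below is a minimal, illustrative sketch of how HyDE, query rewriting, multi-query expansion, and hybrid FAISS + BM25 retrieval over answer-space embeddings can fit together. The `generate()` helper, the fusion weight, and the prompt wording are placeholder assumptions, not the app's real code; only the library names match the Technical Stack.

```python
# Illustrative sketch only -- not the code in app_hf.py.
# Assumes sentence-transformers, faiss-cpu, rank_bm25, and numpy are installed,
# and stubs the TinyLlama call behind a hypothetical generate() helper.

import numpy as np
import faiss
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")


def generate(prompt: str) -> str:
    """Placeholder for the 4-bit TinyLlama call used in the real app."""
    return prompt  # stub: echoes the prompt so the sketch stays runnable


def build_indexes(answers: list[str]):
    """Index answer texts densely (FAISS) and sparsely (BM25) -- answer-space retrieval."""
    vecs = embedder.encode(answers, normalize_embeddings=True).astype("float32")
    index = faiss.IndexFlatIP(vecs.shape[1])  # inner product == cosine on normalized vectors
    index.add(vecs)
    bm25 = BM25Okapi([a.lower().split() for a in answers])
    return index, bm25


def expand_queries(question: str) -> list[str]:
    """Query rewriting + HyDE: a rewritten query plus a hypothetical answer."""
    rewrite = generate(f"Rewrite as a specific search query: {question}")
    hyde = generate(f"Write a short hypothetical answer to: {question}")
    return [question, rewrite, hyde]


def hybrid_retrieve(question: str, answers: list[str], index, bm25, k: int = 5):
    """Multi-query dense retrieval fused with BM25 scores (assumes k <= len(answers))."""
    scores = np.zeros(len(answers))
    for q in expand_queries(question):
        qv = embedder.encode([q], normalize_embeddings=True).astype("float32")
        sims, ids = index.search(qv, k)
        for sim, i in zip(sims[0], ids[0]):
            scores[i] += float(sim)                                      # dense contribution
        scores += 0.5 * np.asarray(bm25.get_scores(q.lower().split()))   # sparse contribution
    top = np.argsort(scores)[::-1][:k]
    return [answers[i] for i in top]
```

In the actual assistant, `generate` would call the quantized TinyLlama model and the fused top-k passages would be formatted into the prompt for context-aware, streaming generation; constants such as the 0.5 BM25 weight are arbitrary here and chosen only to keep the sketch readable.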