---
title: QUESTRAG Backend
emoji: 🏦
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
---
# 🏦 QUESTRAG - Banking QUEries and Support system via Trained Reinforced RAG

[Python 3.12](https://www.python.org/downloads/) ·
[FastAPI](https://fastapi.tiangolo.com/) ·
[React](https://reactjs.org/) ·
[MIT License](https://opensource.org/licenses/MIT) ·
[HuggingFace Space](https://huggingface.co/spaces/eeshanyaj/questrag-backend)

> An intelligent banking chatbot powered by **Retrieval-Augmented Generation (RAG)** and **Reinforcement Learning (RL)** to provide accurate, context-aware responses to Indian banking queries while optimizing token costs.
---

## 📋 Table of Contents

- [Overview](#overview)
- [Key Features](#key-features)
- [System Architecture](#system-architecture)
- [Technology Stack](#technology-stack)
- [Installation](#installation)
- [Configuration](#configuration)
- [Usage](#usage)
- [Project Structure](#project-structure)
- [Datasets](#datasets)
- [Performance Metrics](#performance-metrics)
- [API Documentation](#api-documentation)
- [Deployment](#deployment)
- [Contributing](#contributing)
- [License](#license)
- [Acknowledgments](#acknowledgments)
- [Contact](#contact)
- [Status](#status)
- [Links](#links)
---

## 🎯 Overview

QUESTRAG is an **advanced banking chatbot** designed to revolutionize customer support in the Indian banking sector. By combining **Retrieval-Augmented Generation (RAG)** with **Reinforcement Learning (RL)**, the system intelligently decides when to fetch external context from a knowledge base and when to respond directly, **reducing token costs by up to 31%** while maintaining high accuracy.

### Problem Statement

Existing banking chatbots suffer from:

- ❌ Limited response flexibility (rigid, rule-based systems)
- ❌ Poor handling of informal/real-world queries
- ❌ Lack of contextual understanding
- ❌ High operational costs due to inefficient token usage
- ❌ Low user satisfaction and trust

### Solution

QUESTRAG addresses these challenges through:

- ✅ **Domain-specific RAG** trained on 19,000+ banking queries and support data
- ✅ **RL-optimized policy network** (BERT-based) for smart context-fetching decisions
- ✅ **Fine-tuned retriever model** (E5-base-v2) using InfoNCE + Triplet Loss
- ✅ **Groq LLM with HuggingFace fallback** for reliable, fast responses
- ✅ **Full-stack web application** with modern UI/UX and JWT authentication
---

## 🚀 Key Features

### 🤖 Intelligent RAG Pipeline

- **FAISS-powered retrieval** for fast similarity search across 19,352 documents
- **Fine-tuned embedding model** (`e5-base-v2`) trained on English + Hinglish paraphrases
- **Context-aware response generation** using Llama 3 models (8B & 70B) via Groq
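For intuition, the core of similarity search is cosine-scored top-k selection. The sketch below is a didactic pure-Python version of that idea, not the project's FAISS code (which operates on a prebuilt index over precomputed embeddings):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def top_k(query_vec, doc_vecs, k=5):
    """Return (index, score) pairs for the k most similar documents."""
    scored = [(i, cosine(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scored, key=lambda p: p[1], reverse=True)[:k]
```

FAISS does the same ranking, but over millions of vectors with optimized index structures; `TOP_K=5` in the configuration corresponds to the `k` parameter here.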
### 🧠 Reinforcement Learning System

- **BERT-based policy network** (`bert-base-uncased`) for FETCH/NO_FETCH decisions
- **Reward-driven optimization** (+2.0 accurate, +0.5 needed fetch, -0.5 incorrect)
- **31% token cost reduction** via optimized retrieval
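One plausible reading of the reward scheme above can be written as a small function. This is an illustration of how such shaping might look, not the project's actual training code, and it assumes the +0.5 fetch bonus stacks on top of the +2.0 accuracy reward:

```python
def reward(answer_correct: bool, fetched: bool, fetch_was_needed: bool) -> float:
    """Reward shaping sketch: -0.5 for an incorrect answer, +2.0 for a
    correct one, plus +0.5 when a genuinely needed fetch was performed."""
    if not answer_correct:
        return -0.5
    r = 2.0
    if fetched and fetch_was_needed:
        r += 0.5
    return r
```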
### 🎨 Modern Web Interface

- **React 18 + Vite** with Tailwind CSS
- **Real-time chat**, conversation history, JWT authentication
- **Responsive design** for desktop and mobile

### 🔒 Enterprise-Ready Backend

- **FastAPI + MongoDB Atlas** for scalable async operations
- **JWT authentication** with secure password hashing (bcrypt)
- **Multi-provider LLM** (Groq → HuggingFace automatic fallback)
- **Deployed on HuggingFace Spaces** with Docker containerization
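The Groq → HuggingFace fallback pattern can be sketched as a try-in-order wrapper. This is a hedged outline only; the class and `generate` signature are assumptions for illustration, not the project's real `llm_manager` interface:

```python
class FallbackLLM:
    """Try each provider in order; return the first successful response."""

    def __init__(self, providers):
        # providers: list of (name, callable) pairs, primary first
        self.providers = providers

    def generate(self, prompt: str):
        errors = []
        for name, call in self.providers:
            try:
                return name, call(prompt)
            except Exception as exc:  # provider down, rate-limited, etc.
                errors.append((name, exc))
        raise RuntimeError(f"all providers failed: {errors}")
```

With this shape, a Groq outage or rate limit transparently falls through to the HuggingFace provider, and the caller can log which provider actually answered (as the `llm_provider` field in the chat response does).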
---

## 🏗️ System Architecture

<p align="center">
  <img src="./assets/system.png" alt="System Architecture Diagram" width="750"/>
</p>

### 🔄 Workflow

1. **User Query** → FastAPI receives the query via REST API
2. **Policy Decision** → BERT-based RL model decides FETCH or NO_FETCH
3. **Conditional Retrieval** → if FETCH, retrieve the top-5 documents from FAISS using E5-base-v2
4. **Response Generation** → Llama 3 (via Groq) generates the final answer
5. **Evaluation & Logging** → interaction logged in MongoDB + reward-based model update
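The steps above can be sketched end to end as one function. This is a hedged outline with stand-in callables: `policy_confidence`, `retrieve`, and `generate` are placeholders for the real policy network, FAISS retriever, and LLM call, not the project's actual interfaces:

```python
CONFIDENCE_THRESHOLD = 0.7  # matches the policy-network setting in Configuration

def answer(query, policy_confidence, retrieve, generate, top_k=5):
    """Run the FETCH/NO_FETCH pipeline for a single query.

    policy_confidence(query) -> probability that fetching context is needed
    retrieve(query, k)       -> list of context documents
    generate(query, docs)    -> final answer string
    """
    fetch = policy_confidence(query) >= CONFIDENCE_THRESHOLD
    docs = retrieve(query, top_k) if fetch else []
    return {
        "action": "FETCH" if fetch else "NO_FETCH",
        "documents_retrieved": len(docs),
        "response": generate(query, docs),
    }
```

The cost saving comes from the `NO_FETCH` branch: when the policy is confident no context is needed, the retrieved documents never enter the LLM prompt, so those tokens are never billed.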
---

## 📊 Sequence Diagram

<p align="center">
  <img src="./assets/sequence_diagram.png" alt="Sequence Diagram" width="750"/>
</p>
---

## 🛠️ Technology Stack

### **Frontend**

- ⚛️ React 18.3.1 + Vite 5.4.2
- 🎨 Tailwind CSS 3.4.1
- 🔄 React Context API + Axios + React Router DOM

### **Backend**

- 🚀 FastAPI 0.104.1
- 🗄️ MongoDB Atlas + Motor (async driver)
- 🔐 JWT Auth + Passlib (bcrypt)
- 🤖 PyTorch 2.9.1, Transformers 4.57, FAISS 1.13.0
- 💬 Groq (Llama 3.1 8B Instant / Llama 3.3 70B Versatile)
- 🎯 Sentence Transformers 5.1.2

### **Machine Learning**

- 🧠 **Policy Network**: BERT-base-uncased (trained with RL)
- 🔍 **Retriever**: E5-base-v2 (fine-tuned with InfoNCE + Triplet Loss)
- 📚 **Vector Store**: FAISS (19,352 documents)

### **Deployment**

- 🐳 Docker (HuggingFace Spaces)
- 🤗 HuggingFace Hub (model storage)
- ☁️ MongoDB Atlas (cloud database)
- 🐍 Python 3.12 + uvicorn
---

## ⚙️ Installation

### 🧩 Prerequisites

- Python 3.12+
- Node.js 18+
- MongoDB Atlas account (or local MongoDB 6.0+)
- Groq API key (or HuggingFace token)

### 🔧 Backend Setup (Local Development)

```bash
# Navigate to the backend
cd backend

# Create a virtual environment
python -m venv venv

# Activate it
source venv/bin/activate    # Linux/macOS
# venv\Scripts\activate     # Windows

# Install dependencies
pip install -r requirements.txt

# Create the environment file
cp .env.example .env
# Edit .env with your credentials (see the Configuration section)

# Build the FAISS index (one-time setup)
python build_faiss_index.py

# Start the backend server
uvicorn app.main:app --reload --port 8000
```

### 💻 Frontend Setup

```bash
# Navigate to the frontend
cd frontend

# Install dependencies
npm install

# Create the environment file
cp .env.example .env
# Update VITE_API_URL to point to your backend

# Start the dev server
npm run dev
```
---

## ⚙️ Configuration

### 🔑 Backend `.env` (Key Parameters)

| **Category** | **Key** | **Example / Description** |
|---|---|---|
| Environment | `ENVIRONMENT` | `development` or `production` |
| MongoDB | `MONGODB_URI` | `mongodb+srv://<user>:<password>@<cluster>/` |
| Authentication | `SECRET_KEY` | Generate with `python -c "import secrets; print(secrets.token_urlsafe(32))"` |
| | `ALGORITHM` | `HS256` |
| | `ACCESS_TOKEN_EXPIRE_MINUTES` | `1440` (24 hours) |
| Groq API | `GROQ_API_KEY_1` | Your primary Groq API key |
| | `GROQ_API_KEY_2` | Secondary key (optional) |
| | `GROQ_API_KEY_3` | Tertiary key (optional) |
| | `GROQ_CHAT_MODEL` | `llama-3.1-8b-instant` |
| | `GROQ_EVAL_MODEL` | `llama-3.3-70b-versatile` |
| HuggingFace | `HF_TOKEN_1` | HuggingFace token (fallback LLM) |
| | `HF_MODEL_REPO` | `eeshanyaj/questrag_models` (for model download) |
| Model Paths | `POLICY_MODEL_PATH` | `app/models/best_policy_model.pth` |
| | `RETRIEVER_MODEL_PATH` | `app/models/best_retriever_model.pth` |
| | `FAISS_INDEX_PATH` | `app/models/faiss_index.pkl` |
| | `KB_PATH` | `app/data/final_knowledge_base.jsonl` |
| Device | `DEVICE` | `cpu` or `cuda` |
| RAG Params | `TOP_K` | `5` (number of documents to retrieve) |
| | `SIMILARITY_THRESHOLD` | `0.5` (minimum similarity score) |
| Policy Network | `CONFIDENCE_THRESHOLD` | `0.7` (policy decision confidence) |
| CORS | `ALLOWED_ORIGINS` | `http://localhost:5173` or `*` |
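To see how `SECRET_KEY`, `ALGORITHM=HS256`, and `ACCESS_TOKEN_EXPIRE_MINUTES` fit together, here is a stdlib-only sketch of what an HS256 JWT access token contains. The backend presumably uses a proper JWT library; this is purely illustrative:

```python
import base64
import hashlib
import hmac
import json
import time

SECRET_KEY = "change-me"   # in production: secrets.token_urlsafe(32)
EXPIRE_MINUTES = 1440      # ACCESS_TOKEN_EXPIRE_MINUTES

def _b64(data: bytes) -> str:
    """URL-safe base64 without padding, as JWTs use."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def make_token(username: str) -> str:
    """Build an HS256 JWT: b64(header).b64(payload).b64(signature)."""
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64(json.dumps({
        "sub": username,
        "exp": int(time.time()) + EXPIRE_MINUTES * 60,
    }).encode())
    signing_input = f"{header}.{payload}".encode()
    sig = _b64(hmac.new(SECRET_KEY.encode(), signing_input,
                        hashlib.sha256).digest())
    return f"{header}.{payload}.{sig}"
```

The `exp` claim is why `ACCESS_TOKEN_EXPIRE_MINUTES=1440` means tokens stop validating after 24 hours.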
### 🌐 Frontend `.env`

```bash
# Local development
VITE_API_URL=http://localhost:8000

# Production (HuggingFace Spaces)
VITE_API_URL=https://eeshanyaj-questrag-backend.hf.space
```
---

## 🚀 Usage

### 🖥️ Local Development

#### Start the Backend Server

```bash
cd backend
source venv/bin/activate   # or venv\Scripts\activate on Windows
uvicorn app.main:app --reload --port 8000
```

- **Backend**: http://localhost:8000
- **API Docs**: http://localhost:8000/docs
- **Health Check**: http://localhost:8000/health

#### Start the Frontend Dev Server

```bash
cd frontend
npm run dev
```

- **Frontend**: http://localhost:5173

### 🌐 Production (HuggingFace Spaces)

**Backend API**:

- **Base URL**: https://eeshanyaj-questrag-backend.hf.space
- **API Docs**: https://eeshanyaj-questrag-backend.hf.space/docs
- **Health Check**: https://eeshanyaj-questrag-backend.hf.space/health

**Frontend** (coming soon):

- Will be deployed on Vercel/Netlify
---

## 📁 Project Structure

```
questrag/
│
├── backend/
│   ├── app/
│   │   ├── api/v1/
│   │   │   ├── auth.py                    # Auth endpoints (register, login)
│   │   │   └── chat.py                    # Chat endpoints
│   │   ├── core/
│   │   │   ├── llm_manager.py             # Groq + HF LLM orchestration
│   │   │   └── security.py                # JWT & password hashing
│   │   ├── ml/
│   │   │   ├── policy_network.py          # RL policy model (BERT)
│   │   │   └── retriever.py               # E5-base-v2 retriever
│   │   ├── db/
│   │   │   ├── mongodb.py                 # MongoDB connection
│   │   │   └── repositories/              # User & conversation repos
│   │   ├── services/
│   │   │   └── chat_service.py            # Orchestration logic
│   │   ├── models/
│   │   │   ├── best_policy_model.pth      # Trained policy network
│   │   │   ├── best_retriever_model.pth   # Fine-tuned retriever
│   │   │   └── faiss_index.pkl            # FAISS vector store
│   │   ├── data/
│   │   │   └── final_knowledge_base.jsonl # 19,352 Q&A pairs
│   │   ├── config.py                      # Settings & env vars
│   │   └── main.py                        # FastAPI app entry point
│   ├── Dockerfile                         # Docker config for HF Spaces
│   ├── requirements.txt
│   └── .env.example
│
└── frontend/
    ├── src/
    │   ├── components/                    # UI components
    │   ├── context/                       # Auth context
    │   ├── pages/                         # Login, Register, Chat
    │   ├── services/api.js                # Axios client
    │   ├── App.jsx
    │   └── main.jsx
    ├── package.json
    └── .env
```
---

## 📊 Datasets

### 1. Final Knowledge Base

- **Size**: 19,352 question-answer pairs
- **Categories**: 15 banking categories
- **Intents**: 22 unique intents (ATM, CARD, LOAN, ACCOUNT, etc.)
- **Source**: combination of:
  - Bitext Retail Banking Dataset (Hugging Face)
  - RetailBanking-Conversations Dataset
  - Manually curated FAQs from SBI, ICICI, HDFC, Yes Bank, Axis Bank

### 2. Retriever Training Dataset

- **Size**: 11,655 paraphrases
- **Source**: 1,665 unique FAQs from the knowledge base
- **Paraphrases per FAQ**:
  - 4 English paraphrases
  - 2 Hinglish paraphrases
  - Original FAQ
- **Training**: InfoNCE Loss + Triplet Loss with E5-base-v2
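For intuition about the two losses, here is a didactic plain-Python sketch of InfoNCE for a single query (positive plus in-batch negatives) and of a similarity-based triplet loss. This is not the actual training code, which operates on batched tensors; the temperature and margin values are illustrative assumptions:

```python
import math

def info_nce(sim_pos: float, sim_negs: list, temperature: float = 0.05) -> float:
    """InfoNCE for one query: negative log-softmax of the positive's
    similarity against [positive + negatives], scaled by temperature."""
    logits = [sim_pos / temperature] + [s / temperature for s in sim_negs]
    m = max(logits)  # stabilize the softmax numerically
    denom = sum(math.exp(l - m) for l in logits)
    return -((logits[0] - m) - math.log(denom))

def triplet_loss(sim_ap: float, sim_an: float, margin: float = 0.3) -> float:
    """Triplet loss on similarities: push the anchor-positive score above
    the anchor-negative score by at least `margin`."""
    return max(0.0, margin - sim_ap + sim_an)
```

Training the retriever this way pulls each FAQ and its English/Hinglish paraphrases together in embedding space while pushing unrelated FAQs apart.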
### 3. Policy Network Training Dataset

- **Size**: 182 queries from 6 chat sessions
- **Format**: (state, action, reward) tuples
- **Actions**: FETCH (1) or NO_FETCH (0)
- **Rewards**: +2.0 (correct), +0.5 (needed fetch), -0.5 (incorrect)

---

## 📈 Performance Metrics

*Coming soon: detailed performance metrics including accuracy, response time, token cost reduction, and user satisfaction scores.*
---

## 📖 API Documentation

### Authentication

#### Register

```http
POST /api/v1/auth/register
Content-Type: application/json

{
  "username": "john_doe",
  "email": "[email protected]",
  "password": "securepassword123"
}
```

**Response:**

```json
{
  "message": "User registered successfully",
  "user_id": "507f1f77bcf86cd799439011"
}
```

#### Login

```http
POST /api/v1/auth/login
Content-Type: application/json

{
  "username": "john_doe",
  "password": "securepassword123"
}
```

**Response:**

```json
{
  "access_token": "eyJhbGciOiJIUzI1NiIs...",
  "token_type": "bearer"
}
```

---

### Chat

#### Send Message

```http
POST /api/v1/chat/
Authorization: Bearer <token>
Content-Type: application/json

{
  "query": "What are the interest rates for home loans?",
  "conversation_id": "optional-session-id"
}
```

**Response:**

```json
{
  "response": "Current home loan interest rates range from 8.5% to 9.5% per annum...",
  "conversation_id": "abc123",
  "metadata": {
    "policy_action": "FETCH",
    "retrieval_score": 0.89,
    "documents_retrieved": 5,
    "llm_provider": "groq"
  }
}
```

#### Get Conversation History

```http
GET /api/v1/chat/conversations/{conversation_id}
Authorization: Bearer <token>
```

**Response:**

```json
{
  "conversation_id": "abc123",
  "messages": [
    {
      "role": "user",
      "content": "What are the interest rates?",
      "timestamp": "2025-11-28T10:30:00Z"
    },
    {
      "role": "assistant",
      "content": "Current rates are...",
      "timestamp": "2025-11-28T10:30:05Z",
      "metadata": {
        "policy_action": "FETCH"
      }
    }
  ]
}
```

#### List All Conversations

```http
GET /api/v1/chat/conversations
Authorization: Bearer <token>
```

#### Delete Conversation

```http
DELETE /api/v1/chat/conversation/{conversation_id}
Authorization: Bearer <token>
```
---

## 🚀 Deployment

### HuggingFace Spaces (Backend)

The backend is deployed on HuggingFace Spaces using Docker:

1. **Models are stored** on HuggingFace Hub: `eeshanyaj/questrag_models`
2. **On first startup**, models are automatically downloaded from HF Hub
3. **Docker container** runs FastAPI with uvicorn on port 7860
4. **Environment secrets** are securely managed in HF Space settings

**Deployment Steps:**

```bash
# 1. Upload models to HuggingFace Hub
huggingface-cli upload eeshanyaj/questrag_models \
  app/models/best_policy_model.pth \
  models/best_policy_model.pth

# 2. Push backend code to the HF Space
git remote add space https://huggingface.co/spaces/eeshanyaj/questrag-backend
git push space main

# 3. Add environment secrets in the HF Space settings
#    (MongoDB URI, Groq keys, JWT secret, etc.)
```

### Frontend Deployment (Vercel/Netlify)

```bash
# Build for production
npm run build

# Deploy to Vercel
vercel --prod

# Then set the backend URL in .env.production:
# VITE_API_URL=https://eeshanyaj-questrag-backend.hf.space
```
---

## 🤝 Contributing

Contributions are welcome! Please follow these steps:

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

### Development Guidelines

- Follow PEP 8 for Python code
- Use ESLint + Prettier for JavaScript/React
- Write comprehensive docstrings and comments
- Add unit tests for new features
- Update documentation accordingly

---

## 📄 License

MIT License; see [LICENSE](LICENSE)
| ## π Acknowledgments | |
| ### Research Inspiration | |
| - **Main Paper**: "Optimizing Retrieval Augmented Generation for Domain-Specific Chatbots with Reinforcement Learning" (AAAI 2024) | |
| - **Additional References**: | |
| - "Evaluating BERT-based Rewards for Question Generation with RL" | |
| - "Self-Reasoning for Retrieval-Augmented Language Models" | |
| ### Open Source Resources | |
| - [RL-Self-Improving-RAG](https://github.com/subrata-samanta/RL-Self-Improving-RAG) | |
| - [ARENA](https://github.com/ren258/ARENA) | |
| - [RAGTechniques](https://github.com/NirDiamant/RAGTechniques) | |
| - [Financial-RAG-From-Scratch](https://github.com/cse-amarjeet/Financial-RAG-From-Scratch) | |
| ### Datasets | |
| - [Bitext Retail Banking Dataset](https://huggingface.co/datasets/bitext/Bitext-retail-banking-llm-chatbot-training-dataset) | |
| - [RetailBanking-Conversations](https://huggingface.co/datasets/oopere/RetailBanking-Conversations) | |
| ### Technologies | |
| - [FastAPI](https://fastapi.tiangolo.com/) | |
| - [React](https://reactjs.org/) | |
| - [HuggingFace](https://huggingface.co/) | |
| - [Groq](https://groq.com/) | |
| - [MongoDB Atlas](https://www.mongodb.com/cloud/atlas) | |
| --- | |
---

## 📞 Contact

**Eeshanya Amit Joshi**

- 📧 [Email](mailto:[email protected])
- 💼 [LinkedIn](https://www.linkedin.com/in/eeshanyajoshi/)

---

## 📊 Status

### ✅ **Backend Deployed & Live!**

- 🌐 Backend API running on [HuggingFace Spaces](https://eeshanyaj-questrag-backend.hf.space)
- 📖 API documentation available at [/docs](https://eeshanyaj-questrag-backend.hf.space/docs)
- 💚 Health status: [check here](https://eeshanyaj-questrag-backend.hf.space/health)

### 🚧 **Frontend Deployment - Coming Soon!**

- Will be deployed on Vercel/Netlify
- Stay tuned for the full application link! ❤️

---

## 🔗 Links

- **Live Backend API:** https://eeshanyaj-questrag-backend.hf.space
- **API Documentation:** https://eeshanyaj-questrag-backend.hf.space/docs
- **Health Check:** https://eeshanyaj-questrag-backend.hf.space/health
- **HuggingFace Space:** https://huggingface.co/spaces/eeshanyaj/questrag-backend
- **Model Repository:** https://huggingface.co/eeshanyaj/questrag_models
- **Research Paper:** [AAAI 2024 Workshop](https://arxiv.org/abs/2401.06800)

---

<p align="center">✨ Made with ❤️ for the Banking Industry ✨</p>
<p align="center">Powered by HuggingFace 🤗 | Groq ⚡ | MongoDB 🍃 | Docker 🐳</p>