---
title: QUESTRAG Backend
emoji: 🏦
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
---
# 🏦 QUESTRAG - Banking QUEries and Support system via Trained Reinforced RAG
[![Python 3.12](https://img.shields.io/badge/python-3.12-blue.svg)](https://www.python.org/downloads/)
[![FastAPI](https://img.shields.io/badge/FastAPI-0.104.1-green.svg)](https://fastapi.tiangolo.com/)
[![React](https://img.shields.io/badge/React-18.3.1-blue.svg)](https://reactjs.org/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Deployed on HuggingFace](https://img.shields.io/badge/🤗-HuggingFace%20Spaces-yellow)](https://huggingface.co/spaces/eeshanyaj/questrag-backend)
> An intelligent banking chatbot powered by **Retrieval-Augmented Generation (RAG)** and **Reinforcement Learning (RL)** to provide accurate, context-aware responses to Indian banking queries while optimizing token costs.
---
## 📋 Table of Contents
- [Overview](#overview)
- [Key Features](#key-features)
- [System Architecture](#system-architecture)
- [Technology Stack](#technology-stack)
- [Installation](#installation)
- [Configuration](#configuration)
- [Usage](#usage)
- [Project Structure](#project-structure)
- [Datasets](#datasets)
- [Performance Metrics](#performance-metrics)
- [API Documentation](#api-documentation)
- [Deployment](#deployment)
- [Contributing](#contributing)
- [License](#license)
- [Acknowledgments](#acknowledgments)
- [Contact](#contact)
- [Status](#status)
- [Links](#links)
---
## 🎯 Overview
QUESTRAG is an **advanced banking chatbot** designed to revolutionize customer support in the Indian banking sector. By combining **Retrieval-Augmented Generation (RAG)** with **Reinforcement Learning (RL)**, the system intelligently decides when to fetch external context from a knowledge base and when to respond directly, **reducing token costs by up to 31%** while maintaining high accuracy.
### Problem Statement
Existing banking chatbots suffer from:
- ❌ Limited response flexibility (rigid, rule-based systems)
- ❌ Poor handling of informal/real-world queries
- ❌ Lack of contextual understanding
- ❌ High operational costs due to inefficient token usage
- ❌ Low user satisfaction and trust
### Solution
QUESTRAG addresses these challenges through:
- ✅ **Domain-specific RAG** trained on a knowledge base of 19,000+ banking Q&A and support entries
- ✅ **RL-optimized policy network** (BERT-based) for smart context-fetching decisions
- ✅ **Fine-tuned retriever model** (E5-base-v2) using InfoNCE + Triplet Loss
- ✅ **Groq LLM with HuggingFace fallback** for reliable, fast responses
- ✅ **Full-stack web application** with modern UI/UX and JWT authentication
---
## 🌟 Key Features
### 🤖 Intelligent RAG Pipeline
- **FAISS-powered retrieval** for fast similarity search across 19,352 documents
- **Fine-tuned embedding model** (`e5-base-v2`) trained on English + Hinglish paraphrases
- **Context-aware response generation** using Llama 3 models (8B & 70B) via Groq
### 🧠 Reinforcement Learning System
- **BERT-based policy network** (`bert-base-uncased`) for FETCH/NO_FETCH decisions
- **Reward-driven optimization** (+2.0 accurate, +0.5 needed fetch, -0.5 incorrect)
- **31% token cost reduction** via optimized retrieval
### 🎨 Modern Web Interface
- **React 18 + Vite** with Tailwind CSS
- **Real-time chat**, conversation history, JWT authentication
- **Responsive design** for desktop and mobile
### πŸ” Enterprise-Ready Backend
- **FastAPI + MongoDB Atlas** for scalable async operations
- **JWT authentication** with secure password hashing (bcrypt)
- **Multi-provider LLM** (Groq β†’ HuggingFace automatic fallback)
- **Deployed on HuggingFace Spaces** with Docker containerization
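The Groq → HuggingFace fallback can be pictured as a simple provider chain. The sketch below is illustrative only; the provider interface and function names are assumptions, not the actual API of `llm_manager.py`:

```python
class ProviderError(Exception):
    """Raised when a provider call fails (rate limit, outage, etc.)."""

def generate_with_fallback(prompt, providers):
    """Try each (name, call) provider in order; return the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            errors.append((name, str(exc)))
    raise RuntimeError(f"all LLM providers failed: {errors}")

# Stubbed providers: the Groq stub "fails", the HuggingFace fallback answers.
def groq_stub(prompt):
    raise ProviderError("rate limited")

def hf_stub(prompt):
    return f"answer to: {prompt}"

provider, text = generate_with_fallback(
    "What is a fixed deposit?", [("groq", groq_stub), ("huggingface", hf_stub)]
)
# provider is "huggingface" because the groq stub raised
```

The same chain extends naturally to the multiple `GROQ_API_KEY_*` entries in the configuration: each key is just one more provider in the list.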
---
## πŸ—οΈ System Architecture
<p align="center">
<img src="./assets/system.png" alt="System Architecture Diagram" width="750"/>
</p>
### 🔄 Workflow
1. **User Query** β†’ FastAPI receives query via REST API
2. **Policy Decision** β†’ BERT-based RL model decides FETCH or NO_FETCH
3. **Conditional Retrieval** β†’ If FETCH β†’ Retrieve top-5 docs from FAISS using E5-base-v2
4. **Response Generation** β†’ Llama 3 (via Groq) generates final answer
5. **Evaluation & Logging** β†’ Logged in MongoDB + reward-based model update
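The five steps above can be sketched as a single routing function. The stub callables and function names are illustrative stand-ins for the real policy network, retriever, and LLM; the 0.7 threshold is the `CONFIDENCE_THRESHOLD` default from the configuration:

```python
def answer_query(query, policy_prob, retrieve, generate, top_k=5, threshold=0.7):
    """Route a query through the FETCH/NO_FETCH decision (steps 2-4 above).

    policy_prob: returns P(FETCH | query) from the policy network
    retrieve:    returns the top-k documents from the vector store
    generate:    maps (query, context) to the final answer
    """
    if policy_prob(query) >= threshold:      # step 2: policy decision
        context = retrieve(query, top_k)     # step 3: conditional retrieval
        action = "FETCH"
    else:
        context, action = [], "NO_FETCH"
    answer = generate(query, context)        # step 4: response generation
    return {"answer": answer, "policy_action": action, "documents": len(context)}

# Stub components so the flow is visible end to end:
result = answer_query(
    "What is the minimum balance for a savings account?",
    policy_prob=lambda q: 0.9,
    retrieve=lambda q, k: ["doc"] * k,
    generate=lambda q, ctx: "The minimum balance is...",
)
# 0.9 >= 0.7, so this query takes the FETCH path with 5 documents
```

Step 5 (logging the outcome and updating the policy from rewards) happens after this function returns, in the service layer.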
---
## 🔄 Sequence Diagram
<p align="center">
<img src="./assets/sequence_diagram.png" alt="Sequence Diagram" width="750"/>
</p>
---
## πŸ› οΈ Technology Stack
### **Frontend**
- βš›οΈ React 18.3.1 + Vite 5.4.2
- 🎨 Tailwind CSS 3.4.1
- πŸ”„ React Context API + Axios + React Router DOM
### **Backend**
- πŸš€ FastAPI 0.104.1
- πŸ—„οΈ MongoDB Atlas + Motor (async driver)
- πŸ”‘ JWT Auth + Passlib (bcrypt)
- πŸ€– PyTorch 2.9.1, Transformers 4.57, FAISS 1.13.0
- πŸ’¬ Groq (Llama 3.1 8B Instant / Llama 3.3 70B Versatile)
- 🎯 Sentence Transformers 5.1.2
### **Machine Learning**
- 🧠 **Policy Network**: BERT-base-uncased (trained with RL)
- πŸ” **Retriever**: E5-base-v2 (fine-tuned with InfoNCE + Triplet Loss)
- πŸ“Š **Vector Store**: FAISS (19,352 documents)
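Top-k retrieval over the vector store boils down to an inner-product search. The NumPy sketch below mirrors what a FAISS `IndexFlatIP` computes over L2-normalized embeddings; the vectors here are toy values, not real E5 output:

```python
import numpy as np

def top_k_docs(query_vec, doc_matrix, k=5):
    """Return (indices, scores) of the k most similar documents.
    With L2-normalized vectors, inner product equals cosine similarity."""
    scores = doc_matrix @ query_vec
    idx = np.argsort(-scores)[:k]
    return idx, scores[idx]

# Toy corpus of four normalized 3-d "embeddings":
docs = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.6, 0.8, 0.0],
    [0.0, 0.0, 1.0],
])
query = np.array([0.6, 0.8, 0.0])
idx, scores = top_k_docs(query, docs, k=2)
# idx[0] == 2: the exact match ranks first with score 1.0
```

FAISS does the same search over all 19,352 document embeddings, just with an optimized index instead of a dense matrix product.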
### **Deployment**
- 🐳 Docker (HuggingFace Spaces)
- 🤗 HuggingFace Hub (model storage)
- ☁️ MongoDB Atlas (cloud database)
- 🌐 Python 3.12 + uvicorn
---
## βš™οΈ Installation
### 🧩 Prerequisites
- Python 3.12+
- Node.js 18+
- MongoDB Atlas account (or local MongoDB 6.0+)
- Groq API key (or HuggingFace token)
### 🔧 Backend Setup (Local Development)
```bash
# Navigate to backend
cd backend
# Create virtual environment
python -m venv venv
# Activate it
source venv/bin/activate # Linux/Mac
venv\Scripts\activate # Windows
# Install dependencies
pip install -r requirements.txt
# Create environment file
cp .env.example .env
# Edit .env with your credentials (see Configuration section)
# Build FAISS index (one-time setup)
python build_faiss_index.py
# Start backend server
uvicorn app.main:app --reload --port 8000
```
### 💻 Frontend Setup
```bash
# Navigate to frontend
cd frontend
# Install dependencies
npm install
# Create environment file
cp .env.example .env
# Update VITE_API_URL to point to your backend
# Start dev server
npm run dev
```
---
## βš™οΈ Configuration
### πŸ”‘ Backend `.env` (Key Parameters)
| **Category** | **Key** | **Example / Description** |
|-------------------|----------------------------------|--------------------------------------------------|
| Environment | `ENVIRONMENT` | `development` or `production` |
| MongoDB           | `MONGODB_URI`                    | `mongodb+srv://<user>:<password>@<cluster>.mongodb.net/` |
| Authentication | `SECRET_KEY` | Generate with `python -c "import secrets; print(secrets.token_urlsafe(32))"` |
| | `ALGORITHM` | `HS256` |
| | `ACCESS_TOKEN_EXPIRE_MINUTES` | `1440` (24 hours) |
| Groq API | `GROQ_API_KEY_1` | Your primary Groq API key |
| | `GROQ_API_KEY_2` | Secondary key (optional) |
| | `GROQ_API_KEY_3` | Tertiary key (optional) |
| | `GROQ_CHAT_MODEL` | `llama-3.1-8b-instant` |
| | `GROQ_EVAL_MODEL` | `llama-3.3-70b-versatile` |
| HuggingFace | `HF_TOKEN_1` | HuggingFace token (fallback LLM) |
| | `HF_MODEL_REPO` | `eeshanyaj/questrag_models` (for model download) |
| Model Paths | `POLICY_MODEL_PATH` | `app/models/best_policy_model.pth` |
| | `RETRIEVER_MODEL_PATH` | `app/models/best_retriever_model.pth` |
| | `FAISS_INDEX_PATH` | `app/models/faiss_index.pkl` |
| | `KB_PATH` | `app/data/final_knowledge_base.jsonl` |
| Device | `DEVICE` | `cpu` or `cuda` |
| RAG Params | `TOP_K` | `5` (number of documents to retrieve) |
| | `SIMILARITY_THRESHOLD` | `0.5` (minimum similarity score) |
| Policy Network | `CONFIDENCE_THRESHOLD` | `0.7` (policy decision confidence) |
| CORS | `ALLOWED_ORIGINS` | `http://localhost:5173` or `*` |
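As a rough illustration of how `config.py` might consume a few of these keys; the repo's actual settings class may differ, and the defaults below are simply the examples from the table:

```python
import os

def load_settings(env=None):
    """Read a handful of QUESTRAG settings, falling back to the table's examples."""
    env = os.environ if env is None else env
    return {
        "environment": env.get("ENVIRONMENT", "development"),
        "top_k": int(env.get("TOP_K", "5")),
        "similarity_threshold": float(env.get("SIMILARITY_THRESHOLD", "0.5")),
        "confidence_threshold": float(env.get("CONFIDENCE_THRESHOLD", "0.7")),
        "device": env.get("DEVICE", "cpu"),
    }

settings = load_settings({"TOP_K": "10", "DEVICE": "cuda"})
# overrides apply; unset keys keep their defaults
```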
### 🌐 Frontend `.env`
```bash
# Local development
VITE_API_URL=http://localhost:8000
# Production (HuggingFace Spaces)
VITE_API_URL=https://eeshanyaj-questrag-backend.hf.space
```
---
## 🚀 Usage
### 🖥️ Local Development
#### Start Backend Server
```bash
cd backend
source venv/bin/activate # or venv\Scripts\activate
uvicorn app.main:app --reload --port 8000
```
- **Backend**: http://localhost:8000
- **API Docs**: http://localhost:8000/docs
- **Health Check**: http://localhost:8000/health
#### Start Frontend Dev Server
```bash
cd frontend
npm run dev
```
- **Frontend**: http://localhost:5173
### 🌐 Production (HuggingFace Spaces)
**Backend API**:
- **Base URL**: https://eeshanyaj-questrag-backend.hf.space
- **API Docs**: https://eeshanyaj-questrag-backend.hf.space/docs
- **Health Check**: https://eeshanyaj-questrag-backend.hf.space/health
**Frontend** (Coming Soon):
- Will be deployed on Vercel/Netlify
---
## πŸ“ Project Structure
```
questrag/
β”‚
β”œβ”€β”€ backend/
β”‚ β”œβ”€β”€ app/
β”‚ β”‚ β”œβ”€β”€ api/v1/
β”‚ β”‚ β”‚ β”œβ”€β”€ auth.py # Auth endpoints (register, login)
β”‚ β”‚ β”‚ └── chat.py # Chat endpoints
β”‚ β”‚ β”œβ”€β”€ core/
β”‚ β”‚ β”‚ β”œβ”€β”€ llm_manager.py # Groq + HF LLM orchestration
β”‚ β”‚ β”‚ └── security.py # JWT & password hashing
β”‚ β”‚ β”œβ”€β”€ ml/
β”‚ β”‚ β”‚ β”œβ”€β”€ policy_network.py # RL Policy model (BERT)
β”‚ β”‚ β”‚ └── retriever.py # E5-base-v2 retriever
β”‚ β”‚ β”œβ”€β”€ db/
β”‚ β”‚ β”‚ β”œβ”€β”€ mongodb.py # MongoDB connection
β”‚ β”‚ β”‚ └── repositories/ # User & conversation repos
β”‚ β”‚ β”œβ”€β”€ services/
β”‚ β”‚ β”‚ └── chat_service.py # Orchestration logic
β”‚ β”‚ β”œβ”€β”€ models/
β”‚ β”‚ β”‚ β”œβ”€β”€ best_policy_model.pth # Trained policy network
β”‚ β”‚ β”‚ β”œβ”€β”€ best_retriever_model.pth # Fine-tuned retriever
β”‚ β”‚ β”‚ └── faiss_index.pkl # FAISS vector store
β”‚ β”‚ β”œβ”€β”€ data/
β”‚ β”‚ β”‚ └── final_knowledge_base.jsonl # 19,352 Q&A pairs
β”‚ β”‚ β”œβ”€β”€ config.py # Settings & env vars
β”‚ β”‚ └── main.py # FastAPI app entry point
β”‚ β”œβ”€β”€ Dockerfile # Docker config for HF Spaces
β”‚ β”œβ”€β”€ requirements.txt
β”‚ └── .env.example
β”‚
└── frontend/
β”œβ”€β”€ src/
β”‚ β”œβ”€β”€ components/ # UI Components
β”‚ β”œβ”€β”€ context/ # Auth Context
β”‚ β”œβ”€β”€ pages/ # Login, Register, Chat
β”‚ β”œβ”€β”€ services/api.js # Axios Client
β”‚ β”œβ”€β”€ App.jsx
β”‚ └── main.jsx
β”œβ”€β”€ package.json
└── .env
```
---
## 📊 Datasets
### 1. Final Knowledge Base
- **Size**: 19,352 question-answer pairs
- **Categories**: 15 banking categories
- **Intents**: 22 unique intents (ATM, CARD, LOAN, ACCOUNT, etc.)
- **Source**: Combination of:
- Bitext Retail Banking Dataset (Hugging Face)
- RetailBanking-Conversations Dataset
- Manually curated FAQs from SBI, ICICI, HDFC, Yes Bank, Axis Bank
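Each knowledge-base entry is one JSON object per line of `final_knowledge_base.jsonl`. The field names below are a hypothetical sketch of such a record, not the repo's confirmed schema:

```python
import json

# One hypothetical line from final_knowledge_base.jsonl:
line = ('{"question": "How do I block my lost debit card?", '
        '"answer": "Call the bank\'s 24x7 helpline or block it from net banking.", '
        '"category": "CARD", "intent": "block_card"}')
record = json.loads(line)
# 19,352 such lines, over 15 categories and 22 intents, make up the knowledge base
```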
### 2. Retriever Training Dataset
- **Size**: 11,655 paraphrases
- **Source**: 1,665 unique FAQs from knowledge base
- **Paraphrases per FAQ**:
- 4 English paraphrases
- 2 Hinglish paraphrases
- Original FAQ
- **Training**: InfoNCE Loss + Triplet Loss with E5-base-v2
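The combined objective can be sketched on toy embeddings. This is a NumPy illustration of the two loss terms only; the actual training fine-tunes E5-base-v2 in PyTorch:

```python
import numpy as np

def info_nce(query, positive, negatives, temperature=0.05):
    """InfoNCE: softmax cross-entropy of the positive against the negatives."""
    cands = np.vstack([positive] + list(negatives))   # positive at index 0
    logits = cands @ query / temperature
    logits -= logits.max()                            # numerical stability
    return float(-np.log(np.exp(logits[0]) / np.exp(logits).sum()))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge on the gap between positive and negative distances."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, float(d_pos - d_neg + margin))

q   = np.array([1.0, 0.0])        # toy query embedding
pos = np.array([0.99, 0.14])      # its paraphrase: close to the query
neg = np.array([0.0, 1.0])        # an unrelated FAQ
good = info_nce(q, pos, [neg]) + triplet_loss(q, pos, neg)
bad  = info_nce(q, neg, [pos]) + triplet_loss(q, neg, pos)
# good is near zero, bad is large: gradients pull paraphrases toward their FAQ
```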
### 3. Policy Network Training Dataset
- **Size**: 182 queries from 6 chat sessions
- **Format**: (state, action, reward) tuples
- **Actions**: FETCH (1) or NO_FETCH (0)
- **Rewards**: +2.0 (correct), +0.5 (needed fetch), -0.5 (incorrect)
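A sketch of how the (state, action, reward) tuples could drive a REINFORCE-style update. The reward values come from the list above, but the one-feature logistic "policy" is a toy stand-in for the BERT policy network:

```python
import math

# Reward scheme from the training data above:
REWARDS = {"accurate": 2.0, "needed_fetch": 0.5, "incorrect": -0.5}

def policy_prob_fetch(w, b, x):
    """Toy one-feature logistic policy: P(FETCH = 1 | x)."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

def reinforce_step(w, b, episode, lr=0.1):
    """One pass of reward-weighted log-likelihood ascent over
    (feature, action, outcome) tuples."""
    for x, action, outcome in episode:
        p = policy_prob_fetch(w, b, x)
        r = REWARDS[outcome]
        grad = r * (action - p)   # d/dz of r * log pi(action | logit z)
        w += lr * grad * x
        b += lr * grad
    return w, b

# Toy episode: fetching on x=1.0 was accurate, fetching on x=-1.0 was not.
w, b = reinforce_step(0.0, 0.0, [(1.0, 1, "accurate"), (-1.0, 1, "incorrect")])
# both updates push w upward: P(FETCH) rises exactly where fetching paid off
```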
---
## 📈 Performance Metrics
*Coming soon: Detailed performance metrics including accuracy, response time, token cost reduction, and user satisfaction scores.*
---
## 📚 API Documentation
### Authentication
#### Register
```http
POST /api/v1/auth/register
Content-Type: application/json
{
"username": "john_doe",
"email": "[email protected]",
"password": "securepassword123"
}
```
**Response:**
```json
{
"message": "User registered successfully",
"user_id": "507f1f77bcf86cd799439011"
}
```
#### Login
```http
POST /api/v1/auth/login
Content-Type: application/json
{
"username": "john_doe",
"password": "securepassword123"
}
```
**Response:**
```json
{
"access_token": "eyJhbGciOiJIUzI1NiIs...",
"token_type": "bearer"
}
```
---
### Chat
#### Send Message
```http
POST /api/v1/chat/
Authorization: Bearer <token>
Content-Type: application/json
{
"query": "What are the interest rates for home loans?",
"conversation_id": "optional-session-id"
}
```
**Response:**
```json
{
"response": "Current home loan interest rates range from 8.5% to 9.5% per annum...",
"conversation_id": "abc123",
"metadata": {
"policy_action": "FETCH",
"retrieval_score": 0.89,
"documents_retrieved": 5,
"llm_provider": "groq"
}
}
```
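The authenticated chat call can be assembled with nothing but the standard library. This sketch only *builds* the request; sending it requires a live deployment and a real token, both of which are placeholders here:

```python
import json
import urllib.request

BASE_URL = "https://eeshanyaj-questrag-backend.hf.space"  # or http://localhost:8000

def build_chat_request(token, query, conversation_id=None):
    """Construct the authenticated POST /api/v1/chat/ request (not yet sent)."""
    payload = {"query": query}
    if conversation_id:
        payload["conversation_id"] = conversation_id
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/chat/",
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("eyJhbGciOiJIUzI1NiIs...",
                         "What are the interest rates for home loans?")
# urllib.request.urlopen(req) would send it; json.load(...) parses the response
```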
#### Get Conversation History
```http
GET /api/v1/chat/conversations/{conversation_id}
Authorization: Bearer <token>
```
**Response:**
```json
{
"conversation_id": "abc123",
"messages": [
{
"role": "user",
"content": "What are the interest rates?",
"timestamp": "2025-11-28T10:30:00Z"
},
{
"role": "assistant",
"content": "Current rates are...",
"timestamp": "2025-11-28T10:30:05Z",
"metadata": {
"policy_action": "FETCH"
}
}
]
}
```
#### List All Conversations
```http
GET /api/v1/chat/conversations
Authorization: Bearer <token>
```
#### Delete Conversation
```http
DELETE /api/v1/chat/conversation/{conversation_id}
Authorization: Bearer <token>
```
---
## 🚀 Deployment
### HuggingFace Spaces (Backend)
The backend is deployed on HuggingFace Spaces using Docker:
1. **Models are stored** on HuggingFace Hub: `eeshanyaj/questrag_models`
2. **On first startup**, models are automatically downloaded from HF Hub
3. **Docker container** runs FastAPI with uvicorn on port 7860
4. **Environment secrets** are securely managed in HF Space settings
**Deployment Steps:**
```bash
# 1. Upload models to HuggingFace Hub
huggingface-cli upload eeshanyaj/questrag_models \
app/models/best_policy_model.pth \
models/best_policy_model.pth
# 2. Push backend code to HF Space
git remote add space https://huggingface.co/spaces/eeshanyaj/questrag-backend
git push space main
# 3. Add environment secrets in HF Space Settings
# (MongoDB URI, Groq keys, JWT secret, etc.)
```
### Frontend Deployment (Vercel/Netlify)
```bash
# Build for production
npm run build
# Deploy to Vercel
vercel --prod
# Update .env.production with backend URL
VITE_API_URL=https://eeshanyaj-questrag-backend.hf.space
```
---
## 🤝 Contributing
Contributions are welcome! Please follow these steps:
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
### Development Guidelines
- Follow PEP 8 for Python code
- Use ESLint + Prettier for JavaScript/React
- Write comprehensive docstrings and comments
- Add unit tests for new features
- Update documentation accordingly
---
## 📄 License
MIT License. See [LICENSE](LICENSE).
---
## πŸ™ Acknowledgments
### Research Inspiration
- **Main Paper**: "Optimizing Retrieval Augmented Generation for Domain-Specific Chatbots with Reinforcement Learning" (AAAI 2024)
- **Additional References**:
- "Evaluating BERT-based Rewards for Question Generation with RL"
- "Self-Reasoning for Retrieval-Augmented Language Models"
### Open Source Resources
- [RL-Self-Improving-RAG](https://github.com/subrata-samanta/RL-Self-Improving-RAG)
- [ARENA](https://github.com/ren258/ARENA)
- [RAGTechniques](https://github.com/NirDiamant/RAGTechniques)
- [Financial-RAG-From-Scratch](https://github.com/cse-amarjeet/Financial-RAG-From-Scratch)
### Datasets
- [Bitext Retail Banking Dataset](https://huggingface.co/datasets/bitext/Bitext-retail-banking-llm-chatbot-training-dataset)
- [RetailBanking-Conversations](https://huggingface.co/datasets/oopere/RetailBanking-Conversations)
### Technologies
- [FastAPI](https://fastapi.tiangolo.com/)
- [React](https://reactjs.org/)
- [HuggingFace](https://huggingface.co/)
- [Groq](https://groq.com/)
- [MongoDB Atlas](https://www.mongodb.com/cloud/atlas)
---
## 📞 Contact
**Eeshanya Amit Joshi**
📧 [Email](mailto:[email protected])
💼 [LinkedIn](https://www.linkedin.com/in/eeshanyajoshi/)
---
## 📈 Status
### ✅ **Backend Deployed & Live!**
- 🚀 Backend API running on [HuggingFace Spaces](https://eeshanyaj-questrag-backend.hf.space)
- 📚 API documentation available at [/docs](https://eeshanyaj-questrag-backend.hf.space/docs)
- 💚 Health status: [Check here](https://eeshanyaj-questrag-backend.hf.space/health)
### 🚧 **Frontend Deployment - Coming Soon!**
- Will be deployed on Vercel/Netlify
- Stay tuned for the full application link! ❤️
---
## 🔗 Links
- **Live Backend API:** https://eeshanyaj-questrag-backend.hf.space
- **API Documentation:** https://eeshanyaj-questrag-backend.hf.space/docs
- **Health Check:** https://eeshanyaj-questrag-backend.hf.space/health
- **HuggingFace Space:** https://huggingface.co/spaces/eeshanyaj/questrag-backend
- **Model Repository:** https://huggingface.co/eeshanyaj/questrag_models
- **Research Paper:** [AAAI 2024 Workshop](https://arxiv.org/abs/2401.06800)
---
<p align="center">✨ Made with ❀️ for the Banking Industry ✨</p>
<p align="center">Powered by HuggingFace πŸ€—| Groq ⚑| MongoDB πŸƒ| Docker 🐳| </p>