---
title: QUESTRAG Backend
emoji: 🏦
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
---
# 🏦 QUESTRAG - Banking QUEries and Support system via Trained Reinforced RAG
[![Python 3.12](https://img.shields.io/badge/python-3.12-blue.svg)](https://www.python.org/downloads/)
[![FastAPI](https://img.shields.io/badge/FastAPI-0.104.1-green.svg)](https://fastapi.tiangolo.com/)
[![React](https://img.shields.io/badge/React-18.3.1-blue.svg)](https://reactjs.org/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Deployed on HuggingFace](https://img.shields.io/badge/🤗-HuggingFace%20Spaces-yellow)](https://huggingface.co/spaces/eeshanyaj/questrag-backend)
> An intelligent banking chatbot powered by **Retrieval-Augmented Generation (RAG)** and **Reinforcement Learning (RL)** to provide accurate, context-aware responses to Indian banking queries while optimizing token costs.
---
## 📋 Table of Contents
- [Overview](#overview)
- [Key Features](#key-features)
- [System Architecture](#system-architecture)
- [Technology Stack](#technology-stack)
- [Installation](#installation)
- [Configuration](#configuration)
- [Usage](#usage)
- [Project Structure](#project-structure)
- [Datasets](#datasets)
- [Performance Metrics](#performance-metrics)
- [API Documentation](#api-documentation)
- [Deployment](#deployment)
- [Contributing](#contributing)
- [License](#license)
- [Acknowledgments](#acknowledgments)
- [Contact](#contact)
- [Status](#status)
- [Links](#links)
---
## 🎯 Overview
QUESTRAG is an **advanced banking chatbot** designed to revolutionize customer support in the Indian banking sector. By combining **Retrieval-Augmented Generation (RAG)** with **Reinforcement Learning (RL)**, the system intelligently decides when to fetch external context from a knowledge base and when to respond directly, **reducing token costs by up to 31%** while maintaining high accuracy.
### Problem Statement
Existing banking chatbots suffer from:
- ❌ Limited response flexibility (rigid, rule-based systems)
- ❌ Poor handling of informal/real-world queries
- ❌ Lack of contextual understanding
- ❌ High operational costs due to inefficient token usage
- ❌ Low user satisfaction and trust
### Solution
QUESTRAG addresses these challenges through:
- ✅ **Domain-specific RAG** trained on a knowledge base of 19,000+ banking Q&A and support entries
- ✅ **RL-optimized policy network** (BERT-based) for smart context-fetching decisions
- ✅ **Fine-tuned retriever model** (E5-base-v2) using InfoNCE + Triplet Loss
- ✅ **Groq LLM with HuggingFace fallback** for reliable, fast responses
- ✅ **Full-stack web application** with modern UI/UX and JWT authentication
---
## 🌟 Key Features
### 🤖 Intelligent RAG Pipeline
- **FAISS-powered retrieval** for fast similarity search across 19,352 documents
- **Fine-tuned embedding model** (`e5-base-v2`) trained on English + Hinglish paraphrases
- **Context-aware response generation** using Llama 3 models (8B & 70B) via Groq
### 🧠 Reinforcement Learning System
- **BERT-based policy network** (`bert-base-uncased`) for FETCH/NO_FETCH decisions
- **Reward-driven optimization** (+2.0 accurate, +0.5 needed fetch, -0.5 incorrect)
- **31% token cost reduction** via optimized retrieval
### 🎨 Modern Web Interface
- **React 18 + Vite** with Tailwind CSS
- **Real-time chat**, conversation history, JWT authentication
- **Responsive design** for desktop and mobile
### πŸ” Enterprise-Ready Backend
- **FastAPI + MongoDB Atlas** for scalable async operations
- **JWT authentication** with secure password hashing (bcrypt)
- **Multi-provider LLM** (Groq β†’ HuggingFace automatic fallback)
- **Deployed on HuggingFace Spaces** with Docker containerization
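The Groq → HuggingFace fallback can be pictured as a simple provider chain. The sketch below is illustrative only; the provider interface and function names are assumptions, not the actual API of `llm_manager.py`:

```python
class ProviderError(Exception):
    """Raised when a provider call fails (rate limit, outage, etc.)."""

def generate_with_fallback(prompt, providers):
    """Try each (name, call) provider in order; return the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            errors.append((name, str(exc)))
    raise RuntimeError(f"all LLM providers failed: {errors}")

# Stubbed providers: the Groq stub "fails", the HuggingFace fallback answers.
def groq_stub(prompt):
    raise ProviderError("rate limited")

def hf_stub(prompt):
    return f"answer to: {prompt}"

provider, text = generate_with_fallback(
    "What is a fixed deposit?", [("groq", groq_stub), ("huggingface", hf_stub)]
)
# provider is "huggingface" because the groq stub raised
```

The same chain extends naturally to the multiple `GROQ_API_KEY_*` entries in the configuration: each key is just one more provider in the list.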
---
## πŸ—οΈ System Architecture
<p align="center">
<img src="./assets/system.png" alt="System Architecture Diagram" width="750"/>
</p>
### 🔄 Workflow
1. **User Query** β†’ FastAPI receives query via REST API
2. **Policy Decision** β†’ BERT-based RL model decides FETCH or NO_FETCH
3. **Conditional Retrieval** β†’ If FETCH β†’ Retrieve top-5 docs from FAISS using E5-base-v2
4. **Response Generation** β†’ Llama 3 (via Groq) generates final answer
5. **Evaluation & Logging** β†’ Logged in MongoDB + reward-based model update
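The five steps above can be sketched as a single routing function. The stub callables and function names are illustrative stand-ins for the real policy network, retriever, and LLM; the 0.7 threshold is the `CONFIDENCE_THRESHOLD` default from the configuration:

```python
def answer_query(query, policy_prob, retrieve, generate, top_k=5, threshold=0.7):
    """Route a query through the FETCH/NO_FETCH decision (steps 2-4 above).

    policy_prob: returns P(FETCH | query) from the policy network
    retrieve:    returns the top-k documents from the vector store
    generate:    maps (query, context) to the final answer
    """
    if policy_prob(query) >= threshold:      # step 2: policy decision
        context = retrieve(query, top_k)     # step 3: conditional retrieval
        action = "FETCH"
    else:
        context, action = [], "NO_FETCH"
    answer = generate(query, context)        # step 4: response generation
    return {"answer": answer, "policy_action": action, "documents": len(context)}

# Stub components so the flow is visible end to end:
result = answer_query(
    "What is the minimum balance for a savings account?",
    policy_prob=lambda q: 0.9,
    retrieve=lambda q, k: ["doc"] * k,
    generate=lambda q, ctx: "The minimum balance is...",
)
# 0.9 >= 0.7, so this query takes the FETCH path with 5 documents
```

Step 5 (logging the outcome and updating the policy from rewards) happens after this function returns, in the service layer.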
---
## 🔄 Sequence Diagram
<p align="center">
<img src="./assets/sequence_diagram.png" alt="Sequence Diagram" width="750"/>
</p>
---
## πŸ› οΈ Technology Stack
### **Frontend**
- βš›οΈ React 18.3.1 + Vite 5.4.2
- 🎨 Tailwind CSS 3.4.1
- πŸ”„ React Context API + Axios + React Router DOM
### **Backend**
- πŸš€ FastAPI 0.104.1
- πŸ—„οΈ MongoDB Atlas + Motor (async driver)
- πŸ”‘ JWT Auth + Passlib (bcrypt)
- πŸ€– PyTorch 2.9.1, Transformers 4.57, FAISS 1.13.0
- πŸ’¬ Groq (Llama 3.1 8B Instant / Llama 3.3 70B Versatile)
- 🎯 Sentence Transformers 5.1.2
### **Machine Learning**
- 🧠 **Policy Network**: BERT-base-uncased (trained with RL)
- πŸ” **Retriever**: E5-base-v2 (fine-tuned with InfoNCE + Triplet Loss)
- πŸ“Š **Vector Store**: FAISS (19,352 documents)
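Top-k retrieval over the vector store boils down to an inner-product search. The NumPy sketch below mirrors what a FAISS `IndexFlatIP` computes over L2-normalized embeddings; the vectors here are toy values, not real E5 output:

```python
import numpy as np

def top_k_docs(query_vec, doc_matrix, k=5):
    """Return (indices, scores) of the k most similar documents.
    With L2-normalized vectors, inner product equals cosine similarity."""
    scores = doc_matrix @ query_vec
    idx = np.argsort(-scores)[:k]
    return idx, scores[idx]

# Toy corpus of four normalized 3-d "embeddings":
docs = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.6, 0.8, 0.0],
    [0.0, 0.0, 1.0],
])
query = np.array([0.6, 0.8, 0.0])
idx, scores = top_k_docs(query, docs, k=2)
# idx[0] == 2: the exact match ranks first with score 1.0
```

FAISS does the same search over all 19,352 document embeddings, just with an optimized index instead of a dense matrix product.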
### **Deployment**
- 🐳 Docker (HuggingFace Spaces)
- 🤗 HuggingFace Hub (model storage)
- ☁️ MongoDB Atlas (cloud database)
- 🌐 Python 3.12 + uvicorn
---
## βš™οΈ Installation
### 🧩 Prerequisites
- Python 3.12+
- Node.js 18+
- MongoDB Atlas account (or local MongoDB 6.0+)
- Groq API key (or HuggingFace token)
### 🔧 Backend Setup (Local Development)
```bash
# Navigate to backend
cd backend
# Create virtual environment
python -m venv venv
# Activate it
source venv/bin/activate # Linux/Mac
venv\Scripts\activate # Windows
# Install dependencies
pip install -r requirements.txt
# Create environment file
cp .env.example .env
# Edit .env with your credentials (see Configuration section)
# Build FAISS index (one-time setup)
python build_faiss_index.py
# Start backend server
uvicorn app.main:app --reload --port 8000
```
### 💻 Frontend Setup
```bash
# Navigate to frontend
cd frontend
# Install dependencies
npm install
# Create environment file
cp .env.example .env
# Update VITE_API_URL to point to your backend
# Start dev server
npm run dev
```
---
## βš™οΈ Configuration
### πŸ”‘ Backend `.env` (Key Parameters)
| **Category** | **Key** | **Example / Description** |
|-------------------|----------------------------------|--------------------------------------------------|
| Environment | `ENVIRONMENT` | `development` or `production` |
| MongoDB           | `MONGODB_URI`                    | `mongodb+srv://<user>:<password>@<cluster>.mongodb.net/` |
| Authentication | `SECRET_KEY` | Generate with `python -c "import secrets; print(secrets.token_urlsafe(32))"` |
| | `ALGORITHM` | `HS256` |
| | `ACCESS_TOKEN_EXPIRE_MINUTES` | `1440` (24 hours) |
| Groq API | `GROQ_API_KEY_1` | Your primary Groq API key |
| | `GROQ_API_KEY_2` | Secondary key (optional) |
| | `GROQ_API_KEY_3` | Tertiary key (optional) |
| | `GROQ_CHAT_MODEL` | `llama-3.1-8b-instant` |
| | `GROQ_EVAL_MODEL` | `llama-3.3-70b-versatile` |
| HuggingFace | `HF_TOKEN_1` | HuggingFace token (fallback LLM) |
| | `HF_MODEL_REPO` | `eeshanyaj/questrag_models` (for model download) |
| Model Paths | `POLICY_MODEL_PATH` | `app/models/best_policy_model.pth` |
| | `RETRIEVER_MODEL_PATH` | `app/models/best_retriever_model.pth` |
| | `FAISS_INDEX_PATH` | `app/models/faiss_index.pkl` |
| | `KB_PATH` | `app/data/final_knowledge_base.jsonl` |
| Device | `DEVICE` | `cpu` or `cuda` |
| RAG Params | `TOP_K` | `5` (number of documents to retrieve) |
| | `SIMILARITY_THRESHOLD` | `0.5` (minimum similarity score) |
| Policy Network | `CONFIDENCE_THRESHOLD` | `0.7` (policy decision confidence) |
| CORS | `ALLOWED_ORIGINS` | `http://localhost:5173` or `*` |
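As a rough illustration of how `config.py` might consume a few of these keys; the repo's actual settings class may differ, and the defaults below are simply the examples from the table:

```python
import os

def load_settings(env=None):
    """Read a handful of QUESTRAG settings, falling back to the table's examples."""
    env = os.environ if env is None else env
    return {
        "environment": env.get("ENVIRONMENT", "development"),
        "top_k": int(env.get("TOP_K", "5")),
        "similarity_threshold": float(env.get("SIMILARITY_THRESHOLD", "0.5")),
        "confidence_threshold": float(env.get("CONFIDENCE_THRESHOLD", "0.7")),
        "device": env.get("DEVICE", "cpu"),
    }

settings = load_settings({"TOP_K": "10", "DEVICE": "cuda"})
# overrides apply; unset keys keep their defaults
```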
### 🌐 Frontend `.env`
```bash
# Local development
VITE_API_URL=http://localhost:8000
# Production (HuggingFace Spaces)
VITE_API_URL=https://eeshanyaj-questrag-backend.hf.space
```
---
## 🚀 Usage
### 🖥️ Local Development
#### Start Backend Server
```bash
cd backend
source venv/bin/activate # or venv\Scripts\activate
uvicorn app.main:app --reload --port 8000
```
- **Backend**: http://localhost:8000
- **API Docs**: http://localhost:8000/docs
- **Health Check**: http://localhost:8000/health
#### Start Frontend Dev Server
```bash
cd frontend
npm run dev
```
- **Frontend**: http://localhost:5173
### 🌐 Production (HuggingFace Spaces)
**Backend API**:
- **Base URL**: https://eeshanyaj-questrag-backend.hf.space
- **API Docs**: https://eeshanyaj-questrag-backend.hf.space/docs
- **Health Check**: https://eeshanyaj-questrag-backend.hf.space/health
**Frontend** (Coming Soon):
- Will be deployed on Vercel/Netlify
---
## πŸ“ Project Structure
```
questrag/
β”‚
β”œβ”€β”€ backend/
β”‚ β”œβ”€β”€ app/
β”‚ β”‚ β”œβ”€β”€ api/v1/
β”‚ β”‚ β”‚ β”œβ”€β”€ auth.py # Auth endpoints (register, login)
β”‚ β”‚ β”‚ └── chat.py # Chat endpoints
β”‚ β”‚ β”œβ”€β”€ core/
β”‚ β”‚ β”‚ β”œβ”€β”€ llm_manager.py # Groq + HF LLM orchestration
β”‚ β”‚ β”‚ └── security.py # JWT & password hashing
β”‚ β”‚ β”œβ”€β”€ ml/
β”‚ β”‚ β”‚ β”œβ”€β”€ policy_network.py # RL Policy model (BERT)
β”‚ β”‚ β”‚ └── retriever.py # E5-base-v2 retriever
β”‚ β”‚ β”œβ”€β”€ db/
β”‚ β”‚ β”‚ β”œβ”€β”€ mongodb.py # MongoDB connection
β”‚ β”‚ β”‚ └── repositories/ # User & conversation repos
β”‚ β”‚ β”œβ”€β”€ services/
β”‚ β”‚ β”‚ └── chat_service.py # Orchestration logic
β”‚ β”‚ β”œβ”€β”€ models/
β”‚ β”‚ β”‚ β”œβ”€β”€ best_policy_model.pth # Trained policy network
β”‚ β”‚ β”‚ β”œβ”€β”€ best_retriever_model.pth # Fine-tuned retriever
β”‚ β”‚ β”‚ └── faiss_index.pkl # FAISS vector store
β”‚ β”‚ β”œβ”€β”€ data/
β”‚ β”‚ β”‚ └── final_knowledge_base.jsonl # 19,352 Q&A pairs
β”‚ β”‚ β”œβ”€β”€ config.py # Settings & env vars
β”‚ β”‚ └── main.py # FastAPI app entry point
β”‚ β”œβ”€β”€ Dockerfile # Docker config for HF Spaces
β”‚ β”œβ”€β”€ requirements.txt
β”‚ └── .env.example
β”‚
└── frontend/
β”œβ”€β”€ src/
β”‚ β”œβ”€β”€ components/ # UI Components
β”‚ β”œβ”€β”€ context/ # Auth Context
β”‚ β”œβ”€β”€ pages/ # Login, Register, Chat
β”‚ β”œβ”€β”€ services/api.js # Axios Client
β”‚ β”œβ”€β”€ App.jsx
β”‚ └── main.jsx
β”œβ”€β”€ package.json
└── .env
```
---
## 📊 Datasets
### 1. Final Knowledge Base
- **Size**: 19,352 question-answer pairs
- **Categories**: 15 banking categories
- **Intents**: 22 unique intents (ATM, CARD, LOAN, ACCOUNT, etc.)
- **Source**: Combination of:
- Bitext Retail Banking Dataset (Hugging Face)
- RetailBanking-Conversations Dataset
- Manually curated FAQs from SBI, ICICI, HDFC, Yes Bank, Axis Bank
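Each knowledge-base entry is one JSON object per line of `final_knowledge_base.jsonl`. The field names below are a hypothetical sketch of such a record, not the repo's confirmed schema:

```python
import json

# One hypothetical line from final_knowledge_base.jsonl:
line = ('{"question": "How do I block my lost debit card?", '
        '"answer": "Call the bank\'s 24x7 helpline or block it from net banking.", '
        '"category": "CARD", "intent": "block_card"}')
record = json.loads(line)
# 19,352 such lines, over 15 categories and 22 intents, make up the knowledge base
```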
### 2. Retriever Training Dataset
- **Size**: 11,655 paraphrases
- **Source**: 1,665 unique FAQs from knowledge base
- **Paraphrases per FAQ**:
- 4 English paraphrases
- 2 Hinglish paraphrases
- Original FAQ
- **Training**: InfoNCE Loss + Triplet Loss with E5-base-v2
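The combined objective can be sketched on toy embeddings. This is a NumPy illustration of the two loss terms only; the actual training fine-tunes E5-base-v2 in PyTorch:

```python
import numpy as np

def info_nce(query, positive, negatives, temperature=0.05):
    """InfoNCE: softmax cross-entropy of the positive against the negatives."""
    cands = np.vstack([positive] + list(negatives))   # positive at index 0
    logits = cands @ query / temperature
    logits -= logits.max()                            # numerical stability
    return float(-np.log(np.exp(logits[0]) / np.exp(logits).sum()))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge on the gap between positive and negative distances."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, float(d_pos - d_neg + margin))

q   = np.array([1.0, 0.0])        # toy query embedding
pos = np.array([0.99, 0.14])      # its paraphrase: close to the query
neg = np.array([0.0, 1.0])        # an unrelated FAQ
good = info_nce(q, pos, [neg]) + triplet_loss(q, pos, neg)
bad  = info_nce(q, neg, [pos]) + triplet_loss(q, neg, pos)
# good is near zero, bad is large: gradients pull paraphrases toward their FAQ
```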
### 3. Policy Network Training Dataset
- **Size**: 182 queries from 6 chat sessions
- **Format**: (state, action, reward) tuples
- **Actions**: FETCH (1) or NO_FETCH (0)
- **Rewards**: +2.0 (correct), +0.5 (needed fetch), -0.5 (incorrect)
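A sketch of how the (state, action, reward) tuples could drive a REINFORCE-style update. The reward values come from the list above, but the one-feature logistic "policy" is a toy stand-in for the BERT policy network:

```python
import math

# Reward scheme from the training data above:
REWARDS = {"accurate": 2.0, "needed_fetch": 0.5, "incorrect": -0.5}

def policy_prob_fetch(w, b, x):
    """Toy one-feature logistic policy: P(FETCH = 1 | x)."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

def reinforce_step(w, b, episode, lr=0.1):
    """One pass of reward-weighted log-likelihood ascent over
    (feature, action, outcome) tuples."""
    for x, action, outcome in episode:
        p = policy_prob_fetch(w, b, x)
        r = REWARDS[outcome]
        grad = r * (action - p)   # d/dz of r * log pi(action | logit z)
        w += lr * grad * x
        b += lr * grad
    return w, b

# Toy episode: fetching on x=1.0 was accurate, fetching on x=-1.0 was not.
w, b = reinforce_step(0.0, 0.0, [(1.0, 1, "accurate"), (-1.0, 1, "incorrect")])
# both updates push w upward: P(FETCH) rises exactly where fetching paid off
```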
---
## 📈 Performance Metrics
*Coming soon: Detailed performance metrics including accuracy, response time, token cost reduction, and user satisfaction scores.*
---
## 📚 API Documentation
### Authentication
#### Register
```http
POST /api/v1/auth/register
Content-Type: application/json
{
"username": "john_doe",
"email": "[email protected]",
"password": "securepassword123"
}
```
**Response:**
```json
{
"message": "User registered successfully",
"user_id": "507f1f77bcf86cd799439011"
}
```
#### Login
```http
POST /api/v1/auth/login
Content-Type: application/json
{
"username": "john_doe",
"password": "securepassword123"
}
```
**Response:**
```json
{
"access_token": "eyJhbGciOiJIUzI1NiIs...",
"token_type": "bearer"
}
```
---
### Chat
#### Send Message
```http
POST /api/v1/chat/
Authorization: Bearer <token>
Content-Type: application/json
{
"query": "What are the interest rates for home loans?",
"conversation_id": "optional-session-id"
}
```
**Response:**
```json
{
"response": "Current home loan interest rates range from 8.5% to 9.5% per annum...",
"conversation_id": "abc123",
"metadata": {
"policy_action": "FETCH",
"retrieval_score": 0.89,
"documents_retrieved": 5,
"llm_provider": "groq"
}
}
```
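The authenticated chat call can be assembled with nothing but the standard library. This sketch only *builds* the request; sending it requires a live deployment and a real token, both of which are placeholders here:

```python
import json
import urllib.request

BASE_URL = "https://eeshanyaj-questrag-backend.hf.space"  # or http://localhost:8000

def build_chat_request(token, query, conversation_id=None):
    """Construct the authenticated POST /api/v1/chat/ request (not yet sent)."""
    payload = {"query": query}
    if conversation_id:
        payload["conversation_id"] = conversation_id
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/chat/",
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("eyJhbGciOiJIUzI1NiIs...",
                         "What are the interest rates for home loans?")
# urllib.request.urlopen(req) would send it; json.load(...) parses the response
```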
#### Get Conversation History
```http
GET /api/v1/chat/conversations/{conversation_id}
Authorization: Bearer <token>
```
**Response:**
```json
{
"conversation_id": "abc123",
"messages": [
{
"role": "user",
"content": "What are the interest rates?",
"timestamp": "2025-11-28T10:30:00Z"
},
{
"role": "assistant",
"content": "Current rates are...",
"timestamp": "2025-11-28T10:30:05Z",
"metadata": {
"policy_action": "FETCH"
}
}
]
}
```
#### List All Conversations
```http
GET /api/v1/chat/conversations
Authorization: Bearer <token>
```
#### Delete Conversation
```http
DELETE /api/v1/chat/conversation/{conversation_id}
Authorization: Bearer <token>
```
---
## 🚀 Deployment
### HuggingFace Spaces (Backend)
The backend is deployed on HuggingFace Spaces using Docker:
1. **Models are stored** on HuggingFace Hub: `eeshanyaj/questrag_models`
2. **On first startup**, models are automatically downloaded from HF Hub
3. **Docker container** runs FastAPI with uvicorn on port 7860
4. **Environment secrets** are securely managed in HF Space settings
**Deployment Steps:**
```bash
# 1. Upload models to HuggingFace Hub
huggingface-cli upload eeshanyaj/questrag_models \
app/models/best_policy_model.pth \
models/best_policy_model.pth
# 2. Push backend code to HF Space
git remote add space https://huggingface.co/spaces/eeshanyaj/questrag-backend
git push space main
# 3. Add environment secrets in HF Space Settings
# (MongoDB URI, Groq keys, JWT secret, etc.)
```
### Frontend Deployment (Vercel/Netlify)
```bash
# Build for production
npm run build
# Deploy to Vercel
vercel --prod
# Update .env.production with backend URL
VITE_API_URL=https://eeshanyaj-questrag-backend.hf.space
```
---
## 🤝 Contributing
Contributions are welcome! Please follow these steps:
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
### Development Guidelines
- Follow PEP 8 for Python code
- Use ESLint + Prettier for JavaScript/React
- Write comprehensive docstrings and comments
- Add unit tests for new features
- Update documentation accordingly
---
## 📄 License
MIT License. See [LICENSE](LICENSE).
---
## πŸ™ Acknowledgments
### Research Inspiration
- **Main Paper**: "Optimizing Retrieval Augmented Generation for Domain-Specific Chatbots with Reinforcement Learning" (AAAI 2024)
- **Additional References**:
- "Evaluating BERT-based Rewards for Question Generation with RL"
- "Self-Reasoning for Retrieval-Augmented Language Models"
### Open Source Resources
- [RL-Self-Improving-RAG](https://github.com/subrata-samanta/RL-Self-Improving-RAG)
- [ARENA](https://github.com/ren258/ARENA)
- [RAGTechniques](https://github.com/NirDiamant/RAGTechniques)
- [Financial-RAG-From-Scratch](https://github.com/cse-amarjeet/Financial-RAG-From-Scratch)
### Datasets
- [Bitext Retail Banking Dataset](https://huggingface.co/datasets/bitext/Bitext-retail-banking-llm-chatbot-training-dataset)
- [RetailBanking-Conversations](https://huggingface.co/datasets/oopere/RetailBanking-Conversations)
### Technologies
- [FastAPI](https://fastapi.tiangolo.com/)
- [React](https://reactjs.org/)
- [HuggingFace](https://huggingface.co/)
- [Groq](https://groq.com/)
- [MongoDB Atlas](https://www.mongodb.com/cloud/atlas)
---
## 📞 Contact
**Eeshanya Amit Joshi**
📧 [Email](mailto:[email protected])
💼 [LinkedIn](https://www.linkedin.com/in/eeshanyajoshi/)
---
## 📈 Status
### ✅ **Backend Deployed & Live!**
- 🚀 Backend API running on [HuggingFace Spaces](https://eeshanyaj-questrag-backend.hf.space)
- 📚 API documentation available at [/docs](https://eeshanyaj-questrag-backend.hf.space/docs)
- 💚 Health status: [Check here](https://eeshanyaj-questrag-backend.hf.space/health)
### 🚧 **Frontend Deployment - Coming Soon!**
- Will be deployed on Vercel/Netlify
- Stay tuned for the full application link! ❤️
---
## 🔗 Links
- **Live Backend API:** https://eeshanyaj-questrag-backend.hf.space
- **API Documentation:** https://eeshanyaj-questrag-backend.hf.space/docs
- **Health Check:** https://eeshanyaj-questrag-backend.hf.space/health
- **HuggingFace Space:** https://huggingface.co/spaces/eeshanyaj/questrag-backend
- **Model Repository:** https://huggingface.co/eeshanyaj/questrag_models
- **Research Paper:** [AAAI 2024 Workshop](https://arxiv.org/abs/2401.06800)
---
<p align="center">✨ Made with ❀️ for the Banking Industry ✨</p>
<p align="center">Powered by HuggingFace πŸ€—| Groq ⚑| MongoDB πŸƒ| Docker 🐳| </p>