eeshanyaj committed 25 days ago

Commit

0c73a18

1 Parent(s): eddc8a4

hf_space readme changed

Files changed (1)

README.md +589 -21

README.md CHANGED

@@ -1,28 +1,596 @@
 ---
-title: QUESTRAG Backend
-emoji: 🏦
-colorFrom: blue
-colorTo: green
-sdk: docker
-pinned: false
 ---
-# QUESTRAG Banking Chatbot Backend
-FastAPI backend for QUESTRAG - Banking RAG Chatbot with Reinforcement Learning.
-## Features
-- 🤖 RAG Pipeline with FAISS
-- 🧠 RL-based Policy Network
-- ⚡ Groq (Llama 3) + HuggingFace fallback
-- 🔐 JWT Authentication
-- 📊 MongoDB Atlas
-## API Documentation
-Visit `/docs` for interactive Swagger UI.
-## Endpoints
-- `POST /api/v1/auth/register` - Register new user
-- `POST /api/v1/auth/login` - Login
-- `POST /api/v1/chat/` - Send message (requires auth)
-- `GET /health` - Health check

+# 🏦 QUESTRAG - Banking QUEries and Support system via Trained Reinforced RAG
+[![Python 3.12](https://img.shields.io/badge/python-3.12-blue.svg)](https://www.python.org/downloads/)
+[![FastAPI](https://img.shields.io/badge/FastAPI-0.104.1-green.svg)](https://fastapi.tiangolo.com/)
+[![React](https://img.shields.io/badge/React-18.3.1-blue.svg)](https://reactjs.org/)
+[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+[![Deployed on HuggingFace](https://img.shields.io/badge/🤗-HuggingFace%20Spaces-yellow)](https://huggingface.co/spaces/eeshanyaj/questrag-backend)
+> An intelligent banking chatbot powered by **Retrieval-Augmented Generation (RAG)** and **Reinforcement Learning (RL)** to provide accurate, context-aware responses to Indian banking queries while optimizing token costs.
+---
+## 📋 Table of Contents
+- [Overview](#overview)
+- [Key Features](#key-features)
+- [System Architecture](#system-architecture)
+- [Technology Stack](#technology-stack)
+- [Installation](#installation)
+- [Configuration](#configuration)
+- [Usage](#usage)
+- [Project Structure](#project-structure)
+- [Datasets](#datasets)
+- [Performance Metrics](#performance-metrics)
+- [API Documentation](#api-documentation)
+- [Deployment](#deployment)
+- [Contributing](#contributing)
+- [License](#license)
+- [Acknowledgments](#acknowledgments)
+- [Contact](#contact)
+- [Status](#status)
+- [Links](#links)
+---
+## 🎯 Overview
+QUESTRAG is an **advanced banking chatbot** designed to revolutionize customer support in the Indian banking sector. By combining **Retrieval-Augmented Generation (RAG)** with **Reinforcement Learning (RL)**, the system intelligently decides when to fetch external context from a knowledge base and when to respond directly, **reducing token costs by up to 31%** while maintaining high accuracy.
+### Problem Statement
+Existing banking chatbots suffer from:
+- ❌ Limited response flexibility (rigid, rule-based systems)
+- ❌ Poor handling of informal/real-world queries
+- ❌ Lack of contextual understanding
+- ❌ High operational costs due to inefficient token usage
+- ❌ Low user satisfaction and trust
+### Solution
+QUESTRAG addresses these challenges through:
+- ✅ **Domain-specific RAG** built on a knowledge base of 19,000+ banking question-answer pairs
+- ✅ **RL-optimized policy network** (BERT-based) for smart context-fetching decisions
+- ✅ **Fine-tuned retriever model** (E5-base-v2) using InfoNCE + Triplet Loss
+- ✅ **Groq LLM with HuggingFace fallback** for reliable, fast responses
+- ✅ **Full-stack web application** with modern UI/UX and JWT authentication
+---
+## 🌟 Key Features
+### 🤖 Intelligent RAG Pipeline
+- **FAISS-powered retrieval** for fast similarity search across 19,352 documents
+- **Fine-tuned embedding model** (`e5-base-v2`) trained on English + Hinglish paraphrases
+- **Context-aware response generation** using Llama 3 models (8B & 70B) via Groq
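The retrieval step can be pictured as a scored top-k search with a similarity cut-off. The sketch below is illustrative only: it swaps FAISS for a brute-force cosine search in plain Python, and `cosine`, `top_k_documents`, and the toy vectors are hypothetical names that do not come from the project. The defaults mirror the `TOP_K` and `SIMILARITY_THRESHOLD` values listed under Configuration.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_documents(
    query_vec: Sequence[float],
    doc_vecs: Sequence[Sequence[float]],
    k: int = 5,              # assumption: mirrors TOP_K
    threshold: float = 0.5,  # assumption: mirrors SIMILARITY_THRESHOLD
) -> List[Tuple[int, float]]:
    """Return up to k (doc_index, score) pairs with score >= threshold, best first."""
    scored = [(i, cosine(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    kept = [pair for pair in scored if pair[1] >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:k]


if __name__ == "__main__":
    docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy 2-d "embeddings"
    print(top_k_documents([1.0, 0.0], docs))
```

A real index avoids scoring every document on each query, but the filtering and ordering logic is the same idea.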
+### 🧠 Reinforcement Learning System
+- **BERT-based policy network** (`bert-base-uncased`) for FETCH/NO_FETCH decisions
+- **Reward-driven optimization** (+2.0 accurate, +0.5 needed fetch, -0.5 incorrect)
+- **Up to 31% token cost reduction** by skipping retrieval when the policy decides it is not needed
+### 🎨 Modern Web Interface
+- **React 18 + Vite** with Tailwind CSS
+- **Real-time chat**, conversation history, JWT authentication
+- **Responsive design** for desktop and mobile
+### 🔐 Enterprise-Ready Backend
+- **FastAPI + MongoDB Atlas** for scalable async operations
+- **JWT authentication** with secure password hashing (bcrypt)
+- **Multi-provider LLM** (Groq → HuggingFace automatic fallback)
+- **Deployed on HuggingFace Spaces** with Docker containerization
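The Groq → HuggingFace fallback can be thought of as trying providers in a fixed order and returning the first successful answer. This is a minimal sketch under that assumption; `generate_with_fallback` and the two stub providers are hypothetical and are not the project's `llm_manager.py`.

```python
from typing import Callable, List, Sequence, Tuple

# A provider is a (name, callable) pair; the callable takes a prompt and returns text.
Provider = Tuple[str, Callable[[str], str]]


def generate_with_fallback(prompt: str, providers: Sequence[Provider]) -> Tuple[str, str]:
    """Try each provider in order and return (provider_name, text) from the first success."""
    errors: List[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # broad on purpose: any failure moves on to the next provider
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def groq_stub(prompt: str) -> str:
    """Hypothetical primary provider that always fails, to exercise the fallback path."""
    raise TimeoutError("rate limited")


def hf_stub(prompt: str) -> str:
    """Hypothetical fallback provider that echoes the prompt."""
    return f"echo: {prompt}"


if __name__ == "__main__":
    print(generate_with_fallback("hello", [("groq", groq_stub), ("huggingface", hf_stub)]))
```

The same loop could cover several API keys for one provider by listing each key as its own entry.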
+---
+## 🏗️ System Architecture
+<p align="center">
+  <img src="./assets/system.png" alt="System Architecture Diagram" width="750"/>
+</p>
+### 🔄 Workflow
+1. **User Query** → FastAPI receives query via REST API
+2. **Policy Decision** → BERT-based RL model decides FETCH or NO_FETCH
+3. **Conditional Retrieval** → If FETCH → Retrieve top-5 docs from FAISS using E5-base-v2
+4. **Response Generation** → Llama 3 (via Groq) generates final answer
+5. **Evaluation & Logging** → Logged in MongoDB + reward-based model update
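Steps 1 to 4 of the workflow can be sketched as one function that consults the policy, retrieves only on FETCH, and then generates. This is an illustration, not the project's `chat_service.py`: `answer_query`, `ChatResult`, and the toy policy, retriever, and LLM are hypothetical stand-ins, and step 5 (logging and reward updates) is left out.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ChatResult:
    response: str
    policy_action: str
    documents: List[str] = field(default_factory=list)


def answer_query(
    query: str,
    decide: Callable[[str], str],                  # returns "FETCH" or "NO_FETCH"
    retrieve: Callable[[str, int], List[str]],     # returns up to top_k documents
    generate: Callable[[str, List[str]], str],     # produces the final answer
    top_k: int = 5,
) -> ChatResult:
    """Run policy decision, conditional retrieval, and generation for one query."""
    action = decide(query)
    documents = retrieve(query, top_k) if action == "FETCH" else []
    response = generate(query, documents)
    return ChatResult(response=response, policy_action=action, documents=documents)


# Toy stand-ins (assumptions) so the sketch runs without any model.
def toy_policy(query: str) -> str:
    greetings = ("hi", "hello", "thanks")
    return "NO_FETCH" if query.lower().startswith(greetings) else "FETCH"


def toy_retriever(query: str, k: int) -> List[str]:
    return [f"doc-{i}" for i in range(k)]


def toy_llm(query: str, docs: List[str]) -> str:
    return f"answer to {query!r} using {len(docs)} docs"


if __name__ == "__main__":
    print(answer_query("hello", toy_policy, toy_retriever, toy_llm))
    print(answer_query("What is the home loan rate?", toy_policy, toy_retriever, toy_llm))
```

Passing the three stages in as callables keeps the control flow testable without loading BERT, FAISS, or an LLM client.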
+---
+## 🔄 Sequence Diagram
+<p align="center">
+  <img src="./assets/sequence_diagram.png" alt="Sequence Diagram" width="750"/>
+</p>
+---
+## 🛠️ Technology Stack
+### **Frontend**
+- ⚛️ React 18.3.1 + Vite 5.4.2
+- 🎨 Tailwind CSS 3.4.1
+- 🔄 React Context API + Axios + React Router DOM
+### **Backend**
+- 🚀 FastAPI 0.104.1
+- 🗄️ MongoDB Atlas + Motor (async driver)
+- 🔑 JWT Auth + Passlib (bcrypt)
+- 🤖 PyTorch 2.9.1, Transformers 4.57, FAISS 1.13.0
+- 💬 Groq (Llama 3.1 8B Instant / Llama 3.3 70B Versatile)
+- 🎯 Sentence Transformers 5.1.2
+### **Machine Learning**
+- 🧠 **Policy Network**: BERT-base-uncased (trained with RL)
+- 🔍 **Retriever**: E5-base-v2 (fine-tuned with InfoNCE + Triplet Loss)
+- 📊 **Vector Store**: FAISS (19,352 documents)
+### **Deployment**
+- 🐳 Docker (HuggingFace Spaces)
+- 🤗 HuggingFace Hub (model storage)
+- ☁️ MongoDB Atlas (cloud database)
+- 🌐 Python 3.12 + uvicorn
+---
+## ⚙️ Installation
+### 🧩 Prerequisites
+- Python 3.12+
+- Node.js 18+
+- MongoDB Atlas account (or local MongoDB 6.0+)
+- Groq API key (or HuggingFace token)
+### 🔧 Backend Setup (Local Development)
+```bash
+# Navigate to backend
+cd backend
+# Create virtual environment
+python -m venv venv
+# Activate it (run only the line for your OS)
+source venv/bin/activate  # Linux/Mac
+venv\Scripts\activate     # Windows
+# Install dependencies
+pip install -r requirements.txt
+# Create environment file
+cp .env.example .env
+# Edit .env with your credentials (see Configuration section)
+# Build FAISS index (one-time setup)
+python build_faiss_index.py
+# Start backend server
+uvicorn app.main:app --reload --port 8000
+```
+### 💻 Frontend Setup
+```bash
+# Navigate to frontend
+cd frontend
+# Install dependencies
+npm install
+# Create environment file
+cp .env.example .env
+# Update VITE_API_URL to point to your backend
+# Start dev server
+npm run dev
+```
 ---
+## ⚙️ Configuration
+### 🔑 Backend `.env` (Key Parameters)
+| **Category**      | **Key**                          | **Example / Description**                        |
+|-------------------|----------------------------------|--------------------------------------------------|
+| Environment       | `ENVIRONMENT`                    | `development` or `production`                    |
+| MongoDB           | `MONGODB_URI`                    | `mongodb+srv://<user>:<password>@<cluster-host>/` |
+| Authentication    | `SECRET_KEY`                     | Generate with `python -c "import secrets; print(secrets.token_urlsafe(32))"` |
+|                   | `ALGORITHM`                      | `HS256`                                          |
+|                   | `ACCESS_TOKEN_EXPIRE_MINUTES`    | `1440` (24 hours)                                |
+| Groq API          | `GROQ_API_KEY_1`                 | Your primary Groq API key                        |
+|                   | `GROQ_API_KEY_2`                 | Secondary key (optional)                         |
+|                   | `GROQ_API_KEY_3`                 | Tertiary key (optional)                          |
+|                   | `GROQ_CHAT_MODEL`                | `llama-3.1-8b-instant`                           |
+|                   | `GROQ_EVAL_MODEL`                | `llama-3.3-70b-versatile`                        |
+| HuggingFace       | `HF_TOKEN_1`                     | HuggingFace token (fallback LLM)                 |
+|                   | `HF_MODEL_REPO`                  | `eeshanyaj/questrag_models` (for model download) |
+| Model Paths       | `POLICY_MODEL_PATH`              | `app/models/best_policy_model.pth`               |
+|                   | `RETRIEVER_MODEL_PATH`           | `app/models/best_retriever_model.pth`            |
+|                   | `FAISS_INDEX_PATH`               | `app/models/faiss_index.pkl`                     |
+|                   | `KB_PATH`                        | `app/data/final_knowledge_base.jsonl`            |
+| Device            | `DEVICE`                         | `cpu` or `cuda`                                  |
+| RAG Params        | `TOP_K`                          | `5` (number of documents to retrieve)            |
+|                   | `SIMILARITY_THRESHOLD`           | `0.5` (minimum similarity score)                 |
+| Policy Network    | `CONFIDENCE_THRESHOLD`           | `0.7` (policy decision confidence)               |
+| CORS              | `ALLOWED_ORIGINS`                | `http://localhost:5173` or `*`                   |
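One way to read these values at startup is a small typed settings object. The sketch below is an assumption about how such loading could look, not the project's `config.py`: `Settings` and `load_settings` are hypothetical names, the defaults simply reuse the example values from the table, and treating `ALLOWED_ORIGINS` as a comma-separated list is a guess.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping


@dataclass(frozen=True)
class Settings:
    top_k: int
    similarity_threshold: float
    confidence_threshold: float
    device: str
    groq_api_keys: List[str]
    allowed_origins: List[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, skipping unset optional Groq keys."""
    key_names = ("GROQ_API_KEY_1", "GROQ_API_KEY_2", "GROQ_API_KEY_3")
    groq_keys = [env[name] for name in key_names if env.get(name)]
    # Assumption: multiple origins are separated by commas.
    raw_origins = env.get("ALLOWED_ORIGINS", "http://localhost:5173")
    origins = [o.strip() for o in raw_origins.split(",") if o.strip()]
    return Settings(
        top_k=int(env.get("TOP_K", "5")),
        similarity_threshold=float(env.get("SIMILARITY_THRESHOLD", "0.5")),
        confidence_threshold=float(env.get("CONFIDENCE_THRESHOLD", "0.7")),
        device=env.get("DEVICE", "cpu"),
        groq_api_keys=groq_keys,
        allowed_origins=origins,
    )


if __name__ == "__main__":
    print(load_settings({"GROQ_API_KEY_1": "example-key", "TOP_K": "3"}))
```

Parsing numeric values once at startup surfaces a malformed `.env` immediately instead of at the first request.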
+### 🌐 Frontend `.env`
+```bash
+# Local development
+VITE_API_URL=http://localhost:8000
+# Production (HuggingFace Spaces)
+VITE_API_URL=https://eeshanyaj-questrag-backend.hf.space
+```
 ---
+## 🚀 Usage
+### 🖥️ Local Development
+#### Start Backend Server
+```bash
+cd backend
+source venv/bin/activate  # or venv\Scripts\activate
+uvicorn app.main:app --reload --port 8000
+```
+- **Backend**: http://localhost:8000
+- **API Docs**: http://localhost:8000/docs
+- **Health Check**: http://localhost:8000/health
+#### Start Frontend Dev Server
+```bash
+cd frontend
+npm run dev
+```
+- **Frontend**: http://localhost:5173
+### 🌐 Production (HuggingFace Spaces)
+**Backend API**:
+- **Base URL**: https://eeshanyaj-questrag-backend.hf.space
+- **API Docs**: https://eeshanyaj-questrag-backend.hf.space/docs
+- **Health Check**: https://eeshanyaj-questrag-backend.hf.space/health
+**Frontend** (Coming Soon):
+- Will be deployed on Vercel/Netlify
+---
+## 📁 Project Structure
+```
+questrag/
+│
+├── backend/
+│   ├── app/
+│   │   ├── api/v1/
+│   │   │   ├── auth.py              # Auth endpoints (register, login)
+│   │   │   └── chat.py              # Chat endpoints
+│   │   ├── core/
+│   │   │   ├── llm_manager.py       # Groq + HF LLM orchestration
+│   │   │   └── security.py          # JWT & password hashing
+│   │   ├── ml/
+│   │   │   ├── policy_network.py    # RL Policy model (BERT)
+│   │   │   └── retriever.py         # E5-base-v2 retriever
+│   │   ├── db/
+│   │   │   ├── mongodb.py           # MongoDB connection
+│   │   │   └── repositories/        # User & conversation repos
+│   │   ├── services/
+│   │   │   └── chat_service.py      # Orchestration logic
+│   │   ├── models/
+│   │   │   ├── best_policy_model.pth      # Trained policy network
+│   │   │   ├── best_retriever_model.pth   # Fine-tuned retriever
+│   │   │   └── faiss_index.pkl            # FAISS vector store
+│   │   ├── data/
+│   │   │   └── final_knowledge_base.jsonl # 19,352 Q&A pairs
+│   │   ├── config.py                # Settings & env vars
+│   │   └── main.py                  # FastAPI app entry point
+│   ├── Dockerfile                   # Docker config for HF Spaces
+│   ├── requirements.txt
+│   └── .env.example
+│
+└── frontend/
+    ├── src/
+    │   ├── components/              # UI Components
+    │   ├── context/                 # Auth Context
+    │   ├── pages/                   # Login, Register, Chat
+    │   ├── services/api.js          # Axios Client
+    │   ├── App.jsx
+    │   └── main.jsx
+    ├── package.json
+    └── .env
+```
+---
+## 📊 Datasets
+### 1. Final Knowledge Base
+- **Size**: 19,352 question-answer pairs
+- **Categories**: 15 banking categories
+- **Intents**: 22 unique intents (ATM, CARD, LOAN, ACCOUNT, etc.)
+- **Source**: Combination of:
+  - Bitext Retail Banking Dataset (Hugging Face)
+  - RetailBanking-Conversations Dataset
+  - Manually curated FAQs from SBI, ICICI, HDFC, Yes Bank, Axis Bank
+### 2. Retriever Training Dataset
+- **Size**: 11,655 training examples (1,665 FAQs × 7 variants each: 6 paraphrases plus the original)
+- **Source**: 1,665 unique FAQs from knowledge base
+- **Paraphrases per FAQ**:
+  - 4 English paraphrases
+  - 2 Hinglish paraphrases
+  - Original FAQ
+- **Training**: InfoNCE Loss + Triplet Loss with E5-base-v2
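For readers unfamiliar with the two losses, here is a small numeric sketch on precomputed cosine similarities. It is not the project's training code: the temperature, margin, weighting, and the choice of the hardest negative for the triplet term are all assumptions, and `info_nce`, `triplet`, and `combined_loss` are hypothetical names.

```python
import math
from typing import Sequence


def info_nce(pos_sim: float, neg_sims: Sequence[float], temperature: float = 0.05) -> float:
    """InfoNCE: -log( exp(s+/t) / (exp(s+/t) + sum_j exp(s-_j/t)) )."""
    logits = [pos_sim / temperature] + [s / temperature for s in neg_sims]
    peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_denominator - logits[0]


def triplet(pos_sim: float, neg_sim: float, margin: float = 0.2) -> float:
    """Triplet hinge on similarities: max(0, margin - (s+ - s-)).

    With distance d = 1 - s this equals the usual max(0, d+ - d- + margin).
    """
    return max(0.0, margin - (pos_sim - neg_sim))


def combined_loss(
    pos_sim: float,
    neg_sims: Sequence[float],
    temperature: float = 0.05,  # assumption
    margin: float = 0.2,        # assumption
    weight: float = 1.0,        # assumption: relative weight of the triplet term
) -> float:
    """Sum of InfoNCE over all negatives and a triplet term on the hardest negative."""
    hardest_negative = max(neg_sims)
    return info_nce(pos_sim, neg_sims, temperature) + weight * triplet(
        pos_sim, hardest_negative, margin
    )


if __name__ == "__main__":
    print(combined_loss(0.9, [0.1, 0.2, 0.85]))
```

InfoNCE contrasts the positive against all negatives at once, while the triplet term enforces a fixed margin against a single negative.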
+### 3. Policy Network Training Dataset
+- **Size**: 182 queries from 6 chat sessions
+- **Format**: (state, action, reward) tuples
+- **Actions**: FETCH (1) or NO_FETCH (0)
+- **Rewards**: +2.0 (correct), +0.5 (needed fetch), -0.5 (incorrect)
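The tuple format can be made concrete with a small record type. This is a sketch only: `Transition`, `make_transition`, and `average_reward` are hypothetical names, representing the state as the raw query text is an assumption, and the outcome labels simply reuse the wording of the reward list above.

```python
from dataclasses import dataclass
from typing import Iterable

FETCH, NO_FETCH = 1, 0

# Reward values as listed above, keyed by outcome label.
REWARDS = {"correct": 2.0, "needed_fetch": 0.5, "incorrect": -0.5}


@dataclass(frozen=True)
class Transition:
    state: str     # assumption: the query text the policy sees
    action: int    # FETCH (1) or NO_FETCH (0)
    reward: float


def make_transition(state: str, action: int, outcome: str) -> Transition:
    """Create one (state, action, reward) record from an outcome label."""
    if action not in (FETCH, NO_FETCH):
        raise ValueError(f"unknown action: {action}")
    return Transition(state=state, action=action, reward=REWARDS[outcome])


def average_reward(transitions: Iterable[Transition]) -> float:
    """Mean reward over a batch of transitions (0.0 for an empty batch)."""
    rewards = [t.reward for t in transitions]
    return sum(rewards) / len(rewards) if rewards else 0.0


if __name__ == "__main__":
    batch = [
        make_transition("Hello!", NO_FETCH, "correct"),
        make_transition("What documents do I need for a home loan?", FETCH, "needed_fetch"),
    ]
    print(average_reward(batch))
```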
+---
+## 📈 Performance Metrics
+*Coming soon: Detailed performance metrics including accuracy, response time, token cost reduction, and user satisfaction scores.*
+---
+## 📚 API Documentation
+### Authentication
+#### Register
+```http
+POST /api/v1/auth/register
+Content-Type: application/json
+
+{
+  "username": "john_doe",
+  "email": "user@example.com",
+  "password": "securepassword123"
+}
+```
+**Response:**
+```json
+{
+  "message": "User registered successfully",
+  "user_id": "507f1f77bcf86cd799439011"
+}
+```
+#### Login
+```http
+POST /api/v1/auth/login
+Content-Type: application/json
+
+{
+  "username": "john_doe",
+  "password": "securepassword123"
+}
+```
+**Response:**
+```json
+{
+  "access_token": "eyJhbGciOiJIUzI1NiIs...",
+  "token_type": "bearer"
+}
+```
+---
+### Chat
+#### Send Message
+```http
+POST /api/v1/chat/
+Authorization: Bearer <token>
+Content-Type: application/json
+
+{
+  "query": "What are the interest rates for home loans?",
+  "conversation_id": "optional-session-id"
+}
+```
+**Response:**
+```json
+{
+  "response": "Current home loan interest rates range from 8.5% to 9.5% per annum...",
+  "conversation_id": "abc123",
+  "metadata": {
+    "policy_action": "FETCH",
+    "retrieval_score": 0.89,
+    "documents_retrieved": 5,
+    "llm_provider": "groq"
+  }
+}
+```
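A client call to this endpoint could be assembled with the standard library as follows. This is a sketch, not an official client: `build_chat_request` and `send` are hypothetical helpers, the base URL is the local development address from the Usage section, and treating `conversation_id` as optional is inferred from the placeholder value in the request example.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:8000"  # local development backend


def build_chat_request(
    token: str,
    query: str,
    conversation_id: Optional[str] = None,
    base_url: str = BASE_URL,
) -> urllib.request.Request:
    """Build (but do not send) an authenticated POST to /api/v1/chat/."""
    payload = {"query": query}
    if conversation_id is not None:
        payload["conversation_id"] = conversation_id
    return urllib.request.Request(
        url=f"{base_url}/api/v1/chat/",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    req = build_chat_request("<token>", "What are the interest rates for home loans?")
    print(req.get_method(), req.full_url)
    # reply = send(req)  # requires a running backend and a valid token from /api/v1/auth/login
```

Separating request construction from sending makes it easy to inspect headers and payload before any network call is made.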
+#### Get Conversation History
+```http
+GET /api/v1/chat/conversations/{conversation_id}
+Authorization: Bearer <token>
+```
+**Response:**
+```json
+{
+  "conversation_id": "abc123",
+  "messages": [
+    {
+      "role": "user",
+      "content": "What are the interest rates?",
+      "timestamp": "2025-11-28T10:30:00Z"
+    },
+    {
+      "role": "assistant",
+      "content": "Current rates are...",
+      "timestamp": "2025-11-28T10:30:05Z",
+      "metadata": {
+        "policy_action": "FETCH"
+      }
+    }
+  ]
+}
+```
+#### List All Conversations
+```http
+GET /api/v1/chat/conversations
+Authorization: Bearer <token>
+```
+#### Delete Conversation
+```http
+DELETE /api/v1/chat/conversation/{conversation_id}
+Authorization: Bearer <token>
+```
+---
+## 🚀 Deployment
+### HuggingFace Spaces (Backend)
+The backend is deployed on HuggingFace Spaces using Docker:
+1. **Models are stored** on HuggingFace Hub: `eeshanyaj/questrag_models`
+2. **On first startup**, models are automatically downloaded from HF Hub
+3. **Docker container** runs FastAPI with uvicorn on port 7860
+4. **Environment secrets** are securely managed in HF Space settings
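The download-on-first-startup behaviour in step 2 could look roughly like the sketch below, which fetches a file from the Hub only when it is missing locally. `ensure_model` is a hypothetical helper and the `models/` path inside the repository is assumed from the upload command in the deployment steps; `hf_hub_download` is the real `huggingface_hub` function and needs that package installed.

```python
from pathlib import Path
from typing import Optional


def ensure_model(
    filename: str,
    repo_id: str = "eeshanyaj/questrag_models",
    local_dir: str = "app",
    token: Optional[str] = None,
) -> Path:
    """Return the local path of a model file, downloading it from the Hub if absent."""
    target = Path(local_dir) / filename
    if target.exists():
        return target  # already present: skip the network call
    # Imported lazily so the cached path works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    downloaded = hf_hub_download(
        repo_id=repo_id,
        filename=filename,
        local_dir=local_dir,
        token=token,
    )
    return Path(downloaded)


if __name__ == "__main__":
    # Assumption: the file lives under models/ in the Hub repo.
    print(ensure_model("models/best_policy_model.pth"))
```

With `local_dir="app"`, a file stored as `models/best_policy_model.pth` lands at `app/models/best_policy_model.pth`, which matches `POLICY_MODEL_PATH` in the configuration table.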
+**Deployment Steps:**
+```bash
+# 1. Upload models to HuggingFace Hub
+huggingface-cli upload eeshanyaj/questrag_models \
+  app/models/best_policy_model.pth \
+  models/best_policy_model.pth
+# 2. Push backend code to HF Space
+git remote add space https://huggingface.co/spaces/eeshanyaj/questrag-backend
+git push space main
+# 3. Add environment secrets in HF Space Settings
+# (MongoDB URI, Groq keys, JWT secret, etc.)
+```
+### Frontend Deployment (Vercel/Netlify)
+```bash
+# Add the backend URL to .env.production before building
+# (Vite reads VITE_* variables at build time)
+VITE_API_URL=https://eeshanyaj-questrag-backend.hf.space
+# Build for production
+npm run build
+# Deploy to Vercel
+vercel --prod
+```
+---
+## 🤝 Contributing
+Contributions are welcome! Please follow these steps:
+1. Fork the repository
+2. Create a feature branch (`git checkout -b feature/amazing-feature`)
+3. Commit your changes (`git commit -m 'Add amazing feature'`)
+4. Push to the branch (`git push origin feature/amazing-feature`)
+5. Open a Pull Request
+### Development Guidelines
+- Follow PEP 8 for Python code
+- Use ESLint + Prettier for JavaScript/React
+- Write comprehensive docstrings and comments
+- Add unit tests for new features
+- Update documentation accordingly
+---
+## 📄 License
+MIT License — see [LICENSE](LICENSE)
+---
+## 🙏 Acknowledgments
+### Research Inspiration
+- **Main Paper**: "Optimizing Retrieval Augmented Generation for Domain-Specific Chatbots with Reinforcement Learning" (AAAI 2024 Workshop)
+- **Additional References**:
+  - "Evaluating BERT-based Rewards for Question Generation with RL"
+  - "Self-Reasoning for Retrieval-Augmented Language Models"
+### Open Source Resources
+- [RL-Self-Improving-RAG](https://github.com/subrata-samanta/RL-Self-Improving-RAG)
+- [ARENA](https://github.com/ren258/ARENA)
+- [RAG_Techniques](https://github.com/NirDiamant/RAG_Techniques)
+- [Financial-RAG-From-Scratch](https://github.com/cse-amarjeet/Financial-RAG-From-Scratch)
+### Datasets
+- [Bitext Retail Banking Dataset](https://huggingface.co/datasets/bitext/Bitext-retail-banking-llm-chatbot-training-dataset)
+- [RetailBanking-Conversations](https://huggingface.co/datasets/oopere/RetailBanking-Conversations)
+### Technologies
+- [FastAPI](https://fastapi.tiangolo.com/)
+- [React](https://reactjs.org/)
+- [HuggingFace](https://huggingface.co/)
+- [Groq](https://groq.com/)
+- [MongoDB Atlas](https://www.mongodb.com/cloud/atlas)
+---
+## 📞 Contact
+**Eeshanya Amit Joshi**
+📧 [Email](mailto:[email protected])
+💼 [LinkedIn](https://www.linkedin.com/in/eeshanyajoshi/)
+---
+## 📈 Status
+### ✅ **Backend Deployed & Live!**
+- 🚀 Backend API running on [HuggingFace Spaces](https://eeshanyaj-questrag-backend.hf.space)
+- 📚 API Documentation available at [/docs](https://eeshanyaj-questrag-backend.hf.space/docs)
+- 💚 Health status: [Check here](https://eeshanyaj-questrag-backend.hf.space/health)
+### 🚧 **Frontend Deployment - Coming Soon!**
+- Will be deployed on Vercel/Netlify
+- Stay tuned for full application link! ❤️
+---
+## 🔗 Links
+- **Live Backend API:** https://eeshanyaj-questrag-backend.hf.space
+- **API Documentation:** https://eeshanyaj-questrag-backend.hf.space/docs
+- **Health Check:** https://eeshanyaj-questrag-backend.hf.space/health
+- **HuggingFace Space:** https://huggingface.co/spaces/eeshanyaj/questrag-backend
+- **Model Repository:** https://huggingface.co/eeshanyaj/questrag_models
+- **Research Paper:** [AAAI 2024 Workshop](https://arxiv.org/abs/2401.06800)
+---
+<p align="center">✨ Made with ❤️ for the Banking Industry ✨</p>
+<p align="center">Powered by HuggingFace 🤗 | Groq ⚡ | MongoDB 🍃 | Docker 🐳</p>