vishwateja231/RAG-Document-Q-A

Document Question Answering System (RAG-based)

A production-oriented Retrieval-Augmented Generation (RAG) project that lets users upload PDF documents, index document chunks in FAISS, retrieve relevant context, and generate answers with an OpenAI-compatible LLM API.

Features

  • Upload PDF documents via API or browser UI
  • Parse and chunk document text
  • Generate embeddings with Sentence Transformers
  • Persist vectors in FAISS
  • Ask natural-language questions over uploaded docs
  • Retrieve top-k relevant chunks and generate grounded answers
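The "parse and chunk" step can be pictured with a minimal sketch like the one below. It assumes fixed-size character windows with a small overlap; the function name `chunk_text` and the defaults for `chunk_size` and `overlap` are illustrative and are not taken from `app/ingestion.py`, which may chunk by tokens or sentences instead.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into character windows of chunk_size that overlap by `overlap`.

    Hypothetical helper: the sizes and the character-based strategy are assumptions.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk.strip():  # skip windows that are only whitespace
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

The overlap keeps a sentence that straddles a window boundary visible in both neighbouring chunks, at the cost of indexing some text twice.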

Architecture

+----------------------+        +--------------------------+
|      Frontend UI     | <----> |      FastAPI Backend     |
|  (HTML + JavaScript) |        |  /upload /ask /documents |
+----------+-----------+        +-----------+--------------+
           |                                |
           |                                v
           |                      +---------+----------+
           |                      |    Ingestion       |
           |                      |  PyPDF + Chunking  |
           |                      +---------+----------+
           |                                |
           |                                v
           |                      +---------+----------+
           |                      | Embedding Model    |
           |                      | SentenceTransform. |
           |                      +---------+----------+
           |                                |
           |                                v
           |                      +---------+----------+
           |                      |  Vector Store      |
           |                      |   FAISS + Metadata |
           |                      +---------+----------+
           |                                |
           |                                v
           |                      +---------+----------+
           +--------------------> |  LLM Generation    |
                                  | OpenAI-compatible  |
                                  +--------------------+
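The retrieval and generation stages in the diagram can be sketched without any dependencies. The example below is a pure-Python stand-in, not the project's code: it ranks chunk vectors by squared L2 distance (the metric a flat FAISS index such as `IndexFlatL2` reports) and assembles a prompt with numbered context entries. Whether the project uses L2 or inner-product search is not stated here, and `top_k` and `build_prompt` are hypothetical names.

```python
import heapq


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top_k(query_vec, chunk_vecs, k=3):
    """Return (chunk_index, distance) pairs for the k closest chunks, nearest first."""
    scored = ((squared_l2(query_vec, vec), i) for i, vec in enumerate(chunk_vecs))
    return [(i, dist) for dist, i in heapq.nsmallest(k, scored)]


def build_prompt(question, chunks):
    """Number the retrieved chunks so the model can cite them as [1], [2], ..."""
    context = "\n".join(f"[{n}] {text}" for n, text in enumerate(chunks, start=1))
    return (
        "Answer using only the context below and cite sources as [n].\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

In the real pipeline the vectors would come from the Sentence Transformers model and the search from FAISS; the brute-force loop only shows what "top-k" means.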

Project Structure

RAG-Document-Q-A/
│
├── app/
│   ├── main.py
│   ├── rag_pipeline.py
│   ├── ingestion.py
│   └── config.py
│
├── data/
│   ├── documents/
│   └── vectorstore/
│
├── static/
│   └── index.html
├── requirements.txt
├── README.md
└── run.sh

Setup and Run

  1. Navigate to the project folder:
     cd RAG-Document-Q-A
  2. Create a virtual environment and install dependencies:
     python3 -m venv .venv
     source .venv/bin/activate
     pip install -r requirements.txt
  3. Set your OpenAI-compatible API key and, optionally, the endpoint and model:
     export OPENAI_API_KEY="your_api_key"
     export LLM_BASE_URL="https://api.openai.com/v1"
     export LLM_MODEL="gpt-4o-mini"
  4. Start the server:
     uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload

Or simply:

./run.sh
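For orientation, here is one way the three environment variables above could be read at startup. This is a sketch, not the contents of `app/config.py`: the `Settings` class and `load_settings` function are hypothetical, and treating the example endpoint and model values as fallbacks is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    api_key: str
    base_url: str
    model: str


def load_settings(env=None) -> Settings:
    """Read LLM settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY", "")
    if not api_key:
        # Fail early rather than at the first /ask request.
        raise RuntimeError("OPENAI_API_KEY is not set")
    return Settings(
        api_key=api_key,
        # Assumed fallbacks, copied from the example values in the setup steps.
        base_url=env.get("LLM_BASE_URL", "https://api.openai.com/v1").rstrip("/"),
        model=env.get("LLM_MODEL", "gpt-4o-mini"),
    )
```

Passing a plain dict as `env` makes the function easy to exercise without touching the real environment.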

API Endpoints

  • POST /upload — Upload and index a PDF document
  • POST /ask — Ask a question over indexed documents
  • GET /documents — List indexed documents
  • GET /health — Health check
  • GET / — Browser UI

Sample cURL Requests

Upload a PDF:

curl -X POST "http://localhost:8000/upload" \
  -H "accept: application/json" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@/absolute/path/to/document.pdf"

Ask a question:

curl -X POST "http://localhost:8000/ask" \
  -H "Content-Type: application/json" \
  -d '{"question": "What are the key points in the document?"}'

List documents:

curl "http://localhost:8000/documents"
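The /ask call can also be made from Python using only the standard library. The sketch below mirrors the cURL request above; `build_ask_request` and `ask` are hypothetical helper names, and the base URL assumes the server is running locally on port 8000 as in the setup steps.

```python
import json
import urllib.request


def build_ask_request(question, base_url="http://localhost:8000"):
    """Build the POST /ask request without sending it."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question, base_url="http://localhost:8000", timeout=60):
    """Send the question and return the decoded JSON response."""
    request = build_ask_request(question, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the server running, `ask("What are the key points in the document?")` should return a dict shaped like the example response.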

Example Response (Ask)

{
  "answer": "The report states that revenue increased 24% year-over-year [1].",
  "sources": [
    {
      "document": "report.pdf",
      "text": "...",
      "score": 0.1182
    }
  ]
}
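A client might turn this response into readable output as sketched below. The field names come from the example above; the helper name `format_answer` is hypothetical, and the sketch assumes that a citation such as [1] refers to the first entry of `sources`, which the README does not state.

```python
def format_answer(payload: dict) -> str:
    """Render the answer followed by a numbered list of its sources."""
    lines = [payload["answer"], "", "Sources:"]
    # Assumption: citation [n] in the answer maps to the n-th entry in "sources".
    for n, source in enumerate(payload.get("sources", []), start=1):
        lines.append(f"[{n}] {source['document']} (score={source['score']:.4f})")
    return "\n".join(lines)
```

How to read `score` depends on the index type: with an L2 index lower means closer, with an inner-product index higher means closer.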

Notes

  • Only PDF files are accepted for ingestion.
  • FAISS index and metadata are persisted under data/vectorstore/.
  • Uploaded PDFs are stored under data/documents/.
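The first and third notes suggest two checks at upload time. The sketch below is an assumption about how they could be done, not the project's implementation: `looks_like_pdf` checks the extension and the `%PDF-` header that PDF files start with, and `safe_document_path` strips directory components so a file always lands under data/documents/. Both names are hypothetical.

```python
from pathlib import Path


def looks_like_pdf(filename: str, head: bytes) -> bool:
    """Check the .pdf extension and the %PDF- magic bytes at the start of the file."""
    return Path(filename).suffix.lower() == ".pdf" and head.startswith(b"%PDF-")


def safe_document_path(filename: str, root: str = "data/documents") -> Path:
    """Build the storage path, keeping only the final component of the client filename."""
    # Path(...).name drops any directories, so "../../x.pdf" becomes "x.pdf".
    return Path(root) / Path(filename).name
```

Checking the header as well as the extension catches renamed non-PDF files before they reach the parser.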
