A full-stack web application that allows users to upload academic PDFs and ask natural language questions about the content using AI-powered retrieval augmented generation (RAG).
```
AI-Powered Research Assistant/
├── frontend/           # HTML, CSS, JavaScript interface
├── backend/            # Flask API server
├── ml/                 # Machine learning modules
├── data/               # PDF storage and FAISS index
├── docs/               # Documentation
└── requirements.txt    # Python dependencies
```
- PDF Upload & Processing: Parse academic PDFs using PyMuPDF
- Semantic Search: Generate embeddings using OpenAI's text-embedding-ada-002
- RAG Implementation: Retrieve relevant chunks and generate answers with GPT-4
- Citation Generation: Automatic APA/MLA citation formatting
- Vector Storage: FAISS index for efficient similarity search
- Related Papers: Find similar papers using cosine similarity
- Azure OpenAI Support: Use Azure OpenAI endpoints and keys
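The related-papers feature ranks stored papers by cosine similarity between their embeddings. A minimal NumPy sketch of that ranking step (function names are illustrative, not the project's actual API):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def top_related(query_vec: np.ndarray, paper_vecs: list, k: int = 3) -> list:
    """Return indices of the k most similar papers, best first."""
    scores = [cosine_similarity(query_vec, v) for v in paper_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
```

In the app the vectors would come from the embedding model rather than being built by hand.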
- Python 3.8+
- OpenAI API key OR Azure OpenAI configuration
- pip (Python package manager)
1. Clone the repository

   ```bash
   git clone <repository-url>
   cd ai-research-assistant
   ```
2. Install dependencies

   ```bash
   pip install -r requirements.txt
   ```
3. Configure API keys

   Option A: OpenAI Direct API

   ```bash
   export OPENAI_API_KEY="your-openai-api-key-here"
   ```

   Option B: Azure OpenAI

   ```bash
   export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com/"
   export AZURE_OPENAI_API_KEY="your-azure-api-key-here"
   # Optional: specify deployment names
   export AZURE_OPENAI_DEPLOYMENT_NAME="your-gpt-4-deployment"
   export AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME="your-embedding-deployment"
   ```
4. Run the application

   ```bash
   python backend/app.py
   ```
5. Access the application

   Open your browser and navigate to `http://localhost:5000`
- Get your API key from the OpenAI Platform
- Set the environment variable: `OPENAI_API_KEY`
- Create an Azure OpenAI resource in Azure Portal
- Get your endpoint URL and API key
- Set environment variables:
  - `AZURE_OPENAI_ENDPOINT`
  - `AZURE_OPENAI_API_KEY`
  - `AZURE_OPENAI_DEPLOYMENT_NAME` (optional)
  - `AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME` (optional)
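For illustration, a minimal sketch of how a config module might choose between the two providers based on these variables (the dict layout and default deployment names are assumptions, not the project's actual `config.py`):

```python
import os

def load_llm_config() -> dict:
    """Select Azure OpenAI when its endpoint is set; otherwise fall back
    to the direct OpenAI API. Deployment names default when unset."""
    if os.environ.get("AZURE_OPENAI_ENDPOINT"):
        return {
            "provider": "azure",
            "endpoint": os.environ["AZURE_OPENAI_ENDPOINT"],
            "api_key": os.environ.get("AZURE_OPENAI_API_KEY", ""),
            "chat_deployment": os.environ.get(
                "AZURE_OPENAI_DEPLOYMENT_NAME", "gpt-4"),
            "embedding_deployment": os.environ.get(
                "AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME",
                "text-embedding-ada-002"),
        }
    return {"provider": "openai",
            "api_key": os.environ.get("OPENAI_API_KEY", "")}
```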
- `app.py`: Main Flask application
- `routes.py`: API endpoints for upload, question-answering, and citations
- `config.py`: Configuration settings
- `pdf_parser.py`: PDF parsing and chunking
- `embedding_generator.py`: OpenAI/Azure embedding generation
- `vector_search.py`: FAISS index management and similarity search
- `rag_engine.py`: Retrieval Augmented Generation implementation
- `citation_generator.py`: APA/MLA citation formatting
- `index.html`: Main application interface
- `style.css`: Styling and responsive design
- `script.js`: Frontend JavaScript functionality
- `uploads/`: Stored PDF files
- `embeddings/`: FAISS index files
- `metadata/`: Document metadata storage
- `POST /upload`: Upload and process PDF files
- `POST /ask`: Ask questions about uploaded documents
- `GET /documents`: List uploaded documents
- `GET /related/<doc_id>`: Find related papers
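A small client sketch for these endpoints using the `requests` library (the JSON field names are assumptions; check `routes.py` for the actual request schema):

```python
import requests

BASE_URL = "http://localhost:5000"

def ask(question: str) -> dict:
    """POST a natural-language question to /ask and return the JSON answer."""
    resp = requests.post(f"{BASE_URL}/ask", json={"question": question})
    resp.raise_for_status()
    return resp.json()

def related_url(doc_id: str) -> str:
    """Build the URL for GET /related/<doc_id>."""
    return f"{BASE_URL}/related/{doc_id}"

if __name__ == "__main__":
    # Upload a PDF, then query it (requires the server to be running).
    with open("paper.pdf", "rb") as f:
        requests.post(f"{BASE_URL}/upload", files={"file": f})
    print(ask("What is the main contribution of this paper?"))
```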
- Upload PDFs: Drag and drop or select academic PDF files
- Ask Questions: Type natural language questions about the content
- Get Answers: Receive AI-generated answers with citations
- Explore Related Papers: Discover similar academic papers
- Store your API keys securely
- Implement proper file validation for PDF uploads
- Consider rate limiting for API endpoints
- Add authentication for production use
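The file-validation point above can be sketched as a pure helper that checks the extension, the 50MB size limit, and the PDF magic bytes, so a renamed executable is rejected even with a `.pdf` extension (names and wiring are illustrative, not the project's actual code):

```python
from pathlib import Path

MAX_SIZE = 50 * 1024 * 1024  # 50 MB upload limit
PDF_MAGIC = b"%PDF-"         # every valid PDF starts with this header

def is_valid_pdf(filename: str, data: bytes) -> bool:
    """Accept only .pdf files within the size limit whose content
    actually begins with the PDF magic number."""
    return (
        Path(filename).suffix.lower() == ".pdf"
        and len(data) <= MAX_SIZE
        and data[:5] == PDF_MAGIC
    )
```

In a Flask route this would run on the uploaded bytes before anything is written to `data/uploads/`.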
- Supports PDFs up to 50MB
- Processes documents in chunks of 1000 tokens
- Retrieves top-5 most relevant chunks for each question
- FAISS index enables fast similarity search
- GPT-4: ~$0.03 per 1K input tokens, ~$0.06 per 1K output tokens
- text-embedding-ada-002: ~$0.0001 per 1K tokens
- Pricing varies by region and model deployment
- Check Azure pricing calculator for your specific setup
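Using the direct-API rates above, a rough per-question cost can be computed; the token counts in the example below are assumptions for illustration:

```python
GPT4_IN, GPT4_OUT = 0.03, 0.06  # USD per 1K tokens (rates listed above)
EMBED = 0.0001                  # text-embedding-ada-002, per 1K tokens

def question_cost(context_tokens: int, question_tokens: int,
                  answer_tokens: int) -> float:
    """Rough USD cost of one RAG query: embed the question, then prompt
    GPT-4 with the retrieved context and generate an answer."""
    embed = question_tokens / 1000 * EMBED
    prompt = (context_tokens + question_tokens) / 1000 * GPT4_IN
    answer = answer_tokens / 1000 * GPT4_OUT
    return embed + prompt + answer

# Five 1000-token chunks + a 50-token question + a 300-token answer:
# question_cost(5000, 50, 300) ≈ $0.17
```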
- Fork the repository
- Create a feature branch
- Make your changes
- Submit a pull request
MIT License - see LICENSE file for details