This project evaluates the retrieval quality of your RAG system using Azure AI Search, with Azure OpenAI acting as a judge. It uses the `RetrievalEvaluator` from the `azure-ai-evaluation` SDK to qualitatively assess whether the retrieved documents are relevant to your test queries.
- Python 3.9+
- An Azure OpenAI resource with a model deployed (e.g., GPT-4).
- An Azure AI Search resource with an index containing your documents.
- Install Dependencies:

  ```bash
  pip install -r requirements.txt
  ```
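  The repo's `requirements.txt` is the source of truth; as a rough sketch, it presumably lists at least the packages this README relies on (`python-dotenv` is an assumption based on the `.env` workflow):

  ```text
  azure-ai-evaluation
  azure-search-documents
  azure-identity
  python-dotenv
  ```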
- Configure Environment:

  Rename or copy `.env` to `.env.local` (or just edit `.env`) and fill in your details:

  ```
  AZURE_OPENAI_ENDPOINT=https://<your-resource>.openai.azure.com/
  AZURE_OPENAI_DEPLOYMENT=gpt-4
  AZURE_SEARCH_ENDPOINT=https://<your-service>.search.windows.net
  AZURE_SEARCH_INDEX=<your-index-name>
  ```
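  If the script loads these values with python-dotenv (an assumption; it may read the process environment directly), the loading step would look roughly like:

  ```python
  import os

  from dotenv import load_dotenv  # assumes python-dotenv is installed

  load_dotenv(".env.local")  # returns False (without raising) if the file is absent
  endpoint = os.environ["AZURE_OPENAI_ENDPOINT"]
  deployment = os.environ["AZURE_OPENAI_DEPLOYMENT"]
  ```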
- Authentication:
  - Note: Your Azure OpenAI resource appears to have key-based authentication disabled.
  - The script is therefore configured to authenticate via Azure RBAC using `DefaultAzureCredential`.
  - Ensure you are logged in:

    ```bash
    az login
    ```

  - Ensure your user has the "Cognitive Services OpenAI Contributor" (or "Cognitive Services OpenAI User") role on the OpenAI resource.
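  To sanity-check that RBAC auth works before running the evaluation, you can request a token manually (a quick diagnostic, not part of the script):

  ```python
  from azure.identity import DefaultAzureCredential

  # Raises if no usable credential is found (e.g., you are not logged in via `az login`).
  credential = DefaultAzureCredential()
  token = credential.get_token("https://cognitiveservices.azure.com/.default")
  print("Token acquired; expires at:", token.expires_on)
  ```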
- Add Test Queries:

  Open `evaluate_retrieval.py` and modify the `test_queries` list with questions relevant to your document set.
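  For example (placeholder questions; substitute ones your index can actually answer):

  ```python
  test_queries = [
      "What is the refund policy?",
      "How do I configure single sign-on?",
      "Which regions is the service available in?",
  ]
  ```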
- Run Evaluation:

  ```bash
  python evaluate_retrieval.py
  ```
- Sanitization: The script includes logic to strip image file references (e.g., `.png`, `![image]`) from the retrieved text. This prevents the evaluator from attempting to resolve local image paths, which can cause errors.
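  The script's exact implementation may differ; the idea is roughly:

  ```python
  import re

  def sanitize(text: str) -> str:
      """Remove image references so the evaluator never sees local image paths."""
      # Drop markdown image tags such as ![image](figures/chart.png).
      text = re.sub(r"!\[[^\]]*\]\([^)]*\)", "", text)
      # Drop bare file names ending in common image extensions.
      text = re.sub(r"\S+\.(?:png|jpe?g|gif|svg)\b", "", text, flags=re.IGNORECASE)
      return text
  ```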
- Retrieval: The script queries your Azure AI Search index for the top 3 documents matching each query.
- Formatting: It combines the content of these documents into a context block.
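  A sketch of the retrieval and formatting steps together, assuming the index stores document text in a `content` field (the field name is an assumption):

  ```python
  import os

  from azure.identity import DefaultAzureCredential
  from azure.search.documents import SearchClient

  search_client = SearchClient(
      endpoint=os.environ["AZURE_SEARCH_ENDPOINT"],
      index_name=os.environ["AZURE_SEARCH_INDEX"],
      credential=DefaultAzureCredential(),
  )

  def retrieve_context(query: str) -> str:
      # Top 3 matches per query, combined into one context block.
      results = search_client.search(search_text=query, top=3)
      docs = [sanitize(result["content"]) for result in results]  # sanitize() from the sketch above
      return "\n\n".join(docs)
  ```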
- Judging: It sends the `query` and `retrieved_context` to the `RetrievalEvaluator` (powered by GPT-4). The model judges whether the context provides sufficient information to answer the query, returning a score from 1 (Irrelevant) to 5 (Highly Relevant).
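  A minimal sketch of the judging call (omitting `api_key` so the SDK falls back to Azure AD credentials, matching the RBAC setup above; the exact keys in the result dict can vary by SDK version):

  ```python
  import os

  from azure.ai.evaluation import RetrievalEvaluator

  model_config = {
      "azure_endpoint": os.environ["AZURE_OPENAI_ENDPOINT"],
      "azure_deployment": os.environ["AZURE_OPENAI_DEPLOYMENT"],
      # No api_key: key-based auth is disabled on this resource.
  }

  evaluator = RetrievalEvaluator(model_config)

  query = "What is the refund policy?"         # one of the test queries
  retrieved_context = retrieve_context(query)  # from the sketch above
  result = evaluator(query=query, context=retrieved_context)
  print(result)  # a dict containing the 1-5 retrieval score
  ```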