VERGE: Verification-Enhanced Generation of Multi-Hop Datasets for Evaluating Task-Specific RAG

Figure: VERGE dataset generation process

Overview

This repository contains the implementation of VERGE, a verification-enhanced methodology for generating multi-hop datasets to evaluate Retrieval-Augmented Generation (RAG) systems. VERGE addresses significant methodological gaps in existing RAG evaluation frameworks by generating task-specific, multi-hop reasoning datasets.

🌟 Key Features

VERGE: Implements a novel verification agent that ensures generated questions necessitate genuine multi-hop reasoning and maintain factual consistency
Hierarchical Error Taxonomy: Provides structured analysis of RAG system failure patterns specifically in multi-hop reasoning contexts
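
To make the "genuine multi-hop" requirement concrete, here is a minimal sketch of one possible check: a question counts as multi-hop only if no single source chunk is enough to answer it, while the full set of chunks is. This is an illustration under stated assumptions, not necessarily how the VERGE verification agent is implemented. The names `requires_multi_hop`, `can_answer`, and `toy_judge` are hypothetical; in practice `can_answer` would likely wrap an LLM call.

```python
from typing import Callable, Sequence


def requires_multi_hop(
    question: str,
    chunks: Sequence[str],
    can_answer: Callable[[str, Sequence[str]], bool],
) -> bool:
    """Return True if no single chunk suffices but all chunks together do.

    `can_answer` is a hypothetical judge (e.g. an LLM wrapper) that reports
    whether the question is answerable from the given context.
    """
    if len(chunks) < 2:
        # A single chunk can never require more than one hop.
        return False
    if any(can_answer(question, [chunk]) for chunk in chunks):
        # Answerable from one chunk alone, so it is a single-hop question.
        return False
    return can_answer(question, list(chunks))


def toy_judge(question: str, context: Sequence[str]) -> bool:
    """Toy stand-in for an LLM judge: needs both facts in the context."""
    text = " ".join(context).lower()
    return "acme" in text and "2021" in text


two_hop = ["Acme acquired Beta Corp.", "The Beta Corp deal closed in 2021."]
one_hop = ["Acme acquired Beta Corp in 2021.", "Unrelated text."]
question = "In which year did Acme's acquisition close?"

print(requires_multi_hop(question, two_hop, toy_judge))
print(requires_multi_hop(question, one_hop, toy_judge))
```

The first call passes the check because each fact lives in a different chunk; the second fails because one chunk already contains the full answer.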

Repository Structure

Chunker/: Scripts for chunking documents
Data/: Scripts for downloading the datasets
ExamProcesser: Scripts for processing generated exams
Solver/: Scripts for solving the generated exams
categorise_errors.py: Script for categorising error types
generate_exam.py: Script for generating an exam
prompt_templates.py: Prompt templates for question generation, verification, and evaluation
retriever.py: Retriever class

🚀 Quick Start

Installation

pip install -r requirements.txt

Usage

Download data

python src/Data/long_bench_downloader.py
python src/Data/download_documents_sec_filings.py

Chunk, Embed and Store the data

python src/Chunker/document_chunker.py
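
For readers unfamiliar with the chunking step, the sketch below shows a common word-based sliding-window approach with overlap. It is an assumption for illustration only: the function name `chunk_text` and the `chunk_size` and `overlap` parameters are hypothetical, and `document_chunker.py` may use a different strategy. Embedding and storage are not covered here.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into word-based chunks that overlap by `overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            # The window already reached the end of the text.
            break
    return chunks


sample = " ".join(str(i) for i in range(10))
for chunk in chunk_text(sample, chunk_size=4, overlap=2):
    print(chunk)
```

Overlap keeps sentences that straddle a chunk boundary retrievable from at least one chunk, at the cost of storing some text twice.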

Generate Multi-hop Datasets with Verification Agent

python src/generate_exam.py

Solve the exam

python src/Solver/solve_exam_rag.py

Categorise Error Patterns

python src/categorise_errors.py
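
The usage steps above run in a fixed order, so they can be chained from a small wrapper. The sketch below assumes each script runs without required arguments from the repository root, as the commands above suggest. The names `PIPELINE`, `build_commands`, and `run_pipeline` are hypothetical helpers and are not part of the repository.

```python
import subprocess
import sys

# Script paths taken from the usage steps above, in execution order.
PIPELINE = [
    "src/Data/long_bench_downloader.py",
    "src/Data/download_documents_sec_filings.py",
    "src/Chunker/document_chunker.py",
    "src/generate_exam.py",
    "src/Solver/solve_exam_rag.py",
    "src/categorise_errors.py",
]


def build_commands(python: str = sys.executable) -> list[list[str]]:
    """Return one argv list per pipeline step."""
    return [[python, script] for script in PIPELINE]


def run_pipeline(dry_run: bool = False) -> None:
    """Run every step in order, stopping at the first failure."""
    for cmd in build_commands():
        print("Running:", " ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(cmd, check=True)
```

Calling `run_pipeline(dry_run=True)` only prints the commands; `run_pipeline()` executes them and aborts on the first step that fails.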

License

MIT License

