This repository contains a prototype for ColPali: Efficient Document Retrieval with Vision Language Models 👀

tankibaj/vision-rag-colpali

Vision RAG with ColPali

This repository contains prototype code for efficient document retrieval with vision language models using ColPali. ColPali is a model that retrieves documents using visual embeddings of page images. By bypassing traditional OCR pipelines, it aims to improve retrieval accuracy and reduce latency.

Features

  • Vision-Based Retrieval: Leverages visual embeddings for document retrieval.
  • ColPali Integration: Implements the ColPali architecture for efficient multi-vector embeddings.
  • End-to-End Pipeline: Demonstrates the process from document ingestion to retrieval.
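
The multi-vector embeddings mentioned above are typically compared with ColBERT-style late interaction (MaxSim): each query vector is matched to its most similar page vector, and those maxima are summed. The sketch below illustrates that scoring on toy 2-dimensional vectors. It is not taken from this repository's code; the function names (`normalize`, `maxsim_score`, `rank_pages`) and the sample data are hypothetical, and real ColPali embeddings are much higher-dimensional.

```python
from math import sqrt


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def maxsim_score(query_vecs, page_vecs):
    """Late-interaction score: for each query vector, take the best
    dot product against any page vector, then sum those maxima."""
    total = 0.0
    for q in query_vecs:
        total += max(sum(a * b for a, b in zip(q, p)) for p in page_vecs)
    return total


def rank_pages(query_vecs, pages):
    """Return (page_name, score) pairs sorted from best to worst match."""
    scores = {name: maxsim_score(query_vecs, vecs) for name, vecs in pages.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data (hypothetical): two query vectors and two pages with two patch vectors each.
query = [normalize([1.0, 0.0]), normalize([0.0, 1.0])]
pages = {
    "page_a": [normalize([1.0, 0.1]), normalize([0.1, 1.0])],
    "page_b": [normalize([1.0, 0.0]), normalize([1.0, 0.2])],
}

ranking = rank_pages(query, pages)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy example `page_a` ranks first because it has a close match for both query vectors, while `page_b` only matches one of them well. How the repository stores and scores vectors in PGVector may differ from this sketch.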

Prerequisites

  • Python 3.9 or higher
  • Docker (for the PGVector container)
  • OpenAI API key for retrieval

Quick Start

1. Start PGVector Container

Ensure Docker is installed, then run:

cd pgvector
./start

2. Install Python Dependencies

pip install -r requirements.txt

3. Document Ingestion

Ingest documents into the database:

python ingestion.py

4. Document Retrieval

Set up your OpenAI API key:

export OPENAI_API_KEY=sk-proj-xxxxxxx

Retrieve documents based on a query:

python retrieval.py
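
If the retrieval step fails with an authentication error, it can help to confirm that the key exported above is actually visible to Python. The following is a minimal sketch of such a check, not part of this repository; the helper name `get_openai_api_key` is hypothetical, and only the `OPENAI_API_KEY` variable name comes from the step above.

```python
import os


def get_openai_api_key(env=None):
    """Return the OpenAI API key from the environment, or raise if it is missing.

    `env` defaults to os.environ; passing a dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before running retrieval.py"
        )
    return key


# Example with an explicit mapping (placeholder value, not a real key).
key = get_openai_api_key({"OPENAI_API_KEY": "sk-proj-xxxxxxx"})
print("Key found, length:", len(key))
```

Calling `get_openai_api_key()` with no argument reads the real environment, so it raises a clear error when the `export` step was skipped or run in a different shell.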
