Efficiently search and retrieve information from PDF documents using a Retrieval-Augmented Generation (RAG) approach. This project leverages DeepSeek-R1 (1.5B) for advanced language understanding, FAISS for high-speed vector search, and Hugging Face’s ecosystem for enhanced NLP capabilities. With an intuitive Streamlit interface and Ollama for mode

mohd-faizy/RAG-DeepSeek

🤖 RAG PDF Assistant: Powered by 🏆 DeepSeek-R1 (1.5B), Ollama, Streamlit & FAISS


🌟 Overview

RAG PDF Assistant is an AI-powered Retrieval-Augmented Generation (RAG) chatbot that enables intelligent PDF document search and retrieval. It combines:

  • DeepSeek-R1 (1.5B) – Language model that generates answers from the retrieved passages.
  • FAISS – Vector index for fast similarity search over document embeddings.
  • Ollama – Lightweight local model serving and inference.
  • LangChain – Framework that connects retrieval and the language model.
  • Streamlit – Interactive web-based UI.

🎯 Use Cases

🔍 Quickly search through large PDF documents.
📄 Summarize reports, research papers, contracts, and more.
📘 Extract relevant information with AI-driven accuracy.
🤖 Ask natural language questions and get concise answers.


🚀 Demo Screenshots

📌 UI Preview

Demo

📌 Application Workflow

Workflow


🛠️ Installation & Setup

🔧 Prerequisites

Ensure you have the following installed:

  • Python 3.9+
  • pip (Python package manager)
  • Git (for cloning the repository)

📥 Step 1: Clone the Repository

$ git clone https://github.com/mohd-faizy/RAG-DeepSeek.git
$ cd RAG-DeepSeek

📦 Step 2: Install Dependencies

$ pip install -r requirements.txt
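
After installing, a quick import check can confirm that the environment is ready. The following is a minimal sketch; the module names are assumptions inferred from the technologies this README lists, not read from requirements.txt, so adjust them to match the actual file.

```python
import importlib.util

# Assumed import names for the libraries this README mentions.
# The real requirements.txt may list different or additional packages.
ASSUMED_MODULES = ["streamlit", "pdfplumber", "langchain", "faiss"]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(ASSUMED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All assumed modules are importable.")
```

If anything is reported missing, re-run the pip command above inside the same Python environment you use to launch the app.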

⚙️ Step 3: Run the Application

  1. Start Ollama service:

    ollama serve
  2. In a separate terminal, launch the chat interface:

    streamlit run app/main.py

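Streamlit serves the application in your browser at http://localhost:8501/ by default. Ollama's API listens separately at http://localhost:11434/; that address belongs to the model server, not the chat interface. If the model has not been downloaded yet, pull it before the first run, for example with ollama pull deepseek-r1:1.5b (assuming that is the model tag the app requests).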
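
To check that both services are up, the sketch below probes their default ports: 8501 for Streamlit and 11434 for Ollama. It assumes neither port has been changed; the helper names `service_url` and `is_listening` are illustrative and not part of this repository.

```python
import socket

# Default ports: Streamlit serves the UI on 8501, Ollama's API listens on 11434.
DEFAULT_PORTS = {"streamlit": 8501, "ollama": 11434}


def service_url(name, host="localhost"):
    """Build the base URL for one of the two services."""
    return f"http://{host}:{DEFAULT_PORTS[name]}/"


def is_listening(name, host="localhost", timeout=0.5):
    """Return True if something accepts TCP connections on the service's port."""
    try:
        with socket.create_connection((host, DEFAULT_PORTS[name]), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name in DEFAULT_PORTS:
        state = "up" if is_listening(name) else "not reachable"
        print(f"{name:9s} {service_url(name)} {state}")
```

A port that accepts connections only shows that some process is listening there; it does not prove the process is Streamlit or Ollama.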


📁 Directory Structure

RAG-DeepSeek/
├── app/
│   ├── __init__.py          # Python package initialization
│   ├── main.py              # Main Streamlit application file
│   └── utils.py             # Utility functions for PDF processing, embeddings, retrieval
├── assets/                  # Static files like images, CSS, etc.
├── requirements.txt         # Python dependencies
├── .gitignore               # Files ignored by Git
└── README.md                # Project documentation
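
The README does not describe how utils.py prepares text for embedding. A common approach in RAG pipelines is to split the extracted text into overlapping chunks so that each embedding covers a bounded span. The sketch below illustrates that idea; `chunk_text` and its default sizes are hypothetical and may not match what utils.py actually does.

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into character windows that overlap by `overlap` characters.

    Hypothetical helper for illustration; not taken from app/utils.py.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

The overlap keeps a sentence that straddles a boundary from being cut off in both neighboring chunks, at the cost of indexing some text twice.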

🧠 How It Works

1️⃣ Upload a PDF → The app extracts the text with PDFPlumber, embeds it, and indexes the embeddings in FAISS.
2️⃣ Ask a question → The system searches the index for the most relevant passages.
3️⃣ Get an answer → DeepSeek-R1 generates a response based on the retrieved passages.

🔹 FAISS handles the vector similarity search over the document's embeddings.
🔹 DeepSeek-R1, served locally through Ollama, answers using the retrieved passages as context.
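
The retrieval step can be illustrated without any dependencies. The sketch below is a toy stand-in, not this project's code: it replaces the real embedding model with word counts and replaces FAISS with a brute-force cosine-similarity search, so that the ranking logic is visible. All function names are illustrative.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(question, passages, k=2):
    """Return the k passages most similar to the question (brute-force search)."""
    q = embed(question)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]


passages = [
    "The warranty covers parts and labor for two years.",
    "Shipping takes five business days within the country.",
    "Returns are accepted within thirty days of purchase.",
]
best = top_k("How long is the warranty?", passages, k=1)
print(best[0])
```

In the real pipeline, a dense embedding model produces the vectors and FAISS performs the nearest-neighbor search, which scales to far more passages than this linear scan.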


🔗 Technologies Used

Technology           Purpose
DeepSeek-R1 (1.5B)   Language model for intelligent responses
Ollama               Model serving and inference
FAISS                Vector search for document retrieval
LangChain            AI-driven reasoning and query handling
Streamlit            User-friendly web interface
PDFPlumber           Extracting text from PDFs
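
To show how the model-serving piece fits in, here is a minimal sketch that sends retrieved passages and a question to Ollama's REST endpoint `/api/generate`. It deliberately bypasses LangChain, which the project itself uses, and calls the HTTP API directly. The model tag `deepseek-r1:1.5b`, the prompt wording, and the helper names are assumptions for illustration.

```python
import json
import re
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default address
MODEL = "deepseek-r1:1.5b"  # assumed model tag; confirm with `ollama list`


def build_prompt(question, passages):
    """Combine retrieved passages and the question into one prompt (illustrative wording)."""
    context = "\n\n".join(passages)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


def strip_reasoning(text):
    """Drop <think>...</think> blocks that DeepSeek-R1 may emit before its answer."""
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()


def ask(question, passages, timeout=120):
    """POST a non-streaming generate request to a running Ollama server."""
    payload = {
        "model": MODEL,
        "prompt": build_prompt(question, passages),
        "stream": False,
    }
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return strip_reasoning(body["response"])
```

Calling `ask("What is the warranty period?", ["The warranty covers two years."])` requires `ollama serve` to be running with the model already pulled; without a server the call raises a connection error.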

🤝 Steps to Contribute

  1. Fork the repository
  2. Create a new feature branch (git checkout -b feature-name)
  3. Commit your changes (git commit -m "Added new feature")
  4. Push to your fork (git push origin feature-name)
  5. Open a pull request

⚖ ➤ License

This project is licensed under the MIT License. See the LICENSE file for details.

❤️ Support

If you find this repository helpful, show your support by starring it!

🔗 Connect with me

➤ If you have questions or feedback, feel free to reach out on Twitter (X).

