AI-powered document summarization tool that extracts key insights from PDFs, DOCX, and plain text files.
- 🌟 Features
- 🏗️ Tech Stack
- 📂 Project Structure
- 🚀 Deployment Status
- 🚀 Installation & Setup
- 🎨 Figma Design
- 🎥 How to Run the Application
- 🎥 Demo Video
- 👥 Contributors
- 📜 License
- 🏆 Acknowledgment
✅ Extract text from PDFs & DOCX files effortlessly.
✅ AI-generated summaries using state-of-the-art transformer models.
✅ Intelligent keyword extraction for enhanced insights.
✅ Multi-model support (T5, BART, Pegasus, etc.).
✅ Vector search for precise retrieval.
✅ Sleek React.js frontend for a seamless experience.
✅ Question Answering support.
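To make the vector search feature more concrete, here is a minimal sketch of similarity-based retrieval using only the Python standard library. It is not the project's code: the function names `cosine_similarity` and `top_k` and the toy three-dimensional vectors are hypothetical. The project relies on Sentence Transformers for embeddings and FAISS for the search itself.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k=2):
    """Return the indices of the k chunks most similar to the query, best first."""
    scored = [(cosine_similarity(query_vec, v), i) for i, v in enumerate(chunk_vecs)]
    # Sort by descending similarity; ties fall back to the lower index.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [i for _, i in scored[:k]]


# Toy "embeddings" standing in for document chunks (hypothetical values).
chunks = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
]
query = [0.1, 1.0, 0.0]

print(top_k(query, chunks))  # indices of the two closest chunks
```

A brute-force scan like this is fine for a handful of chunks. A dedicated index such as FAISS becomes worthwhile once a document is split into many chunks.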
- Flask - Lightweight web framework
- Transformers (Hugging Face) - Pretrained NLP models (T5, BART, Pegasus)
- PDFPlumber - Extracts text from PDFs
- docx - Parses DOCX files
- spaCy & YAKE - NLP processing & keyword extraction
- LangChain & FAISS - AI-powered document retrieval & vector similarity search
- Sentence Transformers - Embedding generation
- Gemini 1.5 Flash API - Used for Question Answering
- React.js + Vite - Blazing fast UI development
- Tailwind CSS - Modern styling framework
- Lucide Icons - Minimalist UI enhancements
Document-Summarizer/
│
├───.gitattributes
├───.gitignore
├───eslint.config.js
├───index.html
├───LICENSE
├───package-lock.json
├───package.json
├───postcss.config.js
├───tailwind.config.js
├───tsconfig.app.json
├───tsconfig.json
├───tsconfig.node.json
├───vite.config.ts
│
├───.github/
│ └───README.md
│
├───Archive/
│ ├───Document_Summarizer_first_prototype.ipynb
│ ├───Document_Summarizer_second_prototype.ipynb
│ ├───summarizer_web.ipynb
│ └───README.md
│
├───Assets/
│ ├───Contributors/
│ │ ├───Akshit Github Photo.png
│ │ ├───Ansh Github Photo.png
│ │ ├───Bindupautra Github Photo.png
│ │ └───Rana Github Photo.png
│ │
│ ├───Videos/
│ │ ├───Demo.mp4
│ │ └───To Run in Local Machine.mp4
│ │
│ └───README.md
│
├───backend/
│ ├───.env
│ ├───app.py
│ ├───Readme.md
│ ├───requirements.txt
│ └───README.md
│
├───NoteBooks/
│ ├───Summarizer.ipynb
│ └───README.md
│
└───src/
  ├───App.tsx
  ├───index.css
  ├───main.tsx
  ├───vite-env.d.ts
  └───README.md
The frontend of the application is deployed and accessible for preview. You can explore the user interface and interact with the design at the following link:
However, please note that the backend has not been deployed yet. As a result, the full functionality of the Document Summarizer, including AI-powered summarization and document processing, is not available in the live version.
To experience the complete features, you will need to set up the backend locally by following the Installation & Setup instructions.
git clone https://github.com/Jyotibrat/Document-Summarizer.git
cd Document-Summarizer
To use the Gemini 1.5 Flash API for question answering, you need to obtain an API key. Follow these steps:
- Sign Up for Gemini API:
  - Go to the Gemini API website and sign up for an account.
  - Follow the instructions to create a new API key.
- Set Up the API Key:
  - Navigate to the backend directory:
    cd backend
  - Open the .env file in a text editor.
  - Add your Gemini API key to the .env file:
    GEMINI_API_KEY='your_api_key_here'
  - Save and close the file.
Navigate to the backend directory:
cd backend
Create a virtual environment and install dependencies:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
Run the Flask server:
python app.py
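If question answering fails after the server starts, a missing or placeholder API key is a likely cause. The sketch below is a small standalone check, assuming the `.env` format shown earlier (`GEMINI_API_KEY='your_api_key_here'`). The helper names `load_env_file` and `get_gemini_key` are hypothetical. How `app.py` actually loads the key is not shown in this README, so treat this as a diagnostic aid rather than the project's own loader.

```python
import os


def load_env_file(path):
    """Parse KEY=VALUE lines from a .env-style file into a dict."""
    values = {}
    with open(path, encoding="utf-8") as fh:
        for raw in fh:
            line = raw.strip()
            # Skip blank lines, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Drop surrounding whitespace and optional single or double quotes.
            values[key.strip()] = value.strip().strip("'\"")
    return values


def get_gemini_key(env_path=".env"):
    """Return GEMINI_API_KEY from the environment or the .env file, else None."""
    key = os.environ.get("GEMINI_API_KEY")
    if not key and os.path.exists(env_path):
        key = load_env_file(env_path).get("GEMINI_API_KEY")
    # The README's placeholder value does not count as a real key.
    if not key or key == "your_api_key_here":
        return None
    return key


if __name__ == "__main__":
    if get_gemini_key():
        print("Gemini key found")
    else:
        print("Gemini key missing: check backend/.env")
```

Save it under any name inside the backend directory and run it with python from there, so that the relative `.env` path resolves.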
Navigate to the root directory and install dependencies:
npm install
Run the development server:
npm run dev
The UI/UX design for the Document Summarizer was created using Figma. You can view the design and interact with the prototype using the link below:
📌 Watch the video tutorial on setting up and running the application locally:
To.Run.in.Local.Machine.compressed.mp4
📌 Watch the live demo of the Document Summarizer in action:
Demo.compressed.mp4
This project was made possible by the contributions of these amazing individuals:
This project is licensed under the MIT License - see the LICENSE file for details.
This project is part of Neurathon '25 conducted by NIT Silchar.