This repository contains a chatbot that uses the LlamaIndex library and OpenAI's GPT models for intelligent question-answering based on textual data. The chatbot ingests .txt
files, processes them into an index, and allows for interactive querying with memory support.
- Text File Ingestion: Reads .txt files from a designated folder.
- Indexing: Creates and persists a vector-based index using OpenAI embeddings.
- Memory Support: Includes memory buffers for conversational context.
- Interactive Chat: Provides a terminal-based chat interface for querying indexed documents.
- Persistence: Saves chat history and indexes for reusability across sessions.
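The ingestion and indexing features above could be sketched roughly as follows. This is a minimal sketch, not the code in chatbot.py: the `storage` directory and the helper names `list_txt_files` and `build_or_load_index` are assumptions made for illustration, and the LlamaIndex imports assume the `llama_index.core` package layout. Building the index uses OpenAI embeddings by default, so it needs a valid API key.

```python
from pathlib import Path
from typing import List

DATA_DIR = Path("data")        # folder of .txt files, as described in the setup steps
PERSIST_DIR = Path("storage")  # hypothetical location for the saved index


def list_txt_files(folder: Path) -> List[Path]:
    """Return the .txt files directly inside `folder`, sorted by path."""
    return sorted(p for p in folder.glob("*.txt") if p.is_file())


def build_or_load_index(data_dir: Path = DATA_DIR, persist_dir: Path = PERSIST_DIR):
    """Load a persisted index if one exists, otherwise build one and save it."""
    # Imported lazily so list_txt_files works without LlamaIndex installed.
    from llama_index.core import (
        SimpleDirectoryReader,
        StorageContext,
        VectorStoreIndex,
        load_index_from_storage,
    )

    if persist_dir.exists():
        # Reuse the index saved by an earlier session.
        storage_context = StorageContext.from_defaults(persist_dir=str(persist_dir))
        return load_index_from_storage(storage_context)

    # Read only .txt files, embed them, and persist the resulting index.
    documents = SimpleDirectoryReader(
        input_dir=str(data_dir), required_exts=[".txt"]
    ).load_data()
    index = VectorStoreIndex.from_documents(documents)
    index.storage_context.persist(persist_dir=str(persist_dir))
    return index
```

Checking for the persisted directory first means the embedding calls are only paid for on the first run; delete the directory to force a rebuild after changing the source files.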
- Python 3.8 or later
- OpenAI API key
- Clone the repository:
  git clone <repository-url>
  cd <repository-name>
- Set up a virtual environment:
  python -m venv chatbot
  # On Windows
  Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass
  .\chatbot\Scripts\Activate.ps1
  # On macOS/Linux
  source chatbot/bin/activate
- Install dependencies:
  python -m pip install --upgrade pip
  pip install llama-index-llms-openai
  pip install python-dotenv
- Add environment variables: Create a .env file in the project root and add your OpenAI API key:
  OPENAI_API_KEY=your_openai_api_key
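As a rough illustration of how a script might read that key, the sketch below loads the .env file with python-dotenv and fails early with a readable message if the key is missing. The helper names `load_env_file` and `get_api_key` are hypothetical and are not taken from chatbot.py.

```python
import os


def load_env_file() -> None:
    """Copy variables from a .env file into os.environ using python-dotenv."""
    # Imported lazily; provided by `pip install python-dotenv`.
    from dotenv import load_dotenv

    # By default, variables that are already set in the environment are kept.
    load_dotenv()


def get_api_key(env=os.environ) -> str:
    """Return the OpenAI API key, or raise a clear error if it is missing."""
    key = env.get("OPENAI_API_KEY", "").strip()
    # Treat the placeholder from the setup instructions as "not configured".
    if not key or key == "your_openai_api_key":
        raise RuntimeError("OPENAI_API_KEY is not set; add it to the .env file.")
    return key
```

Calling `load_env_file()` and then `get_api_key()` at startup surfaces a missing key before any request is sent.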
- Prepare the Data:
  - Place all .txt files you want to query in a folder named data within the project directory.
- Run the Script:
  python chatbot.py
- Interact:
  - Enter your questions in the terminal.
  - Type exit to terminate the session.
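The interaction described above amounts to a read-answer loop that stops on exit. A minimal sketch, assuming a hypothetical `answer` callable that wraps whatever chat engine the script builds (the "You:" and "Bot:" prompts are also assumptions):

```python
def chat_loop(answer, read=input, write=print) -> int:
    """Prompt until the user types 'exit'; return how many questions were answered."""
    count = 0
    while True:
        try:
            question = read("You: ").strip()
        except (EOFError, KeyboardInterrupt):
            break  # treat Ctrl-D / Ctrl-C like 'exit'
        if question.lower() == "exit":
            break
        if not question:
            continue  # ignore empty input instead of sending it to the model
        write(f"Bot: {answer(question)}")
        count += 1
    return count
```

With a LlamaIndex chat engine, this could be called as `chat_loop(lambda q: str(chat_engine.chat(q)))`. Passing `read` and `write` as parameters keeps the loop testable without a terminal.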
<repository-name>/
├── data/ # Directory for .txt files
├── chatbot.py # Main script
├── .env # Environment variables
└── README.md # Project documentation
- Chatbot Memory:
  - Memory is managed using a SimpleChatStore and ChatMemoryBuffer.
  - Adjust token_limit and chat_store_key in the script as needed.
- Language Model:
  - The OpenAI GPT model is set to gpt-3.5-turbo. You can change it in the script if desired.
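The configuration options above might be wired together along these lines. This is a sketch under stated assumptions rather than the script's actual code: the file name `chat_store.json`, the key `user1`, the limit of 3000 tokens, the `context` chat mode, and the helper names are all hypothetical values chosen for illustration.

```python
from pathlib import Path

CHAT_STORE_PATH = Path("chat_store.json")  # hypothetical file for saved chat history
CHAT_STORE_KEY = "user1"                   # hypothetical chat_store_key
TOKEN_LIMIT = 3000                         # hypothetical token_limit
MODEL_NAME = "gpt-3.5-turbo"               # model named in this README


def build_memory(token_limit: int = TOKEN_LIMIT,
                 chat_store_key: str = CHAT_STORE_KEY,
                 store_path: Path = CHAT_STORE_PATH):
    """Return (chat_store, memory), reloading saved history when it exists."""
    if token_limit <= 0:
        raise ValueError("token_limit must be a positive integer")

    # Imported lazily so the validation above runs without LlamaIndex installed.
    from llama_index.core.memory import ChatMemoryBuffer
    from llama_index.core.storage.chat_store import SimpleChatStore

    if store_path.exists():
        chat_store = SimpleChatStore.from_persist_path(persist_path=str(store_path))
    else:
        chat_store = SimpleChatStore()
    memory = ChatMemoryBuffer.from_defaults(
        token_limit=token_limit,
        chat_store=chat_store,
        chat_store_key=chat_store_key,
    )
    return chat_store, memory


def build_chat_engine(index, memory, model: str = MODEL_NAME):
    """Create a chat engine over `index` that answers with the given OpenAI model."""
    from llama_index.core import Settings
    from llama_index.llms.openai import OpenAI

    Settings.llm = OpenAI(model=model)  # change `model` to use a different GPT model
    return index.as_chat_engine(chat_mode="context", memory=memory)
```

To keep history across sessions, the store can be written back on shutdown with `chat_store.persist(persist_path=str(CHAT_STORE_PATH))`. A smaller `token_limit` drops older turns sooner, and a different `chat_store_key` starts a separate conversation in the same store.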
- LlamaIndex
- Python dotenv
- OpenAI GPT models
- Requires an OpenAI API key for operation.
- Designed for .txt files; other formats are not supported out of the box.
This project is licensed under the MIT License. See the LICENSE file for more details.
- Built using the LlamaIndex library and OpenAI's GPT models.