Commit
source code: Add Multimodal RAG with Elasticsearch Gotham City tutorial (#390)

Signed-off-by: Adrian Cole <adrian.cole@elastic.co>
Co-authored-by: Adrian Cole <adrian.cole@elastic.co>
Co-authored-by: Jess Garson <jess.garson@elastic.co>
1 parent 24c2e81, commit 21cfcc6
Showing 29 changed files with 1,377 additions and 0 deletions.
66 changes: 66 additions & 0 deletions
...orting-blog-content/building-multimodal-rag-with-elasticsearch-gotham/README.md
@@ -0,0 +1,66 @@
# Building a Multimodal RAG Pipeline with Elasticsearch: The Story of Gotham City

This repository contains the code for implementing a Multimodal Retrieval-Augmented Generation (RAG) system using Elasticsearch. The system processes and analyzes different types of evidence (images, audio, text, and depth maps) to solve a crime in Gotham City.

## Overview

The pipeline demonstrates how to:
- Generate unified embeddings for multiple modalities using ImageBind
- Store and search vectors efficiently in Elasticsearch
- Analyze evidence using GPT-4 to generate forensic reports

## Prerequisites

- Python 3.x
- Elasticsearch cluster (cloud or local)
- OpenAI API key - Set up an OpenAI account and create a [secret key](https://platform.openai.com/docs/quickstart)
- 8GB+ RAM
- GPU (optional but recommended)

## Code execution

We provide a Google Colab notebook that allows you to explore the entire pipeline interactively:
- [Open the Multimodal RAG Pipeline Notebook](notebook/01-mmrag-blog-quick-start.ipynb)
- This notebook includes step-by-step instructions and explanations for each stage of the pipeline

## Project Structure

```
├── README.md
├── requirements.txt
├── notebook/
│   └── 01-mmrag-blog-quick-start.ipynb  # Jupyter notebook execution
├── src/
│   ├── embedding_generator.py           # ImageBind wrapper
│   ├── elastic_manager.py               # Elasticsearch operations
│   └── llm_analyzer.py                  # GPT-4 integration
├── stages/
│   ├── 01-stage/                        # File organization
│   ├── 02-stage/                        # Embedding generation
│   ├── 03-stage/                        # Elasticsearch indexing/search
│   └── 04-stage/                        # Evidence analysis
└── data/                                # Sample data
    ├── images/
    ├── audios/
    ├── texts/
    └── depths/
```

## Sample Data

The repository includes sample evidence files:
- Images: Crime scene photos and security camera footage
- Audio: Suspicious sound recordings
- Text: Mysterious notes and riddles
- Depth Maps: 3D scene captures

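The sample evidence is grouped by modality under `data/`. As a rough sketch (not the repository's own Stage 1 code), the files could be enumerated per modality as shown here. The helper name `collect_evidence` and the `MODALITIES` tuple are hypothetical; only the subdirectory names come from the project structure.

```python
from pathlib import Path

# Subdirectory names come from the README's project structure.
MODALITIES = ("images", "audios", "texts", "depths")


def collect_evidence(data_dir):
    """Return {modality: sorted list of file paths} for the known subdirectories.

    A missing subdirectory maps to an empty list instead of raising.
    (Hypothetical helper, for illustration only.)
    """
    root = Path(data_dir)
    evidence = {}
    for modality in MODALITIES:
        folder = root / modality
        if folder.is_dir():
            # Keep regular files only; sort for a stable processing order.
            evidence[modality] = sorted(p for p in folder.iterdir() if p.is_file())
        else:
            evidence[modality] = []
    return evidence
```

Grouping by directory rather than by file extension is a deliberate choice in this sketch: depth maps and ordinary images can share an extension such as `.png`, so the extension alone would not identify the modality.
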
## How It Works

1. **Evidence Collection**: Files are organized by modality in the `data/` directory
2. **Embedding Generation**: ImageBind converts each piece of evidence into a 1024-dimensional vector
3. **Vector Storage**: Elasticsearch stores embeddings with metadata for efficient retrieval
4. **Similarity Search**: New evidence is compared against the database using k-NN search
5. **Analysis**: GPT-4 analyzes the connections between evidence to identify suspects
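
Steps 3 and 4 can be sketched as an index mapping plus a k-NN search body. This is a minimal illustration under stated assumptions, not the contents of `src/elastic_manager.py`: the field names (`embedding`, `modality`, `file_path`, `description`), the cosine similarity setting, and the default `k` and `num_candidates` values are all assumed here. Only the 1024-dimensional vector size comes from the README.

```python
EMBEDDING_DIMS = 1024  # vector size stated in the README


def build_index_mapping(dims=EMBEDDING_DIMS):
    """Mapping with one dense_vector field plus metadata (field names are assumptions)."""
    return {
        "properties": {
            "embedding": {
                "type": "dense_vector",
                "dims": dims,
                "index": True,           # enable approximate k-NN on this field
                "similarity": "cosine",  # assumed metric, not confirmed by the source
            },
            "modality": {"type": "keyword"},
            "file_path": {"type": "keyword"},
            "description": {"type": "text"},
        }
    }


def build_knn_query(query_vector, k=5, num_candidates=50, dims=EMBEDDING_DIMS):
    """Build the `knn` section of a search request for a new piece of evidence."""
    if len(query_vector) != dims:
        raise ValueError(f"expected {dims} dimensions, got {len(query_vector)}")
    if num_candidates < k:
        raise ValueError("num_candidates must be at least k")
    return {
        "field": "embedding",
        "query_vector": list(query_vector),
        "k": k,
        "num_candidates": num_candidates,
    }
```

With the 8.x Python client, these dictionaries would typically be passed as `es.indices.create(index="gotham-evidence", mappings=build_index_mapping())` and `es.search(index="gotham-evidence", knn=build_knn_query(vector))`. The index name `gotham-evidence` is made up for illustration.
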
Binary file added: ...log-content/building-multimodal-rag-with-elasticsearch-gotham/data/audios/joker_laugh.wav (BIN, +440 KB)
Binary file added: ...building-multimodal-rag-with-elasticsearch-gotham/data/depths/depth_suspect.png (BIN, +57.5 KB)
Binary file added: ...uilding-multimodal-rag-with-elasticsearch-gotham/data/depths/jdancing-depth.png (BIN, +126 KB)
Binary file added: .../building-multimodal-rag-with-elasticsearch-gotham/data/images/crime_scene1.jpg (BIN, +201 KB)
Binary file added: .../building-multimodal-rag-with-elasticsearch-gotham/data/images/crime_scene2.jpg (BIN, +134 KB)
Binary file added: ...tent/building-multimodal-rag-with-elasticsearch-gotham/data/images/jdancing.png (BIN, +2.41 MB)
Binary file added: ...t/building-multimodal-rag-with-elasticsearch-gotham/data/images/joker_alley.jpg (BIN, +95.4 KB)
Binary file added: ...uilding-multimodal-rag-with-elasticsearch-gotham/data/images/joker_laughing.png (BIN, +1.98 MB)
Binary file added: ...building-multimodal-rag-with-elasticsearch-gotham/data/images/playing-cards.png (BIN, +2.3 MB)
Binary file added: ...building-multimodal-rag-with-elasticsearch-gotham/data/images/suspect_depth.jpg (BIN, +69.9 KB)
8 changes: 8 additions & 0 deletions
...rting-blog-content/building-multimodal-rag-with-elasticsearch-gotham/data/texts/note2.txt
@@ -0,0 +1,8 @@
Why so serious?

The show has just begun and you're already running
While clowns are dancing and the city's stunning
In the abandoned theater, a surprise awaits
Come play with me before it's too late!

HAHAHAHAHA!
15 changes: 15 additions & 0 deletions
...og-content/building-multimodal-rag-with-elasticsearch-gotham/data/texts/police_report.txt
@@ -0,0 +1,15 @@
PRELIMINARY REPORT - GCPD
Date: 01/28/2025
Time: 22:30

Incident: Break-in and Vandalism
Location: Gotham Central Bank
Evidence Found:
- Playing cards scattered
- Smile graffiti on walls
- Suspicious audio recording
- Witnesses report maniacal laughter

Status: Under Investigation
Priority Level: MAXIMUM
Primary Suspect: Unknown (possible Joker involvement)
16 changes: 16 additions & 0 deletions
...ting-blog-content/building-multimodal-rag-with-elasticsearch-gotham/data/texts/riddle.txt
@@ -0,0 +1,16 @@
HAHAHA!

Dear Detective,

In a city of endless night, a new game unfolds
Where chaos reigns and fear takes hold
I left a gift at Gotham Central Bank
Time's ticking, your mind goes blank

The clues are there, scattered with care
Each laugh echoes everywhere
Midnight strikes, you won't catch me
In Gotham's heart, chaos runs free!

With a smile,
?
5 changes: 5 additions & 0 deletions
...ing-blog-content/building-multimodal-rag-with-elasticsearch-gotham/data/texts/threats.txt
@@ -0,0 +1,5 @@
Incident Log:
1. Gotham Central Bank - 22:15 - Alarm triggered
2. Monarch Theater - 22:45 - Suspicious laughter reported
3. Abandoned Amusement Park - 23:00 - Strange lights
4. Ace Chemical Plant - 23:30 - Suspicious movement
17 changes: 17 additions & 0 deletions
supporting-blog-content/building-multimodal-rag-with-elasticsearch-gotham/env.example
@@ -0,0 +1,17 @@
# Make a copy of this file with the name .env and assign values to variables

# How you connect to Elasticsearch: change details to your instance
ELASTICSEARCH_URL=
ELASTICSEARCH_API_KEY=
# If not using API key, uncomment these and fill them in:
# ELASTICSEARCH_USER=elastic
# ELASTICSEARCH_PASSWORD=elastic

# OpenAI Configuration
OPENAI_API_KEY=

# Model Configuration

# Optional Configuration
# LOG_LEVEL=INFO
# DEBUG=False
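
The variables in `env.example` describe two ways to authenticate: an API key, or a user and password as a fallback. A minimal sketch of how a script might turn them into client arguments follows. The function name `elasticsearch_client_kwargs` is hypothetical, and the sketch assumes the values are already in the process environment (for example exported by the shell or loaded from `.env` by a dotenv-style tool); the repository's actual loading code is not shown in this diff.

```python
import os


def elasticsearch_client_kwargs(env=None):
    """Build connection keyword arguments from the variables named in env.example.

    Prefers ELASTICSEARCH_API_KEY; falls back to user/password when no key is set.
    (Hypothetical helper, for illustration only.)
    """
    env = os.environ if env is None else env
    url = env.get("ELASTICSEARCH_URL", "").strip()
    if not url:
        raise ValueError("ELASTICSEARCH_URL is not set")

    kwargs = {"hosts": [url]}
    api_key = env.get("ELASTICSEARCH_API_KEY", "").strip()
    user = env.get("ELASTICSEARCH_USER", "").strip()
    password = env.get("ELASTICSEARCH_PASSWORD", "").strip()

    if api_key:
        kwargs["api_key"] = api_key
    elif user and password:
        kwargs["basic_auth"] = (user, password)
    else:
        raise ValueError("set ELASTICSEARCH_API_KEY or ELASTICSEARCH_USER/ELASTICSEARCH_PASSWORD")
    return kwargs
```

The returned dictionary uses the `hosts`, `api_key`, and `basic_auth` keyword names accepted by the 8.x Python client, so it could be unpacked as `Elasticsearch(**elasticsearch_client_kwargs())`.
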