Build and Deploy a Generative AI solution using a RAG framework
⚠️ Warning: Make sure to replace all instances of PROJECT_ID = "qwiklabs-gcp-00-e4970b8c386a" with your own Google Cloud project ID. Failing to do so will result in errors or unintended behavior when accessing Google Cloud resources.
This guide provides step-by-step instructions on how to ingest documents into a vector database, create embeddings, and set up a Flask API for interacting with the data. This process involves integrating various Google Cloud services, such as Firestore, Vertex AI, and Cloud Run.
import vertexai
from vertexai.language_models import TextEmbeddingModel
from vertexai.generative_models import GenerativeModel
import pickle
from IPython.display import display, Markdown
from langchain_google_vertexai import VertexAIEmbeddings
from langchain_community.document_loaders import PyMuPDFLoader
from langchain_experimental.text_splitter import SemanticChunker
from google.cloud import firestore
from google.cloud.firestore_v1.vector import Vector
from google.cloud.firestore_v1.base_vector_query import DistanceMeasure
This code imports the necessary modules and libraries. It includes:
- vertexai: For utilizing Google Cloud's Vertex AI functionalities.
- pickle: For loading and saving serialized data.
- IPython.display: For displaying Markdown content in notebooks.
- Firestore: To create and manage a vector database using Firestore.
- LangChain: For integrating with the Vertex AI Embeddings model.
PROJECT_ID = "qwiklabs-gcp-00-e4970b8c386a"
LOCATION = "us-central1"
vertexai.init(project=PROJECT_ID, location=LOCATION)
embedding_model = VertexAIEmbeddings(model_name="text-embedding-004")
- Initializes Vertex AI using your project ID and location.
- Sets up the text-embedding-004 embedding model for generating embeddings.
!gcloud storage cp gs://partner-genai-bucket/genai069/nyc_food_safety_manual.pdf .
loader = PyMuPDFLoader("./nyc_food_safety_manual.pdf")
data = loader.load()
def clean_page(page):
    return page.page_content.replace("-\n","")\
                            .replace("\n"," ")\
                            .replace("fo d P R O T E C T I O N T R A I N I N G M A N U A L","")\
                            .replace("N E W Y O R K C I T Y D E P A R T M E N T O F H E A L T H & M E N T A L H Y G I E N E","")
- Downloads the PDF document from a Google Cloud Storage bucket.
- Uses PyMuPDF to load the document and defines a clean_page function that rejoins words hyphenated across line breaks, flattens newlines into spaces, and strips the repeated page header text from each page.
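To see what the cleaning step does to a single page, here is a minimal sketch that runs the same two whitespace replacements on a made-up string. `FakePage` is a hypothetical stand-in for the page objects the loader returns (only the `page_content` attribute is assumed), and the header-stripping replacements are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class FakePage:
    """Hypothetical stand-in for a loaded page; only page_content is used."""
    page_content: str


def clean_page(page):
    # Rejoin words split by a hyphen at a line break first,
    # then turn the remaining newlines into spaces.
    return page.page_content.replace("-\n", "").replace("\n", " ")


sample = FakePage("Keep hot foods at a safe tempera-\nture.\nWash hands often.")
cleaned = clean_page(sample)
print(cleaned)
```

The order of the replacements matters: if newlines were replaced first, the hyphenated word would come out as "tempera- ture" instead of being rejoined.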
cleaned_pages = []
for page in data:
    cleaned_pages.append(clean_page(page))
text_splitter = SemanticChunker(embedding_model)
# Only the first four cleaned pages are chunked and embedded here.
docs = text_splitter.create_documents(cleaned_pages[0:4])
chunked_content = [doc.page_content for doc in docs]
chunked_embeddings = embedding_model.embed_documents(chunked_content)
- Cleans the document pages and uses the SemanticChunker to split the text into meaningful chunks.
- Generates embeddings for each chunk using the Vertex AI Embedding model.
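The idea behind semantic chunking can be illustrated without any cloud calls. The sketch below is a simplified stand-in, not the `SemanticChunker` implementation: it starts a new chunk whenever the cosine distance between neighbouring sentence vectors exceeds a fixed threshold. The function names, the threshold, and the hand-made two-dimensional vectors are all assumptions for illustration; a real pipeline would obtain the vectors from the embedding model, and the library chooses its breakpoints with its own strategy.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def chunk_by_similarity(sentences, vectors, threshold):
    """Group consecutive sentences, breaking where neighbours are far apart."""
    if not sentences:
        return []
    chunks = [[sentences[0]]]
    for i in range(1, len(sentences)):
        if cosine_distance(vectors[i - 1], vectors[i]) > threshold:
            chunks.append([])  # topic shift: start a new chunk
        chunks[-1].append(sentences[i])
    return [" ".join(chunk) for chunk in chunks]


# Toy sentences and hand-made vectors (hypothetical values).
sentences = [
    "Wash hands before handling food.",
    "Use soap and warm water.",
    "Store raw meat on the lowest shelf.",
    "Keep the refrigerator door closed.",
]
vectors = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]

chunks = chunk_by_similarity(sentences, vectors, threshold=0.5)
for chunk in chunks:
    print(chunk)
```

With these toy vectors the first two sentences land in one chunk and the last two in another, because the only large distance is between the second and third sentence.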
db = firestore.Client(project=PROJECT_ID)
collection = db.collection('food-safety')
for i, (content, embedding) in enumerate(zip(chunked_content, chunked_embeddings)):
    doc_ref = collection.document(f"doc_{i}")
    # Store the chunk text next to its embedding, wrapped as a Firestore Vector.
    doc_ref.set({
        "content": content,
        "embedding": Vector(embedding)
    })
- Creates a Firestore client and sets up a collection named food-safety.
- Iterates through the chunked content and embeddings, adding them as documents in Firestore.
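The shape of the stored documents can be checked locally before anything is written. The following sketch builds the same `doc_{i}` IDs and `content`/`embedding` fields in a plain dictionary; `build_documents` is a hypothetical helper, and the plain list stands in for the Firestore `Vector` wrapper.

```python
def build_documents(chunks, embeddings):
    """Return {doc_id: fields} in the shape the ingestion loop writes."""
    # zip() silently drops unmatched items, so check the lengths explicitly.
    if len(chunks) != len(embeddings):
        raise ValueError("each chunk needs exactly one embedding")
    return {
        f"doc_{i}": {"content": content, "embedding": list(embedding)}
        for i, (content, embedding) in enumerate(zip(chunks, embeddings))
    }


documents = build_documents(
    ["first chunk", "second chunk"],
    [[0.1, 0.2], [0.3, 0.4]],
)
print(documents["doc_1"])
```

The explicit length check is a design choice of this sketch: without it, a mismatch between chunks and embeddings would go unnoticed and some chunks would simply not be stored.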
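def search_vector_database(query: str):
    query_embedding = embedding_model.embed_query(query)
    # Nearest-neighbor search over the "embedding" field written during ingestion.
    vector_query = collection.find_nearest(
        vector_field="embedding",
        query_vector=Vector(query_embedding),
        distance_measure=DistanceMeasure.EUCLIDEAN,
        limit=5,
    )
    docs = vector_query.stream()
    context = [result.to_dict()['content'] for result in docs]
    return context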
- This function searches the vector database for the closest matching documents based on the query.
- Retrieves the top 5 documents using the Euclidean distance measure and returns the matching content.
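Conceptually, a Euclidean nearest-neighbour query ranks stored vectors by straight-line distance to the query vector and keeps the closest few. The brute-force sketch below shows that ranking on in-memory data; `nearest_documents` and the toy documents are hypothetical, and a managed vector index would not necessarily scan every document this way.

```python
import math


def nearest_documents(query_vector, documents, limit=5):
    """Return the content of the `limit` documents closest to the query."""
    ranked = sorted(
        documents,
        key=lambda doc: math.dist(query_vector, doc["embedding"]),
    )
    return [doc["content"] for doc in ranked[:limit]]


toy_documents = [
    {"content": "A", "embedding": [0.0, 0.0]},
    {"content": "B", "embedding": [1.0, 1.0]},
    {"content": "C", "embedding": [5.0, 5.0]},
]

top_two = nearest_documents([0.9, 0.9], toy_documents, limit=2)
print(top_two)
```

Smaller distances mean closer matches, so the results come back ordered from most to least similar.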
import os
import json
import logging
from flask import Flask, render_template, request
from google.cloud import firestore
from google.cloud.firestore_v1.vector import Vector
from google.cloud.firestore_v1.base_vector_query import DistanceMeasure
import vertexai
from vertexai.generative_models import GenerativeModel, GenerationConfig
from langchain_google_vertexai import VertexAIEmbeddings
- Imports the necessary libraries for setting up a Flask API.
- Sets up Google Cloud logging and Firestore connection.
def ask_gemini(question):
    prompt_template = "Using the context provided below, answer the following question:\nContext: {context}\nQuestion: {question}\nAnswer:"
    context = search_vector_database(question)
    formatted_prompt = prompt_template.format(context=context, question=question)
    # The original GenerationConfig arguments are not shown; requesting JSON
    # output is an assumption based on the json.loads call below.
    generation_config = GenerationConfig(
        response_mime_type="application/json",
    )
    contents = [{"role": "user", "parts": [{"text": formatted_prompt}]}]
    # gen_model is assumed to be a GenerativeModel instance created elsewhere.
    response = gen_model.generate_content(
        contents,
        generation_config=generation_config,
    )
    response_text = response.text if response else "{}"
    try:
        response_json = json.loads(response_text)
        answer = response_json.get("answer", "No answer found.")
    except json.JSONDecodeError:
        answer = "Error: Unable to parse response."
    return answer
- The ask_gemini function uses the Gemini model to answer user queries with the context retrieved from the vector database.
- It formats a prompt with the context and question, sends it to the generative model, and returns the model's response.
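The prompt formatting and response parsing can be exercised without calling the model. The sketch below reuses the guide's prompt template and fallback messages; `build_prompt` and `parse_answer` are hypothetical helper names, and joining the retrieved chunks with newlines is a choice made here (formatting a Python list directly would embed its bracketed representation in the prompt).

```python
import json

PROMPT_TEMPLATE = (
    "Using the context provided below, answer the following question:\n"
    "Context: {context}\nQuestion: {question}\nAnswer:"
)


def build_prompt(context_chunks, question):
    # Join the chunks so the prompt contains plain text, not a list repr.
    return PROMPT_TEMPLATE.format(
        context="\n".join(context_chunks), question=question
    )


def parse_answer(response_text):
    """Extract the "answer" field, falling back to the guide's messages."""
    try:
        response_json = json.loads(response_text or "{}")
    except json.JSONDecodeError:
        return "Error: Unable to parse response."
    if not isinstance(response_json, dict):
        return "No answer found."
    return response_json.get("answer", "No answer found.")


prompt = build_prompt(["chunk one", "chunk two"], "What should I do?")
print(prompt)
print(parse_answer('{"answer": "Wash your hands."}'))
print(parse_answer("not json"))
```

Keeping the parsing in its own function makes the three outcomes (a parsed answer, a missing key, and malformed JSON) easy to test separately from the model call.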
docker build -t cymbal-docker-image -f Dockerfile .
docker tag cymbal-docker-image
gcloud auth configure-docker
docker push
- Builds the Docker image for the application and tags it with the artifact repository URL.
- Configures Docker to use gcloud credentials and pushes the image to Google Cloud's Artifact Registry.
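Artifact Registry expects Docker image names of the form LOCATION-docker.pkg.dev/PROJECT_ID/REPOSITORY/IMAGE:TAG. As a rough illustration, the helper below composes such a name and prints matching tag and push commands. The helper name and the values `my-project` and `my-repo` are placeholders, not the ones used in this lab; substitute your own project and repository.

```python
def artifact_registry_image(location, project_id, repository, image, tag="latest"):
    """Compose LOCATION-docker.pkg.dev/PROJECT_ID/REPOSITORY/IMAGE:TAG."""
    return f"{location}-docker.pkg.dev/{project_id}/{repository}/{image}:{tag}"


# Placeholder project and repository names (hypothetical).
image_url = artifact_registry_image(
    "us-central1", "my-project", "my-repo", "cymbal-docker-image"
)
print(f"docker tag cymbal-docker-image {image_url}")
print(f"docker push {image_url}")
```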
gcloud run deploy cymbal-freshbot --platform=managed --region=us-central1 --allow-unauthenticated
- Deploys the Docker image to Cloud Run, allowing unauthenticated access to the service.