feat: Updating documents to highlight v2 api for Vector Similarity Search

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
franciscojavierarceo committed Feb 3, 2025
1 parent 92dde13 commit eff1b65
Showing 2 changed files with 140 additions and 61 deletions.
196 changes: 137 additions & 59 deletions docs/reference/alpha-vector-database.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,16 +11,29 @@ Below are supported vector databases and implemented features:
|-----------------|-----------|----------|
| Pgvector | [x] | [ ] |
| Elasticsearch | [x] | [x] |
| Milvus | [ ] | [ ] |
| Milvus | [x] | [x] |
| Faiss | [ ] | [ ] |
| SQLite | [x] | [ ] |
| Qdrant | [x] | [x] |

Note: SQLite is in limited access and only works on Python 3.10. It will be updated as [sqlite_vec](https://github.com/asg017/sqlite-vec/) progresses.

## Example
{% hint style="danger" %}
We will be deprecating the `retrieve_online_documents` method in the SDK in the future.
We recommend using the `retrieve_online_documents_v2` method instead, which offers easier vector index configuration
directly in the Feature View and the ability to retrieve standard features alongside your vector embeddings for richer context injection.

See [https://github.com/feast-dev/feast-workshop/blob/rag/module_4_rag](https://github.com/feast-dev/feast-workshop/blob/rag/module_4_rag) for an example on how to use vector database.
Long term, we will collapse the two methods into one; for now, we recommend using the `retrieve_online_documents_v2` method.
After that, `retrieve_online_documents` and `retrieve_online_documents_v2` will simply point to `get_online_features` for
backwards compatibility, and we will adopt industry-standard naming conventions.
{% endhint %}

**Note**: Milvus implements the v2 `retrieve_online_documents_v2` method in the SDK. This will be the longer-term solution so that Data Scientists can easily enable vector similarity search by just flipping a flag.

## Examples

- See the v0 [Rag Demo](https://github.com/feast-dev/feast-workshop/blob/rag/module_4_rag) for an example of how to use a vector database with the `retrieve_online_documents` method (migration and deprecation are planned).
- See the v1 [Milvus Quickstart](../../examples/rag/milvus-quickstart.ipynb) for a quickstart guide on how to use Feast with Milvus using the `retrieve_online_documents_v2` method.

### **Prepare offline embedding dataset**
Run the following commands to prepare the embedding dataset:
Expand All @@ -34,25 +47,23 @@ The output will be stored in `data/city_wikipedia_summaries.csv.`
Use the feature_store.yaml file to initialize the feature store. This will use the data as offline store, and Pgvector as online store.

```yaml
project: feast_demo_local
project: local_rag
provider: local
registry:
  registry_type: sql
  path: postgresql://@localhost:5432/feast
registry: data/registry.db
online_store:
  type: postgres
  type: milvus
  path: data/online_store.db
  vector_enabled: true
  vector_len: 384
  host: 127.0.0.1
  port: 5432
  database: feast
  user: ""
  password: ""
  embedding_dim: 384
  index_type: "IVF_FLAT"


offline_store:
  type: file
entity_key_serialization_version: 2
entity_key_serialization_version: 3
# By default, no_auth for authentication and authorization, other possible values kubernetes and oidc. Refer the documentation for more details.
auth:
  type: no_auth
```
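The configuration fixes the embedding dimension at 384, so any vector written to or queried against the online store has to have that length. As a minimal sketch, assuming embeddings are handled as plain Python lists, a small guard can catch a mismatch before it reaches the store. The helper name `check_embedding` is hypothetical and not part of Feast.

```python
EMBEDDING_DIM = 384  # assumed to match `embedding_dim` in feature_store.yaml


def check_embedding(vector, expected_dim=EMBEDDING_DIM):
    """Hypothetical guard: raise ValueError unless `vector` has `expected_dim` numbers."""
    if len(vector) != expected_dim:
        raise ValueError(f"expected {expected_dim} values, got {len(vector)}")
    if not all(isinstance(x, (int, float)) for x in vector):
        raise ValueError("embedding must contain only numbers")
    return vector


# A correctly sized vector passes through unchanged.
check_embedding([0.0] * EMBEDDING_DIM)
```

Calling the guard on the query embedding just before retrieval is one place where it could be useful.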
Run the following command in terminal to apply the feature store configuration:
Expand All @@ -63,75 +74,128 @@ feast apply
Note that when you run `feast apply` you are going to apply the following Feature View that we will use for retrieval later:

```python
city_embeddings_feature_view = FeatureView(
    name="city_embeddings",
    entities=[item],
document_embeddings = FeatureView(
    name="embedded_documents",
    entities=[item, author],
    schema=[
        Field(name="Embeddings", dtype=Array(Float32)),
        Field(
            name="vector",
            dtype=Array(Float32),
            # Look how easy it is to enable RAG!
            vector_index=True,
            vector_search_metric="COSINE",
        ),
        Field(name="item_id", dtype=Int64),
        Field(name="author_id", dtype=String),
        Field(name="created_timestamp", dtype=UnixTimestamp),
        Field(name="sentence_chunks", dtype=String),
        Field(name="event_timestamp", dtype=UnixTimestamp),
    ],
    source=source,
    ttl=timedelta(hours=2),
    source=rag_documents_source,
    ttl=timedelta(hours=24),
)
```

Then run the following command in the terminal to materialize the data to the online store:

```shell
CURRENT_TIME=$(date -u +"%Y-%m-%dT%H:%M:%S")
feast materialize-incremental $CURRENT_TIME
Let's use the SDK to write a data frame of embeddings to the online store:
```python
store.write_to_online_store(feature_view_name='city_embeddings', df=df)
```

### **Prepare a query embedding**
During inference (e.g., when a user submits a chat message) we need to embed the input text. This can be thought of as a feature transformation of the input data. In this example, we'll do this with a small Sentence Transformer from Hugging Face.

```python
from batch_score_documents import run_model, TOKENIZER, MODEL
import torch
import torch.nn.functional as F
from feast import FeatureStore
from pymilvus import MilvusClient, DataType, FieldSchema
from transformers import AutoTokenizer, AutoModel

question = "the most populous city in the U.S. state of Texas?"
from example_repo import city_embeddings_feature_view, item

TOKENIZER = "sentence-transformers/all-MiniLM-L6-v2"
MODEL = "sentence-transformers/all-MiniLM-L6-v2"

def mean_pooling(model_output, attention_mask):
    token_embeddings = model_output[
        0
    ]  # First element of model_output contains all token embeddings
    input_mask_expanded = (
        attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    )
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(
        input_mask_expanded.sum(1), min=1e-9
    )

def run_model(sentences, tokenizer, model):
    encoded_input = tokenizer(
        sentences, padding=True, truncation=True, return_tensors="pt"
    )
    # Compute token embeddings
    with torch.no_grad():
        model_output = model(**encoded_input)

    sentence_embeddings = mean_pooling(model_output, encoded_input["attention_mask"])
    sentence_embeddings = F.normalize(sentence_embeddings, p=2, dim=1)
    return sentence_embeddings

question = "Which city has the largest population in New York?"

tokenizer = AutoTokenizer.from_pretrained(TOKENIZER)
model = AutoModel.from_pretrained(MODEL)
query_embedding = run_model(question, tokenizer, model)
query = query_embedding.detach().cpu().numpy().tolist()[0]
query_embedding = run_model(question, tokenizer, model).detach().cpu().numpy().tolist()[0]
```
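To make the pooling step concrete, here is a dependency-free sketch of what `mean_pooling` followed by `F.normalize` computes for a single sentence: token vectors at padded positions (mask value 0) are ignored, the rest are averaged, and the result is scaled to unit length. The function names and toy vectors are illustrative assumptions, not part of the Feast or Transformers APIs.

```python
import math


def masked_mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (illustrative helper)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            for i, value in enumerate(vec):
                total[i] += value
    # Same guard against division by zero as torch.clamp(..., min=1e-9).
    count = max(sum(attention_mask), 1e-9)
    return [t / count for t in total]


def l2_normalize(vec):
    """Scale a vector to unit L2 norm; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else list(vec)


tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]  # last row is padding
mask = [1, 1, 0]
pooled = masked_mean_pool(tokens, mask)  # [2.0, 3.0]
unit = l2_normalize(pooled)
```

Because the embeddings are normalized to unit length, cosine similarity between two of them reduces to a dot product.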

### **Retrieve the top 5 similar documents**
First create a feature store instance, and use the `retrieve_online_documents` API to retrieve the top 5 similar documents to the specified query.
### **Retrieve the top K similar documents**
First create a feature store instance, and use the `retrieve_online_documents_v2` API to retrieve the top K documents most similar to the specified query (here, `top_k=3`).

```python
from feast import FeatureStore
store = FeatureStore(repo_path=".")
features = store.retrieve_online_documents(
    feature="city_embeddings:Embeddings",
    query=query,
    top_k=5
).to_dict()

def print_online_features(features):
    for key, value in sorted(features.items()):
        print(key, " : ", value)

print_online_features(features)
context_data = store.retrieve_online_documents_v2(
    features=[
        "city_embeddings:vector",
        "city_embeddings:item_id",
        "city_embeddings:state",
        "city_embeddings:sentence_chunks",
        "city_embeddings:wiki_summary",
    ],
    query=query_embedding,
    top_k=3,
    distance_metric='COSINE',
).to_df()
```
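Conceptually, a top-K retrieval with the `COSINE` metric ranks stored vectors by their cosine similarity to the query and returns the best K. The sketch below shows that idea as a brute-force scan in plain Python. It is only an illustration with made-up two-dimensional vectors: a real vector database uses an index (such as the `IVF_FLAT` index configured earlier) rather than scanning every row, and the helper names here are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, documents, k):
    """Return the ids of the k documents most similar to the query."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in documents.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]


docs = {"a": [1.0, 0.0], "b": [0.7, 0.7], "c": [0.0, 1.0]}
nearest = top_k([1.0, 0.1], docs, k=2)  # ["a", "b"]
```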
### **Generate the Response**
Let's assume we have a base prompt and a function that formats the retrieved documents called `format_documents` that we
can then use to generate the response with OpenAI's chat completion API.
```python
FULL_PROMPT = format_documents(context_data, BASE_PROMPT)

### Configuration

We offer [PGVector](https://github.com/pgvector/pgvector), [SQLite](https://github.com/asg017/sqlite-vec), [Elasticsearch](https://www.elastic.co) and [Qdrant](https://qdrant.tech/) as Online Store options for Vector Databases.
import os
from openai import OpenAI

#### Installation with SQLite
client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY"),
)
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": FULL_PROMPT},
        {"role": "user", "content": question}
    ],
)

If you are using `pyenv` to manage your Python versions, you can install the SQLite extension with the following command:
```bash
PYTHON_CONFIGURE_OPTS="--enable-loadable-sqlite-extensions" \
LDFLAGS="-L/opt/homebrew/opt/sqlite/lib" \
CPPFLAGS="-I/opt/homebrew/opt/sqlite/include" \
pyenv install 3.10.14
# And this will print the content. Look at the examples/rag/milvus-quickstart.ipynb for an end-to-end example.
print('\n'.join([c.message.content for c in response.choices]))
```
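The text assumes a `format_documents` helper and a base prompt without defining them. One possible shape, offered purely as a sketch, is shown here. It assumes the retrieved rows are passed as a list of dictionaries (for a pandas data frame, `context_data.to_dict("records")` would produce that shape) and that each row has a `sentence_chunks` field as in the Feature View. The prompt wording and sample rows are illustrative only.

```python
def format_documents(rows, base_prompt):
    """Hypothetical helper: append retrieved text chunks to a base prompt."""
    context = "\n".join(f"- {row['sentence_chunks']}" for row in rows)
    return f"{base_prompt}\n\nContext:\n{context}"


# Illustrative base prompt and rows; not taken from the Feast examples.
BASE_PROMPT = "Answer the question using only the context provided."
rows = [
    {"sentence_chunks": "New York City is the most populous city in New York."},
    {"sentence_chunks": "Buffalo is the second most populous city in New York."},
]
FULL_PROMPT = format_documents(rows, BASE_PROMPT)
```

Keeping the formatting in a separate function makes it easy to change how much retrieved context is injected without touching the chat completion call.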
And you can install the Feast package via:

### Configuration and Installation

We offer [Milvus](https://milvus.io/), [PGVector](https://github.com/pgvector/pgvector), [SQLite](https://github.com/asg017/sqlite-vec), [Elasticsearch](https://www.elastic.co) and [Qdrant](https://qdrant.tech/) as Online Store options for Vector Databases.

Milvus offers a convenient local implementation for vector similarity search. To use Milvus, you can install the Feast package with the Milvus extra.

#### Installation with Milvus

```bash
pip install feast[sqlite_vec]
pip install feast[milvus]
```

#### Installation with Elasticsearch

```bash
Expand All @@ -143,3 +207,17 @@ pip install feast[elasticsearch]
```bash
pip install feast[qdrant]
```
#### Installation with SQLite

If you are using `pyenv` to manage your Python versions, you can install the SQLite extension with the following command:
```bash
PYTHON_CONFIGURE_OPTS="--enable-loadable-sqlite-extensions" \
LDFLAGS="-L/opt/homebrew/opt/sqlite/lib" \
CPPFLAGS="-I/opt/homebrew/opt/sqlite/include" \
pyenv install 3.10.14
```

And you can install the Feast package via:
```bash
pip install feast[sqlite_vec]
```
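The `--enable-loadable-sqlite-extensions` option matters because Python's `sqlite3` module only exposes `enable_load_extension` when the interpreter was built with it. As a quick sanity check, assuming you want to verify the interpreter before relying on the SQLite vector extension, something like the following can be run. The function name is hypothetical.

```python
import sqlite3


def supports_loadable_extensions():
    """Return True if this Python build can load SQLite extensions."""
    conn = sqlite3.connect(":memory:")
    try:
        # The method is absent when Python was compiled without
        # --enable-loadable-sqlite-extensions.
        return hasattr(conn, "enable_load_extension")
    finally:
        conn.close()


print("loadable extensions supported:", supports_loadable_extensions())
```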
5 changes: 3 additions & 2 deletions docs/reference/online-stores/milvus.md
Original file line number Diff line number Diff line change
Expand Up @@ -43,7 +43,7 @@ The set of functionality supported by online stores is described in detail [here
Below is a matrix indicating which functionality is supported by the Milvus online store.
| | Milvus |
| :-------------------------------------------------------- |:-------|
|:----------------------------------------------------------|:-------|
| write feature values to the online store | yes |
| read feature values from the online store | yes |
| update infrastructure (e.g. tables) in the online store | yes |
Expand All @@ -59,6 +59,7 @@ Below is a matrix indicating which functionality is supported by the Milvus onli
| support for deleting expired data | yes |
| collocated by feature view | no |
| collocated by feature service | no |
| collocated by entity key | yes |
| collocated by entity key | no |
| vector similarity search | yes |
To compare this set of functionality against other online stores, please see the full [functionality matrix](overview.md#functionality-matrix).
