Add vector search documentation #9135
base: main
Conversation
Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Thank you for submitting your PR. The PR states are In progress (or Draft) -> Tech review -> Doc review -> Editorial review -> Merged. Before you submit your PR for doc review, make sure the content is technically accurate. If you need help finding a tech reviewer, tag a maintainer. When you're ready for doc review, tag the assignee of this PR. The doc reviewer may push edits to the PR directly or leave comments and editorial suggestions for you to address (let us know in a comment if you have a preference). The doc reviewer will arrange for an editorial review.
|:---|:---|:---|:---|:---|
| Max dimensions | 16,000 | 16,000 | 16,000 | 16,000 |
| Filter | Post-filter | Post-filter | Post-filter | Filter during search |
| Training required | No | No | Yes | No |
Faiss HNSW with PQ also requires training
redirect_from:
  - /search-plugins/knn/knn-vector-quantization/
outside_cards:
  - heading: "Byte vectors"
Need to add a card for binary vectors along with byte vectors
https://opensearch.org/docs/latest/field-types/supported-field-types/knn-vector#binary-vectors
@naveentatikonda Thanks! I addressed both comments. Could you review commit a5e8b8d?
@kolchfa-aws The changes look good. Thanks for making those changes.
For binary vectors, we also need to add memory estimation. Here is the formula for HNSW:
`1.1 * (dimension / 8 + 8 * M)` bytes/vector
For IVF, I guess it is:
`1.1 * (((dimension / 8) * num_vectors) + (nlist * dimension / 8))` bytes.
@jmazanec15 can you please confirm?
Yeah that looks good to me
Added both formulas.
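For reference, the binary-vector memory formulas discussed in this thread can be sketched in code. This is an illustration of the formulas as stated above, not an official sizing tool; the 1.1 factor is the overhead multiplier from the thread, and the HNSW formula is per vector, so the sketch multiplies by the vector count for a total estimate:

```python
def hnsw_binary_memory_bytes(dimension: int, m: int, num_vectors: int) -> float:
    # Thread formula: 1.1 * (dimension / 8 + 8 * M) bytes per vector,
    # scaled here by the number of vectors for a total estimate.
    return 1.1 * (dimension / 8 + 8 * m) * num_vectors

def ivf_binary_memory_bytes(dimension: int, num_vectors: int, nlist: int) -> float:
    # Thread formula: 1.1 * (((dimension / 8) * num_vectors) + (nlist * dimension / 8)) bytes.
    return 1.1 * ((dimension / 8) * num_vectors + nlist * dimension / 8)

# Example: 1 million 1,024-dimensional binary vectors, HNSW with m = 16.
print(hnsw_binary_memory_bytes(1024, 16, 1_000_000))  # ≈ 281,600,000 bytes (~282 MB)
```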
Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Hi @kolchfa-aws, this looks awesome! In general, I think we should start moving the more low-level/expert details (like quantization and method configuration) out of the vector search section and into the detailed field reference section. Here is some high-level feedback:
In performance tuning, we can mention picking a specific engine or overriding method parameters for expert-level fine-tuning and point to the reference docs.
Quantization is a bit ugly for users to have to understand, so I think it belongs in the detailed field reference. We can say: for further fine-tuning of the quantization methods, see the field reference.
Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Mapping parameter | Required | Default | Updatable | Description
:--- | :--- | :--- | :--- | :---
`name` | Yes | N/A | No | The nearest neighbor method. Valid values are `hnsw` and `ivf`. Not every engine combination supports each of the methods. For a list of supported methods, see the specific engine section.
Suggested change:
`name` | Yes | N/A | No | The nearest neighbor method. Valid values are `hnsw` and `ivf`. Not every engine combination supports each of the methods. For a list of supported methods, see the section for a specific engine.
Mapping parameter | Required | Default | Updatable | Description
:--- | :--- | :--- | :--- | :---
`name` | Yes | N/A | No | The nearest neighbor method. Valid values are `hnsw` and `ivf`. Not every engine combination supports each of the methods. For a list of supported methods, see the specific engine section.
`space_type` | No | `l2` | No | The vector space used to calculate the distance between vectors. Valid values are `l1`, `l2`, `linf`, `cosinesimil`, `innerproduct`, `hamming`, and `hammingbit`. Not every method/engine combination supports each of the spaces. For a list of supported spaces, see the specific engine section. Note: This value can also be specified at the top level of the mapping. For more information, see [Spaces]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-spaces/).
Suggested change:
`space_type` | No | `l2` | No | The vector space used to calculate the distance between vectors. Valid values are `l1`, `l2`, `linf`, `cosinesimil`, `innerproduct`, `hamming`, and `hammingbit`. Not every method/engine combination supports each of the spaces. For a list of supported spaces, see the section for a specific engine. Note: This value can also be specified at the top level of the mapping. For more information, see [Spaces]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-spaces/).
`name` | Yes | N/A | No | The nearest neighbor method. Valid values are `hnsw` and `ivf`. Not every engine combination supports each of the methods. For a list of supported methods, see the specific engine section.
`space_type` | No | `l2` | No | The vector space used to calculate the distance between vectors. Valid values are `l1`, `l2`, `linf`, `cosinesimil`, `innerproduct`, `hamming`, and `hammingbit`. Not every method/engine combination supports each of the spaces. For a list of supported spaces, see the specific engine section. Note: This value can also be specified at the top level of the mapping. For more information, see [Spaces]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-spaces/).
`engine` | No | `faiss` | No | The approximate k-NN library to use for indexing and search. Valid values are `faiss`, `lucene`, and `nmslib` (deprecated).
`parameters` | No | `null` | No | The parameters used for the nearest neighbor method. For more information, see the specific engine section.
Suggested change:
`parameters` | No | `null` | No | The parameters used for the nearest neighbor method. For more information, see the section for a specific engine.
:--- | :--- | :--- | :--- | :---
`nlist` | No | 4 | No | Number of buckets to partition vectors into. Higher values may increase accuracy but increase memory and training latency.
`nprobes` | No | 1 | No | Number of buckets to search during query. Higher values increase accuracy but slow searches.
`encoder` | No | flat | No | Encoder definition for encoding vectors.
Suggested change:
`encoder` | No | flat | No | An encoder definition for encoding vectors.
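As an illustration of how the IVF parameters above fit together, a method definition might look like the following. The values are hypothetical examples, not recommendations, and Faiss IVF generally also requires training a model before indexing:

```json
"method": {
  "name": "ivf",
  "engine": "faiss",
  "space_type": "l2",
  "parameters": {
    "nlist": 128,
    "nprobes": 8,
    "encoder": {
      "name": "flat"
    }
  }
}
```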
Parameter name | Required | Default | Updatable | Description
:--- | :--- | :--- | :--- | :---
`ef_construction` | No | 100 | No | The size of the dynamic list used during k-NN graph creation. Higher values result in a more accurate graph but slower indexing speed.
`m` | No | 16 | No | The number of bidirectional links created for each new element. Impacts memory consumption significantly. Keep between 2 and 100.
Suggested change:
`m` | No | 16 | No | The number of bidirectional links created for each new element. Impacts memory consumption significantly. Keep between `2` and `100`.
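For comparison, an HNSW method definition using the parameters in the table above might look like the following. The values are hypothetical examples chosen for illustration:

```json
"method": {
  "name": "hnsw",
  "engine": "faiss",
  "space_type": "l2",
  "parameters": {
    "ef_construction": 128,
    "m": 24
  }
}
```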
## Choosing the right method

There are several options to choose from when building your `knn_vector` field. To determine the correct methods and parameters, you should first understand the requirements of your workload and what trade-offs you are willing to make. Factors to consider are (1) query latency, (2) query quality, (3) memory limits, and (4) indexing latency.
Suggested change:
There are several options to choose from when building your `knn_vector` field. To select the correct method and parameters, you should first understand the requirements of your workload and what trade-offs you are willing to make. Factors to consider are (1) query latency, (2) query quality, (3) memory limits, and (4) indexing latency.
In a typical OpenSearch cluster, a certain portion of RAM is reserved for the JVM heap. OpenSearch allocates native library indexes to a portion of the remaining RAM. This portion's size is determined by the `circuit_breaker_limit` cluster setting. By default, the limit is set to 50%.
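As a rough sketch of the sizing logic described above: the native index budget is the circuit breaker fraction of whatever RAM is left after the JVM heap. This is a simplification for back-of-the-envelope estimates, not the engine's actual memory accounting:

```python
def native_memory_budget_bytes(total_ram_bytes: int, jvm_heap_bytes: int,
                               circuit_breaker_limit: float = 0.5) -> float:
    # Native library indexes may use up to circuit_breaker_limit (default 50%)
    # of the RAM remaining after the JVM heap is reserved.
    return (total_ram_bytes - jvm_heap_bytes) * circuit_breaker_limit

# Example: a 64 GiB node with a 32 GiB heap leaves a 16 GiB native index budget.
gib = 1024 ** 3
print(native_memory_budget_bytes(64 * gib, 32 * gib) / gib)  # 16.0
```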
Having a replica doubles the total number of vectors. |
Suggested change:
Using a replica doubles the total number of vectors.
A space defines the function used to measure the distance between two points in order to determine the k-nearest neighbors. From the k-NN perspective, a lower score equates to a closer and better result. This is the opposite of how OpenSearch scores results, where a higher score equates to a better result. OpenSearch supports the following spaces.

Not every method/engine combination supports each of the spaces. For a list of supported spaces, see the specific engine section in the [method documentation]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-methods-engines/).
Suggested change:
Not every method/engine combination supports each of the spaces. For a list of supported spaces, see the section for a specific engine in the [method documentation]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-methods-engines/).
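To make the "lower score means closer" convention concrete, here is a minimal sketch of three of the distance-based spaces named above. These are illustrative textbook definitions, not the engine implementations (for example, some engines work with the squared L2 distance internally):

```python
import math

def l1(a, b):
    # Manhattan distance: sum of absolute coordinate differences.
    return sum(abs(x - y) for x, y in zip(a, b))

def l2(a, b):
    # Euclidean distance.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def linf(a, b):
    # Chebyshev distance: largest single-coordinate difference.
    return max(abs(x - y) for x, y in zip(a, b))

# From the k-NN perspective, a lower distance means a closer neighbor.
print(l2([0.0, 0.0], [3.0, 4.0]))  # 5.0
```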
`type` | String | The vector field type. Must be `knn_vector`. Required.
`dimension` | Integer | The size of the vectors used. Valid values are in the [1, 16,000] range. Required.
`data_type` | String | The data type for the vector elements. Valid values are `binary`, `byte`, and `float`. Optional. Default is `float`.
`space_type` | String | The vector space used to calculate the distance between vectors. Valid values are `l1`, `l2`, `linf`, `cosinesimil`, `innerproduct`, `hamming`, and `hammingbit`. Not every method/engine combination supports each of the spaces. For a list of supported spaces, see the specific engine section. Note: This value can also be specified within the `method`. Optional. For more information, see [Spaces]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-spaces/).
Suggested change:
`space_type` | String | The vector space used to calculate the distance between vectors. Valid values are `l1`, `l2`, `linf`, `cosinesimil`, `innerproduct`, `hamming`, and `hammingbit`. Not every method/engine combination supports each of the spaces. For a list of supported spaces, see the section for a specific engine. Note: This value can also be specified within the `method`. Optional. For more information, see [Spaces]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-spaces/).
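Putting the parameters above together, a minimal `knn_vector` field mapping might look like the following. The index name, field name, and dimension are hypothetical placeholders for illustration:

```json
PUT /my-vector-index
{
  "settings": {
    "index.knn": true
  },
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "knn_vector",
        "dimension": 768,
        "space_type": "l2"
      }
    }
  }
}
```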
Co-authored-by: Nathan Bower <nbower@amazon.com> Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
link: "/vector-search/getting-started/tutorials/neural-search-tutorial/"
pre_items:
  - heading: "Generate embeddings"
    description: "Generate embeddings outside of OpenSearch using your favorite embedding utility."
link: "/vector-search/searching-data/#searching-raw-vectors"
auto_items:
  - heading: "Configure an embedding model"
    description: "Configure a machine learning model that will automatically generate embeddings from your text at ingest time and query time."
Suggested change:
description: "Configure a machine learning model that will automatically generate embeddings from your text at ingestion time and query time."
# Bringing your own or generating embeddings

In OpenSearch, you can either bring your own vectors or let OpenSearch generate them automatically from your data. Automated embedding generation integrated into OpenSearch reduces data preprocessing effort at ingestion and search time.
Suggested change:
In OpenSearch, you can either bring your own vectors or let OpenSearch generate them automatically from your data. Letting OpenSearch automatically generate your embeddings reduces data preprocessing effort at ingestion and search time.
### Option 2: Generate embeddings within OpenSearch

OpenSearch automatically generates vector embeddings from your data using a machine learning (ML) model.
Suggested change:
Use this option to let OpenSearch automatically generate vector embeddings from your data using a machine learning (ML) model.
link: "/vector-search/ml-powered-search/conversational-search/"
chunking_cards:
  - heading: "Text chunking"
    description: "Use text chunking to ensure adherence to token limit for embedding models."
Suggested change:
description: "Use text chunking to ensure adherence to embedding model token limits."
_vector-search/performance-tuning.md
Outdated
This topic provides performance tuning recommendations to improve indexing and search performance for approximate k-NN (ANN). From a high level, k-NN works according to these principles:
* Vector indexes are created per knn_vector field / (Lucene) segment pair.
* Queries execute on segments sequentially inside the shard (same as any other OpenSearch query).
* The coordinator node picks up final size number of neighbors from the neighbors returned by each shard.
Suggested change:
* The coordinator node selects the final `size` neighbors from the neighbors returned by each shard.
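The coordinator-side merge described above can be sketched as follows. This is a simplification for illustration, assuming each shard returns `(doc_id, distance)` pairs and that a lower distance is better:

```python
import heapq

def merge_shard_results(shard_results, size):
    # Each shard returns (doc_id, distance) pairs. The coordinator keeps
    # the `size` neighbors with the smallest distances across all shards.
    per_shard_sorted = [sorted(r, key=lambda p: p[1]) for r in shard_results]
    merged = heapq.merge(*per_shard_sorted, key=lambda p: p[1])
    return list(merged)[:size]

shard_a = [("doc1", 0.2), ("doc3", 0.9)]
shard_b = [("doc2", 0.1), ("doc4", 0.5)]
print(merge_shard_results([shard_a, shard_b], 2))  # [('doc2', 0.1), ('doc1', 0.2)]
```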
_vector-search/settings.md
Outdated
`knn.cache.item.expiry.enabled` | Dynamic | `false` | Whether to remove native library indexes that have not been accessed for a certain duration from memory.
`knn.cache.item.expiry.minutes` | Dynamic | `3h` | If enabled, the amount of idle time before a native library index is removed from memory.
`knn.circuit_breaker.unset.percentage` | Dynamic | `75` | The native memory usage threshold for the circuit breaker. Memory usage must be lower than this percentage of `knn.memory.circuit_breaker.limit` in order for `knn.circuit_breaker.triggered` to remain `false`.
`knn.circuit_breaker.triggered` | Dynamic | `false` | True when memory usage exceeds the `knn.circuit_breaker.unset.percentage` value.
Suggested change:
`knn.circuit_breaker.triggered` | Dynamic | `false` | `true` when memory usage exceeds the `knn.circuit_breaker.unset.percentage` value.
_vector-search/settings.md
Outdated
`knn.model.index.number_of_shards`| Dynamic | `1` | The number of shards to use for the model system index, which is the OpenSearch index that stores the models used for approximate nearest neighbor (ANN) search.
`knn.model.index.number_of_replicas`| Dynamic | `1` | The number of replica shards to use for the model system index. Generally, in a multi-node cluster, this value should be at least 1 in order to increase stability.
`knn.model.cache.size.limit` | Dynamic | `10%` | The model cache limit cannot exceed 25% of the JVM heap.
`knn.faiss.avx2.disabled` | Static | `false` | A static setting that specifies whether to disable the SIMD-based `libopensearchknn_faiss_avx2.so` library and load the non-optimized `libopensearchknn_faiss.so` library for the Faiss engine on machines with x64 architecture. For more information, see [SIMD optimization]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-methods-engines/#simd-optimization).
Suggested change:
`knn.faiss.avx2.disabled` | Static | `false` | A static setting that specifies whether to disable the SIMD-based `libopensearchknn_faiss_avx2.so` library and load the non-optimized `libopensearchknn_faiss.so` library for the Faiss engine on machines with x64 architecture. For more information, see [Single Instruction Multiple Data (SIMD) optimization]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-methods-engines/#simd-optimization).
_vector-search/settings.md
Outdated
## Index settings

The following table lists all available index-level k-NN settings. For information about updating these settings, see [Index-level index setting]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/index-settings/#index-level-index-settings).
Suggested change:
The following table lists all available index-level k-NN settings. For information about updating these settings, see [Index-level index settings]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/index-settings/#index-level-index-settings).
Because scores can only be positive, this script ranks documents with vector fields higher than those without.

With cosine similarity, it is not valid to pass a zero vector (`[0, 0, ...]`) as input. This is because the magnitude of such a vector is 0, which raises a `divide by 0` exception in the corresponding formula. Requests containing the zero vector will be rejected, and a corresponding exception will be thrown.
Suggested change:
When using cosine similarity, it is not valid to pass a zero vector (`[0, 0, ...]`) as input. This is because the magnitude of such a vector is 0, which raises a `divide by 0` exception in the corresponding formula. Requests containing the zero vector will be rejected, and a corresponding exception will be thrown.
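The zero-vector restriction follows directly from the cosine similarity formula, as this sketch shows. It is an illustration of the math, not the engine's implementation:

```python
import math

def cosine_similarity(a, b):
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has magnitude 0, so the formula would divide by 0.
        # OpenSearch likewise rejects such requests with an exception.
        raise ValueError("cosine similarity is undefined for the zero vector")
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 1.0]))  # ≈ 0.707
```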
Co-authored-by: Nathan Bower <nbower@amazon.com> Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Adds a vector search section
Checklist
For more information on following Developer Certificate of Origin and signing off your commits, please check here.