support embedding types #3560

zane-neo · 2025-02-17T09:17:08Z

Description

BedRock titan model V2 and Cohere V2 model now supports multiple embedding types in their API, which means you can specify different embedding types as a list in the request, and model generates different results based on embedding types in response, e.g.:

import cohere
import requests
import base64

co = cohere.ClientV2()

image = requests.get("https://cohere.com/favicon-32x32.png")
stringified_buffer = base64.b64encode(image.content).decode("utf-8")
content_type = image.headers["Content-Type"]
image_base64 = f"data:{content_type};base64,{stringified_buffer}"

response = co.embed(
    model="embed-english-v3.0",
    input_type="image",
    embedding_types=["float", "binary", "int8"],
    images=[image_base64],
)

print(response)

Response:

{
  "id": "5807ee2e-0cda-445a-9ec8-864c60a06606",
  "embeddings": {
    "float": [
      [
        -0.007247925,
        -0.041229248,
        -0.023223877,
       ......
      ]
    ],
    "binary": [
        [
            12,
            -13,
            14
        ]
    ],
    "int8": [
        [
            19,
            20,
            21
        ]
    ]
  },
  "texts": [],
  "images": [
    {
      "width": 400,
      "height": 400,
      "format": "jpeg",
      "bit_depth": 24
    }
  ],
  "meta": {
    "api_version": {
      "version": "2"
    },
    "billed_units": {
      "images": 1
    }
  }
}

The current bedrock and cohere post process function has default response filter to $.embedding and $.embeddings respectively and the extracted results are both List<List<Float>> type, after they support embedding types, the response structure is different and the default response filter extracts List<List<Float>> and Map<String, List<List<Number>> type separately. This PR is to add support to user who wants to extract different types of results in model response, and this can be classified into several categories:

BedRock (The post process function applies for both v1 and v2 model)

The default case(response filter not set) is exactly same as before to maintain BWC.
When user specified the response filter to $.embeddingsByType, the response will be captured into dataAsMap in ModelTensor, and user can get all embedding types result.
When user specified the response filter to $.embeddingsByType.{concrete_embedding_type}, the response will be captured similar to default case to the data field in ModelTensor.

Cohere V2 (A new v2 post process function is added to v2 model)

The default case(response filter not set) is exactly same between v1 and v2.
When user specified the response filter to $.embeddings, the response will be captured into dataAsMap in ModelTensor, and user can get all embedding types result.
When user specified the response filter to $.embeddings.{concrete_embedding_type}, the response will be captured similar to default case to the data field in ModelTensor.

Related Issues

#3093

Check List

New functionality includes testing.
New functionality has been documented.
API changes companion pull request created.
Commits are signed per the DCO using --signoff.
Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: zane-neo <zaniu@amazon.com>

Zhangxunmt

lgtm, but can you add ITs for the added embedding types in this PR?

Signed-off-by: zane-neo <zaniu@amazon.com>

zane-neo · 2025-02-21T09:03:05Z

lgtm, but can you add ITs for the added embedding types in this PR?

Added more ITs, also updated the blueprint under tutorial, please help review again.

zane-neo · 2025-02-21T09:08:39Z

The change should be compatible with neural-search plugin as the post_process_function are mainly to convert the response to a format that neural-search plugin accepts so that user can use the model to create ingestion pipeline, the PR in neural-search are merged now: opensearch-project/neural-search#1007

Signed-off-by: zane-neo <zaniu@amazon.com>

jngz-es · 2025-02-24T21:57:28Z

ml-algorithms/src/main/java/org/opensearch/ml/engine/algorithms/remote/ConnectorUtils.java

@@ -244,6 +247,15 @@ public static ModelTensors processOutput(
        return ModelTensors.builder().mlModelTensors(modelTensors).build();
    }

+    private static MLResultDataType parseMLResultDataTypeFromResponseFilter(String responseFilter) {
+        for (MLResultDataType type : MLResultDataType.values()) {
+            if (responseFilter.contains("." + type.name()) || responseFilter.contains("." + type.name().toLowerCase(Locale.ROOT))) {


What about using containsIgnoreCase to support type in a case-insensitivity way?

jngz-es · 2025-02-24T22:27:53Z

...a/org/opensearch/ml/common/connector/functions/postprocess/ConnectorPostProcessFunction.java

 import org.opensearch.ml.common.output.model.ModelTensor;

-public abstract class ConnectorPostProcessFunction<T> implements Function<Object, List<ModelTensor>> {
+public abstract class ConnectorPostProcessFunction<T> implements BiFunction<Object, MLResultDataType, List<ModelTensor>> {


This interface change looks like a breaking change, and not all PostProcessFunctions are using the ResultDataType like BedrockBatchJobArnPostProcessFunction. So what I am thinking is 1. to use extend instead of implements to support additional parameter MLResultDataType, or 2. to have MLResultDataType covering all cases and all PostProcessFunctions to use/set it, for example, not use null as dateType but have a default type something like FLOAT32.

I considered to change the Function to a self-defined interface/abstract class, but that needs more code changes and I want to limit the changes in this PR for easier reviewing. I'm open to make these changes in this PR if you insist or create a separate PR for better reviewing experience.

Yeah, for this pr, I am OK. We can have more discussion about if we need refactoring later.

…ately Signed-off-by: zane-neo <zaniu@amazon.com>

Zhangxunmt · 2025-02-25T03:16:41Z

plugin/src/test/java/org/opensearch/ml/rest/RestBedRockV2PostProcessFunctionInferenceIT.java

+import java.util.Map;
+
+@Log4j2
+public class RestBedRockV2PostProcessFunctionInferenceIT extends MLCommonsRestTestCase {


Why nor merge into the BedRock Inference IT to share the IT resources, reducing the overall IT run time length?
https://github.com/opensearch-project/ml-commons/blob/main/plugin/src/test/java/org/opensearch/ml/rest/RestBedRockInferenceIT.java

The template of RestBedRockInferenceIT and RestBedRockV2PostProcessFunctionInferenceIT are different, so if we merge them together, the code will seem mess up, separating them is for better code reading experience.

agree, but this means a simple test case would need spinning up new IT cluster resources. Given the IT suites already takes 40+ mins, it's not scalable to keep adding new test classes. So organizing the same test categories into a same test suite sharing the resources is a better approach overall. Do you have any better ideas on this matter?

Ok, I can take a look again to find a way to merge them and maintain the readability in best effort.

Zhangxunmt · 2025-02-25T03:18:26Z

plugin/src/test/java/org/opensearch/ml/rest/RestCohereInferenceIT.java

+import java.util.Map;
+import java.util.Optional;
+
+public class RestCohereInferenceIT extends MLCommonsRestTestCase {


This IT can be merged into the RemoteInferenceIT?

The RemoteInferenceIT class file is already too big, I prefer we use separate file to test single feature for better maintenance.

btw, the spotless is not applied yet.

Signed-off-by: zane-neo <zaniu@amazon.com>

jngz-es

LGTM, thanks for changes.

dhrubo-os · 2025-03-01T00:20:01Z

plugin/src/test/java/org/opensearch/ml/rest/RestCohereInferenceIT.java

+import org.opensearch.ml.common.input.MLInput;
+
+public class RestCohereInferenceIT extends MLCommonsRestTestCase {
+    private final String COHERE_KEY = Optional.ofNullable(System.getenv("COHERE_KEY")).orElse("UzRF34a6gj0OKkvHOO6FZxLItv8CNpK5dFdCaUDW");


We shouldn't expose this API key!!!

cc @ylwu-amzn

@zane-neo, let's avoid adding credentials in code

It's my private API key for testing, forgot to remove it in the PR.

git-secrets could be helpful here.

* support embedding types Signed-off-by: zane-neo <zaniu@amazon.com> * Add ITs and blueprints in tutorial Signed-off-by: zane-neo <zaniu@amazon.com> * format code Signed-off-by: zane-neo <zaniu@amazon.com> * add licence header Signed-off-by: zane-neo <zaniu@amazon.com> * use containsIgnoreCase instead of checking capital & lower case separately Signed-off-by: zane-neo <zaniu@amazon.com> * merge post processing function IT to existing file Signed-off-by: zane-neo <zaniu@amazon.com> * format code Signed-off-by: zane-neo <zaniu@amazon.com> --------- Signed-off-by: zane-neo <zaniu@amazon.com>

support embedding types

c3d50eb

Signed-off-by: zane-neo <zaniu@amazon.com>

zane-neo requested review from b4sjoo, dhrubo-os, mingshl, jngz-es, model-collapse, rbhavna, ylwu-amzn, Zhangxunmt, austintlee, HenryL27 and xinyual as code owners February 17, 2025 09:17

zane-neo had a problem deploying to ml-commons-cicd-env February 17, 2025 09:18 — with GitHub Actions Error

zane-neo temporarily deployed to ml-commons-cicd-env February 17, 2025 09:18 — with GitHub Actions Inactive

zane-neo had a problem deploying to ml-commons-cicd-env February 17, 2025 09:18 — with GitHub Actions Failure

zane-neo temporarily deployed to ml-commons-cicd-env February 17, 2025 09:18 — with GitHub Actions Inactive

zane-neo mentioned this pull request Feb 17, 2025

support model embedding types in bedrock and cohere post process function #3262

Closed

5 tasks

Zhangxunmt reviewed Feb 19, 2025

View reviewed changes

zane-neo added 2 commits February 21, 2025 16:58

Add ITs and blueprints in tutorial

0a19fab

Signed-off-by: zane-neo <zaniu@amazon.com>

format code

266d627

Signed-off-by: zane-neo <zaniu@amazon.com>

zane-neo had a problem deploying to ml-commons-cicd-env February 21, 2025 09:03 — with GitHub Actions Error

zane-neo had a problem deploying to ml-commons-cicd-env February 21, 2025 09:03 — with GitHub Actions Failure

zane-neo had a problem deploying to ml-commons-cicd-env February 21, 2025 09:03 — with GitHub Actions Error

add licence header

bef2c18

Signed-off-by: zane-neo <zaniu@amazon.com>

zane-neo had a problem deploying to ml-commons-cicd-env February 21, 2025 09:13 — with GitHub Actions Failure

zane-neo had a problem deploying to ml-commons-cicd-env February 21, 2025 09:13 — with GitHub Actions Error

zane-neo had a problem deploying to ml-commons-cicd-env February 21, 2025 09:13 — with GitHub Actions Failure

zane-neo had a problem deploying to ml-commons-cicd-env February 24, 2025 03:07 — with GitHub Actions Error

zane-neo had a problem deploying to ml-commons-cicd-env February 24, 2025 03:07 — with GitHub Actions Failure

zane-neo temporarily deployed to ml-commons-cicd-env February 24, 2025 03:07 — with GitHub Actions Inactive

jngz-es reviewed Feb 24, 2025

View reviewed changes

use containsIgnoreCase instead of checking capital & lower case separ…

a0c5fdd

…ately Signed-off-by: zane-neo <zaniu@amazon.com>

Zhangxunmt reviewed Feb 25, 2025

View reviewed changes

zane-neo added 2 commits February 25, 2025 13:20

merge post processing function IT to existing file

33cc4f4

Signed-off-by: zane-neo <zaniu@amazon.com>

format code

e1e1e14

Signed-off-by: zane-neo <zaniu@amazon.com>

zane-neo had a problem deploying to ml-commons-cicd-env February 25, 2025 05:23 — with GitHub Actions Error

zane-neo had a problem deploying to ml-commons-cicd-env February 25, 2025 05:23 — with GitHub Actions Failure

jngz-es approved these changes Feb 26, 2025

View reviewed changes

Zhangxunmt approved these changes Feb 27, 2025

View reviewed changes

zane-neo merged commit 26fc493 into opensearch-project:main Feb 28, 2025
6 of 10 checks passed

dhrubo-os reviewed Mar 1, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

support embedding types #3560

support embedding types #3560

zane-neo commented Feb 17, 2025 •

edited

Loading

Zhangxunmt left a comment

zane-neo commented Feb 21, 2025 •

edited

Loading

zane-neo commented Feb 21, 2025

jngz-es Feb 24, 2025

jngz-es Feb 24, 2025

zane-neo Feb 25, 2025

jngz-es Feb 26, 2025

Zhangxunmt Feb 25, 2025

zane-neo Feb 25, 2025

Zhangxunmt Feb 25, 2025 •

edited

Loading

zane-neo Feb 25, 2025

Zhangxunmt Feb 25, 2025

zane-neo Feb 25, 2025

Zhangxunmt Feb 25, 2025

jngz-es left a comment

dhrubo-os Mar 1, 2025

ylwu-amzn Mar 1, 2025

zane-neo Mar 1, 2025

dhrubo-os Mar 1, 2025

support embedding types #3560

support embedding types #3560

Conversation

zane-neo commented Feb 17, 2025 • edited Loading

Description

BedRock (The post process function applies for both v1 and v2 model)

Cohere V2 (A new v2 post process function is added to v2 model)

Related Issues

Check List

Zhangxunmt left a comment

Choose a reason for hiding this comment

zane-neo commented Feb 21, 2025 • edited Loading

zane-neo commented Feb 21, 2025

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Zhangxunmt Feb 25, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jngz-es left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

zane-neo commented Feb 17, 2025 •

edited

Loading

zane-neo commented Feb 21, 2025 •

edited

Loading

Zhangxunmt Feb 25, 2025 •

edited

Loading