Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support embedding types #3560

Merged

Conversation

zane-neo
Copy link
Collaborator

@zane-neo zane-neo commented Feb 17, 2025

Description

BedRock titan model V2 and Cohere V2 model now supports multiple embedding types in their API, which means you can specify different embedding types as a list in the request, and model generates different results based on embedding types in response, e.g.:

import cohere
import requests
import base64

co = cohere.ClientV2()

image = requests.get("https://cohere.com/favicon-32x32.png")
stringified_buffer = base64.b64encode(image.content).decode("utf-8")
content_type = image.headers["Content-Type"]
image_base64 = f"data:{content_type};base64,{stringified_buffer}"

response = co.embed(
    model="embed-english-v3.0",
    input_type="image",
    embedding_types=["float", "binary", "int8"],
    images=[image_base64],
)

print(response)

Response:

{
  "id": "5807ee2e-0cda-445a-9ec8-864c60a06606",
  "embeddings": {
    "float": [
      [
        -0.007247925,
        -0.041229248,
        -0.023223877,
       ......
      ]
    ],
    "binary": [
        [
            12,
            -13,
            14
        ]
    ],
    "int8": [
        [
            19,
            20,
            21
        ]
    ]
  },
  "texts": [],
  "images": [
    {
      "width": 400,
      "height": 400,
      "format": "jpeg",
      "bit_depth": 24
    }
  ],
  "meta": {
    "api_version": {
      "version": "2"
    },
    "billed_units": {
      "images": 1
    }
  }
}

The current bedrock and cohere post process function has default response filter to $.embedding and $.embeddings respectively and the extracted results are both List<List<Float>> type, after they support embedding types, the response structure is different and the default response filter extracts List<List<Float>> and Map<String, List<List<Number>> type separately. This PR is to add support to user who wants to extract different types of results in model response, and this can be classified into several categories:

BedRock (The post process function applies for both v1 and v2 model)

  1. The default case(response filter not set) is exactly same as before to maintain BWC.
  2. When user specified the response filter to $.embeddingsByType, the response will be captured into dataAsMap in ModelTensor, and user can get all embedding types result.
  3. When user specified the response filter to $.embeddingsByType.{concrete_embedding_type}, the response will be captured similar to default case to the data field in ModelTensor.

Cohere V2 (A new v2 post process function is added to v2 model)

  1. The default case(response filter not set) is exactly same between v1 and v2.
  2. When user specified the response filter to $.embeddings, the response will be captured into dataAsMap in ModelTensor, and user can get all embedding types result.
  3. When user specified the response filter to $.embeddings.{concrete_embedding_type}, the response will be captured similar to default case to the data field in ModelTensor.

Related Issues

#3093

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: zane-neo <zaniu@amazon.com>
Copy link
Collaborator

@Zhangxunmt Zhangxunmt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, but can you add ITs for the added embedding types in this PR?

Signed-off-by: zane-neo <zaniu@amazon.com>
Signed-off-by: zane-neo <zaniu@amazon.com>
@zane-neo
Copy link
Collaborator Author

zane-neo commented Feb 21, 2025

lgtm, but can you add ITs for the added embedding types in this PR?

Added more ITs, also updated the blueprint under tutorial, please help review again.

@zane-neo
Copy link
Collaborator Author

The change should be compatible with neural-search plugin as the post_process_function are mainly to convert the response to a format that neural-search plugin accepts so that user can use the model to create ingestion pipeline, the PR in neural-search are merged now: opensearch-project/neural-search#1007

Signed-off-by: zane-neo <zaniu@amazon.com>
@@ -244,6 +247,15 @@ public static ModelTensors processOutput(
return ModelTensors.builder().mlModelTensors(modelTensors).build();
}

private static MLResultDataType parseMLResultDataTypeFromResponseFilter(String responseFilter) {
for (MLResultDataType type : MLResultDataType.values()) {
if (responseFilter.contains("." + type.name()) || responseFilter.contains("." + type.name().toLowerCase(Locale.ROOT))) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about using containsIgnoreCase to support type in a case-insensitivity way?

import org.opensearch.ml.common.output.model.ModelTensor;

public abstract class ConnectorPostProcessFunction<T> implements Function<Object, List<ModelTensor>> {
public abstract class ConnectorPostProcessFunction<T> implements BiFunction<Object, MLResultDataType, List<ModelTensor>> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This interface change looks like a breaking change, and not all PostProcessFunctions are using the ResultDataType like BedrockBatchJobArnPostProcessFunction. So what I am thinking is 1. to use extend instead of implements to support additional parameter MLResultDataType, or 2. to have MLResultDataType covering all cases and all PostProcessFunctions to use/set it, for example, not use null as dateType but have a default type something like FLOAT32.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered to change the Function to a self-defined interface/abstract class, but that needs more code changes and I want to limit the changes in this PR for easier reviewing. I'm open to make these changes in this PR if you insist or create a separate PR for better reviewing experience.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, for this pr, I am OK. We can have more discussion about if we need refactoring later.

…ately

Signed-off-by: zane-neo <zaniu@amazon.com>
import java.util.Map;

@Log4j2
public class RestBedRockV2PostProcessFunctionInferenceIT extends MLCommonsRestTestCase {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why nor merge into the BedRock Inference IT to share the IT resources, reducing the overall IT run time length?
https://github.com/opensearch-project/ml-commons/blob/main/plugin/src/test/java/org/opensearch/ml/rest/RestBedRockInferenceIT.java

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The template of RestBedRockInferenceIT and RestBedRockV2PostProcessFunctionInferenceIT are different, so if we merge them together, the code will seem mess up, separating them is for better code reading experience.

Copy link
Collaborator

@Zhangxunmt Zhangxunmt Feb 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree, but this means a simple test case would need spinning up new IT cluster resources. Given the IT suites already takes 40+ mins, it's not scalable to keep adding new test classes. So organizing the same test categories into a same test suite sharing the resources is a better approach overall. Do you have any better ideas on this matter?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I can take a look again to find a way to merge them and maintain the readability in best effort.

import java.util.Map;
import java.util.Optional;

public class RestCohereInferenceIT extends MLCommonsRestTestCase {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This IT can be merged into the RemoteInferenceIT?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RemoteInferenceIT class file is already too big, I prefer we use separate file to test single feature for better maintenance.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, the spotless is not applied yet.

Signed-off-by: zane-neo <zaniu@amazon.com>
Signed-off-by: zane-neo <zaniu@amazon.com>
Copy link
Collaborator

@jngz-es jngz-es left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for changes.

@zane-neo zane-neo merged commit 26fc493 into opensearch-project:main Feb 28, 2025
6 of 10 checks passed
import org.opensearch.ml.common.input.MLInput;

public class RestCohereInferenceIT extends MLCommonsRestTestCase {
private final String COHERE_KEY = Optional.ofNullable(System.getenv("COHERE_KEY")).orElse("UzRF34a6gj0OKkvHOO6FZxLItv8CNpK5dFdCaUDW");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't expose this API key!!!

cc @ylwu-amzn

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zane-neo, let's avoid adding credentials in code

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's my private API key for testing, forgot to remove it in the PR.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

git-secrets could be helpful here.

zane-neo added a commit to zane-neo/ml-commons that referenced this pull request Mar 4, 2025
* support embedding types

Signed-off-by: zane-neo <zaniu@amazon.com>

* Add ITs and blueprints in tutorial

Signed-off-by: zane-neo <zaniu@amazon.com>

* format code

Signed-off-by: zane-neo <zaniu@amazon.com>

* add licence header

Signed-off-by: zane-neo <zaniu@amazon.com>

* use containsIgnoreCase instead of checking capital & lower case separately

Signed-off-by: zane-neo <zaniu@amazon.com>

* merge post processing function IT to existing file

Signed-off-by: zane-neo <zaniu@amazon.com>

* format code

Signed-off-by: zane-neo <zaniu@amazon.com>

---------

Signed-off-by: zane-neo <zaniu@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants