Unclosed gRPC Channels in VertexAiTextEmbeddingModel: Channel Orphan Warnings #2059

Open
RyanHowell30 opened this issue Jan 9, 2025 · 6 comments

Comments

@RyanHowell30

When using the Spring AI VertexAiTextEmbeddingModel class to create text embeddings, I see repeated warnings like the following in my logs:

Previous channel ManagedChannelImpl{...} was garbage collected without being shut down! Make sure to call shutdown()/shutdownNow()

These warnings indicate that a gRPC ManagedChannel created by the underlying Google Cloud client (PredictionServiceClient) is being garbage-collected without a proper call to close() or shutdown().

Inside VertexAiTextEmbeddingModel, the call(EmbeddingRequest request) method creates a new PredictionServiceClient instance on each call, but it never closes it. As a result, every ephemeral client spawns a gRPC channel that never gets shut down. Eventually, the channel is garbage-collected, triggering the “orphan channel” warnings in logs.

Other classes in Spring AI—such as VertexAiMultimodalEmbeddingModel—use a try-finally or a try-with-resources approach to ensure that each ephemeral PredictionServiceClient is closed after use, so they do not exhibit the same problem.
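To make the difference concrete, here is a minimal sketch that does not touch the real Google Cloud client. `FakeClient` is a hypothetical stand-in for `PredictionServiceClient` that only counts how many instances are still open, and `leakyCall` / `closingCall` are illustrative names for the two patterns described above.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for PredictionServiceClient: it only tracks open instances.
final class FakeClient implements AutoCloseable {
    static final AtomicInteger OPEN = new AtomicInteger();

    FakeClient() {
        OPEN.incrementAndGet(); // stands in for opening a gRPC channel
    }

    String predict(String text) {
        return "embedding-for-" + text;
    }

    @Override
    public void close() {
        OPEN.decrementAndGet(); // stands in for shutting the channel down
    }
}

final class ClientLifecycleDemo {

    // Leaky pattern: a new client per call that is never closed.
    static String leakyCall(String text) {
        FakeClient client = new FakeClient();
        return client.predict(text);
    }

    // Closing pattern: try-with-resources calls close() when the block exits.
    static String closingCall(String text) {
        try (FakeClient client = new FakeClient()) {
            return client.predict(text);
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            leakyCall("a");
        }
        System.out.println("open after leaky calls: " + FakeClient.OPEN.get());

        for (int i = 0; i < 3; i++) {
            closingCall("b");
        }
        System.out.println("open after closing calls: " + FakeClient.OPEN.get());
    }
}
```

In this sketch the open count grows with every `leakyCall` and is unchanged by `closingCall`, which mirrors why only the unclosed variant ends up with channels for the garbage collector to find.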

### This causes:

  • Logs are flooded with warnings about unclosed channels.

  • Possible resource leaks, as each PredictionServiceClient holds onto gRPC channels.

  • Performance overhead and potential memory usage issues from many channels staying alive longer than needed.

### Steps to Reproduce
1. Configure VertexAiTextEmbeddingModel as a Spring bean (for example, in a @Configuration class).
2. Inject it and call textEmbeddingModel.embed(...) or textEmbeddingModel.call(...) repeatedly across multiple requests.
3. Monitor the application logs. Over time, you will see repeated warnings about an orphaned ManagedChannel, such as “Previous channel ... was garbage collected without being shut down!”

Technologies:

Spring AI version: 1.0.0-M5
Google Cloud AI libraries version: google-cloud-aiplatform: 3.40.0, gax-grpc: 2.46.1
Java version: 21
Running on local environment

@RyanHowell30
Author

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

import org.springframework.ai.embedding.EmbeddingRequest;
import org.springframework.ai.embedding.EmbeddingResponse;
import org.springframework.ai.model.ModelOptionsUtils;
import org.springframework.ai.vertexai.embedding.VertexAiEmbeddingConnectionDetails;
import org.springframework.ai.vertexai.embedding.VertexAiEmbeddingUtils;
import org.springframework.ai.vertexai.embedding.text.VertexAiTextEmbeddingModel;
import org.springframework.ai.vertexai.embedding.text.VertexAiTextEmbeddingOptions;

import com.google.cloud.aiplatform.v1.EndpointName;
import com.google.cloud.aiplatform.v1.PredictRequest;
import com.google.cloud.aiplatform.v1.PredictResponse;
import com.google.cloud.aiplatform.v1.PredictionServiceClient;
import lombok.extern.slf4j.Slf4j;

@Slf4j
public class ClientClosingTextEmbeddingModel extends VertexAiTextEmbeddingModel {

    // Hold our own local copy of connectionDetails
    private final VertexAiEmbeddingConnectionDetails localConnectionDetails;

    public ClientClosingTextEmbeddingModel(
            VertexAiEmbeddingConnectionDetails connectionDetails,
            VertexAiTextEmbeddingOptions defaultOptions
    ) {
        // Call the parent constructor
        super(connectionDetails, defaultOptions);

        // Store a reference locally
        this.localConnectionDetails = connectionDetails;
    }

    @Override
    public EmbeddingResponse call(EmbeddingRequest request) {
        // 1) Merge the options (like the parent does)
        VertexAiTextEmbeddingOptions finalOptions = mergedOptions(request);

        // 2) Actually open and close the PredictionServiceClient via try-with-resources
        try (PredictionServiceClient client = createClientSafely()) {
            // 3) Build the PredictRequest
            EndpointName endpointName = localConnectionDetails.getEndpointName(finalOptions.getModel());
            PredictRequest.Builder predictRequestBuilder =
                    getPredictRequestBuilder(request, endpointName, finalOptions);

            // 4) Call predict()
            PredictResponse rawResponse = client.predict(predictRequestBuilder.build());

            // 5) Convert the rawResponse to EmbeddingResponse
            return buildEmbeddingResponse(rawResponse, finalOptions);
        } catch (IOException e) {
            // If createClientSafely() fails
            throw new RuntimeException("Failed to create or close the PredictionServiceClient", e);
        }
    }

    /**
     * Create the client from connection details in a safe way for try-with-resources.
     */
    private PredictionServiceClient createClientSafely() throws IOException {
        // Use localConnectionDetails, not super (which doesn't provide a getter).
        return PredictionServiceClient.create(localConnectionDetails.getPredictionServiceSettings());
    }

    /**
     * This is basically the parent's "mergedOptions(request)" logic.
     */
    private VertexAiTextEmbeddingOptions mergedOptions(EmbeddingRequest request) {
        VertexAiTextEmbeddingOptions defaultOptions = getDefaultOptions();
        VertexAiTextEmbeddingOptions defaultOptionsCopy = VertexAiTextEmbeddingOptions.builder()
                .from(defaultOptions)
                .build();

        // The parent's code uses ModelOptionsUtils.merge(...) if request.getOptions() is non-null
        return ModelOptionsUtils.merge(request.getOptions(), defaultOptionsCopy, VertexAiTextEmbeddingOptions.class);
    }

    /**
     * Replicates the parent's logic for building the PredictRequest.
     */
    public PredictRequest.Builder getPredictRequestBuilder(
            EmbeddingRequest request,
            EndpointName endpointName,
            VertexAiTextEmbeddingOptions finalOptions
    ) {
        PredictRequest.Builder predictRequestBuilder =
                PredictRequest.newBuilder().setEndpoint(endpointName.toString());

        // The parent uses VertexAiEmbeddingUtils to build parameters, e.g. dimensions, autoTruncate, etc.
        VertexAiEmbeddingUtils.TextParametersBuilder parametersBuilder =
                VertexAiEmbeddingUtils.TextParametersBuilder.of();

        if (finalOptions.getAutoTruncate() != null) {
            parametersBuilder.autoTruncate(finalOptions.getAutoTruncate());
        }
        if (finalOptions.getDimensions() != null) {
            parametersBuilder.outputDimensionality(finalOptions.getDimensions());
        }
        predictRequestBuilder.setParameters(VertexAiEmbeddingUtils.valueOf(parametersBuilder.build()));

        // For each input text
        for (int i = 0; i < request.getInstructions().size(); i++) {
            String text = request.getInstructions().get(i);
            VertexAiEmbeddingUtils.TextInstanceBuilder instanceBuilder =
                    VertexAiEmbeddingUtils.TextInstanceBuilder.of(text)
                            .taskType(finalOptions.getTaskType().name());

            if (finalOptions.getTitle() != null && !finalOptions.getTitle().isBlank()) {
                instanceBuilder.title(finalOptions.getTitle());
            }
            predictRequestBuilder.addInstances(VertexAiEmbeddingUtils.valueOf(instanceBuilder.build()));
        }

        return predictRequestBuilder;
    }

    /**
     * Convert PredictResponse => EmbeddingResponse (replicating parent's logic).
     */
    private EmbeddingResponse buildEmbeddingResponse(PredictResponse rawResponse,
            VertexAiTextEmbeddingOptions finalOptions) {
        List<org.springframework.ai.embedding.Embedding> embeddingsList = new ArrayList<>();
        int index = 0;
        int totalTokenCount = 0;

        for (com.google.protobuf.Value predictionValue : rawResponse.getPredictionsList()) {
            com.google.protobuf.Value embeddingsStruct =
                    predictionValue.getStructValue().getFieldsOrThrow("embeddings");
            com.google.protobuf.Value statistics =
                    embeddingsStruct.getStructValue().getFieldsOrThrow("statistics");
            com.google.protobuf.Value tokenCountVal =
                    statistics.getStructValue().getFieldsOrThrow("token_count");
            totalTokenCount += (int) tokenCountVal.getNumberValue();

            com.google.protobuf.Value valuesVal =
                    embeddingsStruct.getStructValue().getFieldsOrThrow("values");
            float[] vector = VertexAiEmbeddingUtils.toVector(valuesVal);
            embeddingsList.add(new org.springframework.ai.embedding.Embedding(vector, index++));
        }

        org.springframework.ai.embedding.EmbeddingResponseMetadata metadata =
                new org.springframework.ai.embedding.EmbeddingResponseMetadata();
        metadata.setModel(Objects.requireNonNull(finalOptions.getModel()));

        return new org.springframework.ai.embedding.EmbeddingResponse(embeddingsList, metadata);
    }

    private VertexAiTextEmbeddingOptions getDefaultOptions() {
        return super.defaultOptions;
    }

}
```

After switching to this class, the warnings disappeared.

@markpollack
Member

Wow, thanks for the deep investigation. Getting back to triage after a break. Would you be able to create a PR to address this, please?

@RyanHowell30
Author

@markpollack Yeah, sure, no problem.

@RyanHowell30
Author

RyanHowell30 commented Jan 22, 2025

@markpollack I am getting permission denied when trying to push my changes. Do you have any idea of how to get around this?

Change:

Modified call(...) to use try-with-resources:
try (PredictionServiceClient client = createPredictionServiceClient()) { ... }

When the block finishes, the gRPC channel is cleanly shut down.
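The same guarantee holds when the call inside the block fails. As a rough sketch (again with a hypothetical stand-in, `RecordingClient`, rather than the real `PredictionServiceClient`), the following records the order of events for a successful call and for one that throws:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for PredictionServiceClient that records what happens to it.
final class RecordingClient implements AutoCloseable {
    private final List<String> events;

    RecordingClient(List<String> events) {
        this.events = events;
        events.add("open");
    }

    String predict(String text) {
        if (text.isBlank()) {
            // Simulates a failing predict() call
            throw new IllegalArgumentException("empty input");
        }
        events.add("predict");
        return "embedding-for-" + text;
    }

    @Override
    public void close() {
        events.add("close");
    }
}

final class TryWithResourcesDemo {

    // Returns the ordered list of events for a single call.
    static List<String> run(String text) {
        List<String> events = new ArrayList<>();
        try (RecordingClient client = new RecordingClient(events)) {
            client.predict(text);
        } catch (IllegalArgumentException e) {
            // The resource has already been closed by the time this runs.
            events.add("caught");
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run("hello"));
        System.out.println(run(" "));
    }
}
```

In the failing case `close` is recorded before `caught`, because try-with-resources closes its resources before any catch block runs. That is the property the fix relies on: the client is released whether or not `predict()` succeeds.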

@charlie-ang-collibra

Thanks for your work here, @RyanHowell30! I ran into this issue today, and am glad you had already identified it. I hope you don't mind if I lift your implementation in the meantime. With the private/package-private methods and the lambda call, there's just no easy way to make only that change.

🙏

@RyanHowell30
Author

@charlie-ang-collibra Go for it, man. Let me know if you need any more assistance. Just put the PredictionServiceClient in a try-with-resources block. Then, everywhere you use VertexAiTextEmbeddingModel, replace it with your custom class.
