Adds initial controller implementation #71
Not sure if we want to call it `selected-backend`; the backend may not necessarily be selected by the gateway. The user can still specify the backend in the request.
Well, the backend will be "calculated" and "selected" following the matching rules of LLMRoute, so the word "selected" makes sense regardless of who actually does the selecting.
This is for internal use only and is not user-facing, so it should be fine.
This is not true; you might not be able to select the backend based on the model alone. For example, there are cases where the same Anthropic models are supported by both Google and AWS. In that case the user needs to set the backend header to determine where to route to.
See https://aws.amazon.com/bedrock/claude/ and https://cloud.google.com/solutions/anthropic?hl=en.
Wait a second, this has nothing to do with the model...
Say you have
and clients send
curl -H "some-random-header-that-can-be-sent-directly-by-clients: foo"
then internally extproc sets the header `$selectedBackendHeaderKey: some-backend-name`. That's what this does, and it is purely an implementation detail. This package is not a CRD but the configuration of extproc itself.
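To make that flow concrete, here is a minimal Go sketch. It is not the actual extproc code: the `rule` type, the `setSelectedBackend` function, and the header name `x-selected-backend` are all hypothetical, chosen only to illustrate how a client-sent header could be matched against rules and translated into an internal header.

```go
package main

import "fmt"

// selectedBackendHeaderKey is a hypothetical internal header name; the real
// key comes from the extproc configuration and is not shown in this thread.
const selectedBackendHeaderKey = "x-selected-backend"

// rule is a hypothetical matching rule: a request carrying header Name with
// value Value is routed to Backend.
type rule struct {
	Name    string
	Value   string
	Backend string
}

// setSelectedBackend copies the request headers and, for the first rule that
// matches, adds the internal header naming the chosen backend. The boolean
// reports whether any rule matched.
func setSelectedBackend(headers map[string]string, rules []rule) (map[string]string, bool) {
	out := make(map[string]string, len(headers)+1)
	for k, v := range headers {
		out[k] = v
	}
	for _, r := range rules {
		if v, ok := headers[r.Name]; ok && v == r.Value {
			out[selectedBackendHeaderKey] = r.Backend
			return out, true
		}
	}
	return out, false
}

func main() {
	rules := []rule{{
		Name:    "some-random-header-that-can-be-sent-directly-by-clients",
		Value:   "foo",
		Backend: "some-backend-name",
	}}
	req := map[string]string{
		"some-random-header-that-can-be-sent-directly-by-clients": "foo",
	}
	headers, matched := setSelectedBackend(req, rules)
	fmt.Println(matched, headers[selectedBackendHeaderKey])
}
```

The client never needs to know the internal header name; it only sends whatever header the rules match on.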
So you are saying we do not provide a standard AI gateway backend routing header to users, and they MUST define matching rules on LLMRoute for each backend they create. Is that correct?
Yes, exactly, as of now. We could provide some "canonical-backend-choosing-header-key" defined in the api/v1alpha package and prioritize that header's value when present, which would effectively ignore LLMRoute.Rules if the header exists. That is a separate topic we should discuss in another issue/PR if you want to support it. This configuration key is purely an implementation detail regardless of whether or not we do that.
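As a rough illustration of the "prioritize the header value when present" idea, a sketch in Go might look like the following. Nothing here exists in the codebase: the header name `x-canonical-backend`, the `rule` type, and `chooseBackend` are assumptions made up for this example.

```go
package main

import "fmt"

// canonicalBackendHeaderKey is a hypothetical name for the proposed
// canonical backend-choosing header; no such key is defined today.
const canonicalBackendHeaderKey = "x-canonical-backend"

// rule is a hypothetical stand-in for one LLMRoute matching rule.
type rule struct {
	Name    string
	Value   string
	Backend string
}

// chooseBackend returns the backend named by the canonical header when it is
// present and non-empty, ignoring the rules entirely. Otherwise it falls back
// to the first matching rule. The boolean reports whether a backend was found.
func chooseBackend(headers map[string]string, rules []rule) (string, bool) {
	if b, ok := headers[canonicalBackendHeaderKey]; ok && b != "" {
		return b, true
	}
	for _, r := range rules {
		if v, ok := headers[r.Name]; ok && v == r.Value {
			return r.Backend, true
		}
	}
	return "", false
}

func main() {
	rules := []rule{{Name: "x-team", Value: "search", Backend: "backend-from-rule"}}

	// The canonical header wins even though a rule also matches.
	b, _ := chooseBackend(map[string]string{
		canonicalBackendHeaderKey: "backend-from-header",
		"x-team":                  "search",
	}, rules)
	fmt.Println(b)

	// Without the canonical header, the rules decide.
	b, _ = chooseBackend(map[string]string{"x-team": "search"}, rules)
	fmt.Println(b)
}
```

Whether the header should override the rules or only act as a fallback is exactly the design question to settle in the follow-up issue.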
I will raise an issue tomorrow!
That is exactly what I was thinking: create those rules for the user, and if this routing header exists, ignore the rules on the LLMRoute. I think this would provide a better user experience.
Since the user can configure the model routing header name, we can say "the configured modelNameHeaderKey".
Well, no, users do not choose the model routing header name at this point.
ai-gateway/api/v1alpha1/api.go
Line 206 in cb6b2e0
It would be good to make this part of the LLMRoute resource, but the comment here matches the current behavior.