Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

api: do not embed HTTPRoute as-is #47

Merged
merged 5 commits into from
Dec 21, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
72 changes: 63 additions & 9 deletions api/v1alpha1/api.go
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,7 @@ import (
// This serves as a way to define a "unified" LLM API for a Gateway which allows downstream
// clients to use a single schema API to interact with multiple LLM backends.
//
// The InputSchema is used to determine the structure of the requests that the Gateway will
// The inputSchema field is used to determine the structure of the requests that the Gateway will
// receive. And then the Gateway will route the traffic to the appropriate LLMBackend based
// on the output schema of the LLMBackend while doing the other necessary jobs like
// upstream authentication, rate limit, etc.
Expand Down Expand Up @@ -47,18 +47,72 @@ type LLMRouteSpec struct {
// +kubebuilder:validation:Required
// +kubebuilder:validation:XValidation:rule="self.schema == 'OpenAI'"
APISchema LLMAPISchema `json:"inputSchema"`
// HTTPRoute is the base HTTPRouteSpec (https://gateway-api.sigs.k8s.io/api-types/httproute/) in
// the Gateway API on which this LLMRoute will be implemented. AI Gateway controller will generate a HTTPRoute based
// on the configuration given here with the additional modifications to achieve the necessary jobs,
// notably inserting the AI Gateway external processor filter.
// Rules is the list of LLMRouteRule that this LLMRoute will match the traffic to.
// Each rule is a subset of the HTTPRoute in the Gateway API (https://gateway-api.sigs.k8s.io/api-types/httproute/).
//
// In the matching rules in the HTTPRoute here, `x-envoy-ai-gateway-llm-model` header
// can be used to describe the routing behavior.
// AI Gateway controller will generate a HTTPRoute based on the configuration given here with the additional
// modifications to achieve the necessary jobs, notably inserting the AI Gateway external processor filter.
//
// Currently, only the exact header matching is supported, otherwise the configuration will be rejected.
// In the matching conditions in the LLMRouteRule, `x-envoy-ai-gateway-llm-model` header is available
// if we want to describe the routing behavior based on the model name. The model name is extracted
// from the request content before the routing decision.
//
// How multiple rules are matched is the same as the Gateway API. See for the details:
// https://gateway-api.sigs.k8s.io/reference/spec/#gateway.networking.k8s.io%2fv1.HTTPRoute
//
// +kubebuilder:validation:Required
// +kubebuilder:validation:MaxItems=128
Rules []LLMRouteRule `json:"rules"`
}

// LLMRouteRule is a rule that defines the routing behavior of the LLMRoute.
type LLMRouteRule struct {
// BackendRefs is the list of LLMBackend that this rule will route the traffic to.
// Each backend can have a weight that determines the traffic distribution.
//
// The namespace of each backend is "local", i.e. the same namespace as the LLMRoute.
//
// +optional
// +kubebuilder:validation:MaxItems=128
BackendRefs []LLMRouteRuleBackendRef `json:"backendRefs,omitempty"`

// Matches is the list of LLMRouteMatch that this rule will match the traffic to.
// This is a subset of the HTTPRouteMatch in the Gateway API. See for the details:
// https://gateway-api.sigs.k8s.io/reference/spec/#gateway.networking.k8s.io%2fv1.HTTPRouteMatch
//
// +optional
// +kubebuilder:validation:MaxItems=128
Matches []LLMRouteRuleMatch `json:"matches,omitempty"`
}

// LLMRouteRuleBackendRef is a reference to a LLMBackend with a weight.
type LLMRouteRuleBackendRef struct {
// Name is the name of the LLMBackend.
//
// +kubebuilder:validation:Required
// +kubebuilder:validation:MinLength=1
Name string `json:"name"`

// Weight is the weight of the LLMBackend. This is exactly the same as the weight in
// the BackendRef in the Gateway API. See for the details:
// https://gateway-api.sigs.k8s.io/reference/spec/#gateway.networking.k8s.io%2fv1.BackendRef
//
// +kubebuilder:validation:Required
HTTPRoute gwapiv1.HTTPRouteSpec `json:"httpRoute"`
// +kubebuilder:validation:Minimum=0
Weight int `json:"weight"`
}

type LLMRouteRuleMatch struct {
// Headers specifies HTTP request header matchers. See HeaderMatch in the Gateway API for the details:
// https://gateway-api.sigs.k8s.io/reference/spec/#gateway.networking.k8s.io%2fv1.HTTPHeaderMatch
//
// Currently, only the exact header matching is supported.
//
// +listType=map
// +listMapKey=name
// +optional
// +kubebuilder:validation:MaxItems=16
Headers []gwapiv1.HTTPHeaderMatch `json:"headers,omitempty"`
}

// +kubebuilder:object:root=true
Expand Down
73 changes: 72 additions & 1 deletion api/v1alpha1/zz_generated.deepcopy.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading
Loading