Adapt guardrails-usvc
Signed-off-by: Lianhao Lu <lianhao.lu@intel.com>
lianhao committed Jan 17, 2025
1 parent ab51131 commit d370598
Showing 4 changed files with 94 additions and 41 deletions.
53 changes: 34 additions & 19 deletions helm-charts/common/guardrails-usvc/README.md
@@ -1,40 +1,54 @@
# guardrails-usvc

-Helm chart for deploying LLM microservice.
+Helm chart for deploying Guardrails microservice.

-guardrails-usvc depends on TGI, you should set TGI_LLM_ENDPOINT as tgi endpoint.
+## Installing the chart

-## (Option1): Installing the chart separately
+`guardrails-usvc` depends on the following inference backend services:

-First, you need to install the tgi chart, please refer to the [tgi](../tgi) chart for more information. Please use model `meta-llama/Meta-Llama-Guard-2-8B` during installation.
+- TGI: please refer to [tgi](../tgi) chart for more information

-After you've deployted the tgi chart successfully, please run `kubectl get svc` to get the tgi service endpoint, i.e. `http://tgi`.
+### Use Meta Llama Guard models (default)

-To install the chart, run the following:
+First, you need to install the `tgi` helm chart using the model `meta-llama/Meta-Llama-Guard-2-8B`.

+After you've deployed the dependent chart successfully, please run `kubectl get svc` to get the backend inference service endpoint, e.g. `http://tgi`.

+To install the `guardrails-usvc` chart, run the following:

```console
cd GenAIInfra/helm-charts/common/guardrails-usvc
+helm dependency update
export HFTOKEN="insert-your-huggingface-token-here"
export SAFETY_GUARD_ENDPOINT="http://tgi"
export SAFETY_GUARD_MODEL_ID="meta-llama/Meta-Llama-Guard-2-8B"
-helm dependency update
-helm install guardrails-usvc . --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} --set SAFETY_GUARD_ENDPOINT=${SAFETY_GUARD_ENDPOINT} --set SAFETY_GUARD_MODEL_ID=${SAFETY_GUARD_MODEL_ID} --wait
+export GUARDRAILS_BACKEND="LLAMA"
+helm install guardrails-usvc . --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} --set SAFETY_GUARD_ENDPOINT=${SAFETY_GUARD_ENDPOINT} --set SAFETY_GUARD_MODEL_ID=${SAFETY_GUARD_MODEL_ID} --set GUARDRAILS_BACKEND=${GUARDRAILS_BACKEND} --wait
```

-## (Option2): Installing the chart with dependencies automatically
+### Use Allen Institute AI's WildGuard models

+First, you need to install the `tgi` helm chart using the model `allenai/wildguard`.

+After you've deployed the dependent chart successfully, please run `kubectl get svc` to get the backend inference service endpoint, e.g. `http://tgi`.

+To install the `guardrails-usvc` chart, run the following:

```console
cd GenAIInfra/helm-charts/common/guardrails-usvc
-export HFTOKEN="insert-your-huggingface-token-here"
helm dependency update
-helm install guardrails-usvc . --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} --set tgi-guardrails.enabled=true --wait
+export HFTOKEN="insert-your-huggingface-token-here"
+export SAFETY_GUARD_ENDPOINT="http://tgi"
+export SAFETY_GUARD_MODEL_ID="allenai/wildguard"
+export GUARDRAILS_BACKEND="WILD"
+helm install guardrails-usvc . --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} --set SAFETY_GUARD_ENDPOINT=${SAFETY_GUARD_ENDPOINT} --set SAFETY_GUARD_MODEL_ID=${SAFETY_GUARD_MODEL_ID} --set GUARDRAILS_BACKEND=${GUARDRAILS_BACKEND} --wait
```

## Verify

To verify the installation, run the command `kubectl get pod` to make sure all pods are running.

-Then run the command `kubectl port-forward svc/guardrails-usvc 9090:9090` to expose the llm-uservice service for access.
+Then run the command `kubectl port-forward svc/guardrails-usvc 9090:9090` to expose the guardrails-usvc service for access.

Open another terminal and run the following command to verify the service is working:

@@ -47,10 +61,11 @@ curl http://localhost:9090/v1/guardrails \

## Values

-| Key                             | Type   | Default                              | Description                                      |
-| ------------------------------- | ------ | ------------------------------------ | ------------------------------------------------ |
-| global.HUGGINGFACEHUB_API_TOKEN | string | `""`                                 | Your own Hugging Face API token                  |
-| image.repository                | string | `"opea/guardrails-usvc"`             |                                                  |
-| service.port                    | string | `"9090"`                             |                                                  |
-| SAFETY_GUARD_ENDPOINT           | string | `""`                                 | LLM endpoint                                     |
-| SAFETY_GUARD_MODEL_ID           | string | `"meta-llama/Meta-Llama-Guard-2-8B"` | Model ID for the underlying LLM service is using |
+| Key                             | Type   | Default                              | Description                                           |
+| ------------------------------- | ------ | ------------------------------------ | ----------------------------------------------------- |
+| global.HUGGINGFACEHUB_API_TOKEN | string | `""`                                 | Your own Hugging Face API token                       |
+| image.repository                | string | `"opea/guardrails-usvc"`             |                                                       |
+| service.port                    | string | `"9090"`                             |                                                       |
+| SAFETY_GUARD_ENDPOINT           | string | `""`                                 | LLM endpoint                                          |
+| SAFETY_GUARD_MODEL_ID           | string | `"meta-llama/Meta-Llama-Guard-2-8B"` | Model ID used by the underlying LLM service           |
+| GUARDRAIL_BACKEND               | string | `"LLAMA"`                            | Guardrail model family to use, one of `LLAMA`, `WILD` |
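For reference, once the pod is running and the port-forward from the Verify section is active, the service can be exercised with a request along the lines below. This is only a sketch: the full curl command is collapsed in the diff above, and the JSON payload fields (`text`, `parameters`) are assumed from the OPEA guardrails microservice convention rather than taken from this commit.

```console
kubectl port-forward svc/guardrails-usvc 9090:9090 &
# Hypothetical request body; check the rendered README for the authoritative payload.
curl http://localhost:9090/v1/guardrails \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{"text":"How do you buy a tiger in the US?","parameters":{"max_new_tokens":32}}'
```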
7 changes: 7 additions & 0 deletions helm-charts/common/guardrails-usvc/templates/configmap.yaml
@@ -14,6 +14,13 @@ data:
SAFETY_GUARD_ENDPOINT: "http://{{ .Release.Name }}-tgi-guardrails"
{{- end }}
SAFETY_GUARD_MODEL_ID: {{ .Values.SAFETY_GUARD_MODEL_ID | quote }}
+{{- if eq "LLAMA" .Values.GUARDRAIL_BACKEND }}
+GUARDRAILS_COMPONENT_NAME: "OPEA_LLAMA_GUARD"
+{{- else if eq "WILD" .Values.GUARDRAIL_BACKEND }}
+GUARDRAILS_COMPONENT_NAME: "OPEA_WILD_GUARD"
+{{- else }}
+{{- cat "Invalid GUARDRAIL_BACKEND:" .Values.GUARDRAIL_BACKEND | fail }}
+{{- end }}
HUGGINGFACEHUB_API_TOKEN: {{ .Values.global.HUGGINGFACEHUB_API_TOKEN | quote}}
HF_HOME: "/tmp/.cache/huggingface"
LOGFLAG: {{ .Values.LOGFLAG | quote }}
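A quick way to confirm what the new conditional renders is to template the chart locally and grep the generated ConfigMap. The commands below are a sketch (run from the chart directory after `helm dependency update`) and use the `GUARDRAIL_BACKEND` value key referenced by this template and values.yaml.

```console
# Default backend should render GUARDRAILS_COMPONENT_NAME: "OPEA_LLAMA_GUARD"
helm template guardrails-usvc . | grep GUARDRAILS_COMPONENT_NAME
# Switching the family should render "OPEA_WILD_GUARD"
helm template guardrails-usvc . --set GUARDRAIL_BACKEND=WILD | grep GUARDRAILS_COMPONENT_NAME
# Any other value should abort rendering with "Invalid GUARDRAIL_BACKEND: ..."
helm template guardrails-usvc . --set GUARDRAIL_BACKEND=FOO
```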
32 changes: 31 additions & 1 deletion helm-charts/common/guardrails-usvc/templates/deployment.yaml
@@ -28,8 +28,38 @@ spec:
serviceAccountName: {{ include "guardrails-usvc.serviceAccountName" . }}
securityContext:
{{- toYaml .Values.podSecurityContext | nindent 8 }}
+initContainers:
+- name: wait-for-llm
+envFrom:
+- configMapRef:
+name: {{ include "guardrails-usvc.fullname" . }}-config
+{{- if .Values.global.extraEnvConfig }}
+- configMapRef:
+name: {{ .Values.global.extraEnvConfig }}
+optional: true
+{{- end }}
+securityContext:
+{{- toYaml .Values.securityContext | nindent 12 }}
+image: busybox:1.36
+command: ["sh", "-c"]
+args:
+- |
+proto=$(echo ${SAFETY_GUARD_ENDPOINT} | sed -n 's/.*\(http[s]\?\):\/\/\([^ :]\+\):\?\([0-9]*\).*/\1/p');
+host=$(echo ${SAFETY_GUARD_ENDPOINT} | sed -n 's/.*\(http[s]\?\):\/\/\([^ :]\+\):\?\([0-9]*\).*/\2/p');
+port=$(echo ${SAFETY_GUARD_ENDPOINT} | sed -n 's/.*\(http[s]\?\):\/\/\([^ :]\+\):\?\([0-9]*\).*/\3/p');
+if [ -z "$port" ]; then
+port=80;
+[[ "$proto" = "https" ]] && port=443;
+fi;
+retry_count={{ .Values.retryCount | default 60 }};
+j=1;
+while ! nc -z ${host} ${port}; do
+[[ $j -ge ${retry_count} ]] && echo "ERROR: ${host}:${port} is NOT reachable in $j seconds!" && exit 1;
+j=$((j+1)); sleep 1;
+done;
+echo "${host}:${port} is reachable within $j seconds.";
containers:
-- name: {{ .Release.Name }}
+- name: {{ .Chart.Name }}
envFrom:
- configMapRef:
name: {{ include "guardrails-usvc.fullname" . }}-config
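The `wait-for-llm` init container added above derives protocol, host, and port from `SAFETY_GUARD_ENDPOINT` with three `sed` captures, falls back to port 80 (or 443 for https) when the URL carries no explicit port, and then polls the backend with `nc -z` for up to `retryCount` seconds (default 60). The snippet below replays that parsing locally as a sanity check; it is a sketch that reuses the same regular expression but with POSIX `[` tests instead of the `[[` used in the template.

```sh
SAFETY_GUARD_ENDPOINT="http://tgi"   # sample endpoint without an explicit port
re='.*\(http[s]\?\):\/\/\([^ :]\+\):\?\([0-9]*\).*'
proto=$(echo ${SAFETY_GUARD_ENDPOINT} | sed -n "s/$re/\1/p")
host=$(echo ${SAFETY_GUARD_ENDPOINT} | sed -n "s/$re/\2/p")
port=$(echo ${SAFETY_GUARD_ENDPOINT} | sed -n "s/$re/\3/p")
if [ -z "$port" ]; then
  port=80
  [ "$proto" = "https" ] && port=443
fi
echo "$proto $host $port"   # prints: http tgi 80
```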
43 changes: 22 additions & 21 deletions helm-charts/common/guardrails-usvc/values.yaml
@@ -5,22 +5,29 @@
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

-tgi-guardrails:
-enabled: false
-LLM_MODEL_ID: "meta-llama/Meta-Llama-Guard-2-8B"
+# Configurations for OPEA microservice guardrails-usvc
+# Set it as a non-null string, such as true, if you want to enable logging.
+LOGFLAG: ""

-replicaCount: 1
+# settings for guardrails service
+# guardrail model family to use:
+# default is Meta's Llama Guard
+GUARDRAIL_BACKEND: "LLAMA"
+# Guard Model ID, should be same as the TGI's LLM_MODEL_ID
+SAFETY_GUARD_MODEL_ID: "meta-llama/Meta-Llama-Guard-2-8B"

+# Uncomment and set the following settings to use Allen Institute AI's WildGuard
+# GUARDRAIL_BACKEND: "WILD"
+# Guard Model ID, should be same as the TGI's LLM_MODEL_ID
+# SAFETY_GUARD_MODEL_ID: "allenai/wildguard"

# TGI service endpoint
SAFETY_GUARD_ENDPOINT: ""
-# Guard Model Id
-SAFETY_GUARD_MODEL_ID: "meta-llama/Meta-Llama-Guard-2-8B"
-# Set it as a non-null string, such as true, if you want to enable logging facility,
-# otherwise, keep it as "" to disable it.
-LOGFLAG: ""

+replicaCount: 1

image:
-repository: opea/guardrails-tgi
+repository: opea/guardrails
# Uncomment the following line to set desired image pull policy if needed, as one of Always, IfNotPresent, Never.
# pullPolicy: ""
# Overrides the image tag whose default is the chart appVersion.
@@ -62,24 +69,13 @@ service:
port: 9090

resources:
-# We usually recommend not to specify default resources and to leave this as a conscious
-# choice for the user. This also increases chances charts run on environments with little
-# resources, such as Minikube. If you do want to specify resources, uncomment the following
-# lines, adjust them as necessary, and remove the curly braces after 'resources:'.
-# limits:
-# cpu: 100m
-# memory: 128Mi
requests:
cpu: 100m
memory: 128Mi

livenessProbe:
httpGet:
path: v1/health_check
port: guardrails-usvc
initialDelaySeconds: 5
periodSeconds: 5
failureThreshold: 24
readinessProbe:
httpGet:
path: v1/health_check
@@ -109,3 +105,8 @@ global:
# If set, it will overwrite serviceAccount.name.
# If set, and serviceAccount.create is false, it will assume this service account is already created by others.
sharedSAName: ""

+# for CI tests only
+tgi-guardrails:
+enabled: false
+LLM_MODEL_ID: "meta-llama/Meta-Llama-Guard-2-8B"
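With the bundled `tgi-guardrails` subchart now marked as CI-only, the previous all-in-one install path still works for local smoke tests. The sketch below mirrors the old Option 2 command from the README and relies on the subchart defaults shown above; the configmap then points `SAFETY_GUARD_ENDPOINT` at the in-release `tgi-guardrails` service.

```console
cd GenAIInfra/helm-charts/common/guardrails-usvc
helm dependency update
export HFTOKEN="insert-your-huggingface-token-here"
helm install guardrails-usvc . --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} --set tgi-guardrails.enabled=true --wait
```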
