📝 minor updates in readme files
Signed-off-by: Krishna Murti <krishna.murti@intel.com>
krish918 committed Oct 10, 2024
1 parent 9b12618 commit e890448
Showing 2 changed files with 8 additions and 8 deletions.
helm-charts/chatqna/README.md (2 additions, 2 deletions)

````diff
@@ -22,7 +22,7 @@ For LLM inference, two more microservices will be required. We can either use [T
 - [llm-ctrl-uservice](../common/llm-ctrl-uservice/README.md)
 - [vllm](../common/vllm/README.md)

-> **Note:** We shouldn't have both inference engine in our setup. We have to setup either of them. For this, conditional flags are added in the chart dependency. We will be switching off flag corresponding to one service and switching on the other, in order to have a proper setup of all ChatQnA dependencies.
+> **_NOTE :_** We shouldn't have both inference engine deployed. It is required to only setup either of them. To achieve this, conditional flags are added in the chart dependency. We will be switching off flag corresponding to one service and switching on the other, in order to have a proper setup of all ChatQnA dependencies.
 ## Installing the Chart

@@ -76,7 +76,7 @@ helm install chatqna chatqna --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} --
 helm install chatqna chatqna --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} --set global.modelUseHostPath=${MODELDIR} -f chatqna/guardrails-gaudi-values.yaml
 ```

-> **_NOTE:_** Default installation will use [TGI (Text Generation Inference)](https://github.com/huggingface/text-generation-inference) as inference engine. To use vLLM as inference engine, please see below.
+> **_NOTE :_** Default installation will use [TGI (Text Generation Inference)](https://github.com/huggingface/text-generation-inference) as inference engine. To use vLLM as inference engine, please see below.
 ```bash
 # To use vLLM inference engine on XEON device
````
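The note changed in this file refers to conditional flags on the chart dependencies that switch between the TGI and vLLM inference engines. As a rough sketch only: the flag names `tgi.enabled` and `vllm.enabled` below are assumptions, not taken from this commit; the authoritative names are the `condition:` fields of the dependencies in the ChatQnA chart's `Chart.yaml` and values files.

```bash
# Hedged sketch of switching the ChatQnA chart from TGI to vLLM.
# The flag names tgi.enabled / vllm.enabled are assumptions; check the
# "condition:" entries in chatqna/Chart.yaml for the real ones.
export HFTOKEN="insert-your-huggingface-token-here"
export MODELDIR="/mnt/opea-models"   # example model cache path

helm install chatqna chatqna \
  --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} \
  --set global.modelUseHostPath=${MODELDIR} \
  --set tgi.enabled=false \
  --set vllm.enabled=true
```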
helm-charts/common/llm-ctrl-uservice/README.md (6 additions, 6 deletions)

````diff
@@ -1,17 +1,17 @@
 # llm-ctrl Microservice

-Helm chart for deploying a microservice which facilitates connections and handles responses from OpenVINO vLLM microservice.
+Helm chart for deploying LLM controller microservice which facilitates connections and handles responses from OpenVINO vLLM microservice.

-`llm-ctrl-uservice` depends on OpenVINO vLLM. You should properly set `vLLM_ENDPOINT` as the HOST URI of vLLM microservice. If not set, it will consider the default value : `http://<helm-release-name>-vllm-openvino:80`
+`llm-ctrl-uservice` depends on vLLM microservice. You should properly set `vLLM_ENDPOINT` as the HOST URI of vLLM microservice. If not set, it will consider the default value : `http://<helm-release-name>-vllm:80`

 As this service depends on vLLM microservice, we can proceed in either of 2 ways:

-- Install both microservices separately one after another.
-- Install the vLLM microservice as dependency for the our main `llm-ctrl-uservice` microservice.
+- Install both microservices individually.
+- Install the vLLM microservice as dependency for `llm-ctrl-uservice` microservice.

-## (Option 1): Installing the chart separately:
+## (Option 1): Installing the charts individually:

-First, you need to install the `vllm-openvino` chart, please refer to the [vllm](../vllm) chart for more information.
+First, you need to install the `vllm` chart, please refer to the [vllm](../vllm) chart for more information.

 After you've deployed the `vllm` chart successfully, please run `kubectl get svc` to get the vLLM service name with port. We need to provide this to `llm-ctrl-uservice` as a value for vLLM_ENDPOINT for letting it discover and connect to the vLLM microservice.

````
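The last visible paragraph of this README describes Option 1: install the `vllm` chart first, discover its service with `kubectl get svc`, then pass that address to `llm-ctrl-uservice` as `vLLM_ENDPOINT`. A hedged end-to-end sketch of that flow follows; chart paths, release names, and the exact `--set` keys are assumptions and should be verified against each chart's values.yaml.

```bash
# Hypothetical walk-through of Option 1; release names, chart paths and the
# --set keys below are assumptions, not taken from this commit.
export HFTOKEN="insert-your-huggingface-token-here"

# 1. Install the vLLM chart first.
helm install vllm ./vllm --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN}

# 2. Find the service name and port Kubernetes assigned to vLLM.
kubectl get svc

# 3. Point llm-ctrl-uservice at that service; replace the placeholder with the
#    NAME and PORT(S) values reported by kubectl get svc.
helm install llm-ctrl-uservice ./llm-ctrl-uservice \
  --set vLLM_ENDPOINT="http://<vllm-service-name>:<port>" \
  --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN}
```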
