[Feature] Helm Charts for Txt2Img and SearchQnA. #596
Is there any priority list for the GenAIExamples? If we don't have enough resources to add all examples, we need a priority list to decide which to do first.
Hi @yongfengdu, is your team working on this?
As far as I know, GenAIExamples is being refactored for 1.2, and the Helm charts will need to change accordingly. The latter removed more than 80% of the code from the previous PR.
I walked through GenAIExamples' current list; here is a summary of the gaps:
While EdgeCraftRag may not be important, I think Intel GPU support is. (Gaudi is too expensive for normal devs, so if Intel GPU support is not available, some other manufacturer's GPUs get used for acceleration, and that dev is lost for Intel.) FYI, there's an old open PR adding vLLM OpenVINO / GPU support for ChatQnA: #403
To be added to the list: a Helm chart with vLLM support for the DocSum service is required.
That was merged a few hours ago: #649. ChatQnA vLLM support was merged last month: #610. However, ChatQnA currently uses vLLM only for the LLM; embedding, reranking, and guardrails still use TEI / TGI. There was a PR to add vLLM embedding support, but a review comment suggested it might not be needed: opea-project/GenAIExamples#1237. And the Helm charts for all other applications are still lacking vLLM support.
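For illustration, that split (vLLM for the LLM, TEI for the rest) would typically be expressed as a values override at install time. The following is only a minimal sketch; the toggle names (`vllm.enabled`, `tgi.enabled`, `tei.enabled`, `teirerank.enabled`) are assumptions about the chart's subchart switches, not the chart's confirmed schema:

```yaml
# Hypothetical ChatQnA values override: vLLM serves the LLM, while
# embedding and reranking stay on TEI. All key names below are
# assumptions; check the chart's values.yaml for the real toggles.
vllm:
  enabled: true    # LLM served by vLLM
tgi:
  enabled: false   # TGI disabled for the LLM
tei:
  enabled: true    # embedding still on TEI
teirerank:
  enabled: true    # reranking still on TEI
```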
AgentQnA will add vLLM support (LLM only): #715
We added 2 new examples for 1.2: Txt2Img and SearchQnA. The other 3 will be candidates for the v1.3 release (may change according to GenAIExamples' plan):
It should be easy to add vLLM OpenVINO support to the current vLLM Helm chart with just a newly defined openvino-values.yaml; see the sketch below.
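As a rough sketch of what that values file could look like (the image name and environment variable below are assumptions; the chart's actual values schema may differ):

```yaml
# Hypothetical openvino-values.yaml for the vllm Helm chart.
# Both the OpenVINO-enabled image name and the env var are assumptions.
image:
  repository: opea/vllm-openvino   # assumed OpenVINO build of vLLM
  tag: latest
extraEnv:
  - name: VLLM_OPENVINO_DEVICE     # device selector for vLLM's OpenVINO backend (assumed)
    value: "GPU"
```

It could then be selected at install time with something like `helm install vllm ./vllm -f openvino-values.yaml` (chart path assumed).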
Please note that AMD will help update multiple GenAIExamples to use vLLM in v1.3. To avoid duplicate work, let us create issues for each and ask them to self-assign half.