
[Feature] Helm Charts for Txt2Img and SearchQnA. #596

Closed
joshuayao opened this issue Nov 21, 2024 · 11 comments

@joshuayao (Collaborator)

No description provided.

@joshuayao joshuayao added this to the v1.2 milestone Nov 21, 2024
@joshuayao joshuayao added this to OPEA Nov 21, 2024
@joshuayao joshuayao added the feature New feature or request label Nov 21, 2024
@yongfengdu (Collaborator)

Is there a priority list for the GenAIExamples? If we don't have enough resources to add all examples, we need a priority list to decide which to do first.
Besides adding more examples, I think the most important thing is to fix issues and enhance the high-priority Examples/Components to make them production-ready.

@joshuayao (Collaborator, Author)

Hi @yongfengdu, is your team working on this?

@yongfengdu (Collaborator)

As far as I know, GenAIExamples is being refactored for 1.2, and the helm charts will need to change accordingly.
It's not wise to add more examples if they will be merged soon.
I'd propose postponing the "more examples support" to the 1.3 release, since it would be a waste of engineering resources if we keep doing something like this:
#590 - Add more microservices for docsum
#659 - Reduce microservices in docsum

The latter one removed >80% of the code from the previous PR.

@joshuayao joshuayao moved this to In progress in OPEA Jan 8, 2025
@yongfengdu (Collaborator)

I walked through GenAIExamples' current list; here is a summary of the gaps:

  • Already supported, will follow up with updates:
    AgentQnA, AudioQnA, ChatQnA, CodeGen, CodeTrans, DocSum, FaqGen, VisualQnA

  • Not yet supported, but planned (5 examples):
    AvatarChatBot - AudioQnA + wav2lip + animation; 2 more microservices required.
    DBQnA - Uses an LLM to generate SQL. CPU only. More components required: postgres, text2sql-service.
    GraphRAG - A variant of ChatQnA; neo4j graph database required; Gaudi only.
    SearchQnA - Uses a web retriever. Web-retriever supported; CPU only.
    Text2Image - No compose file; components only.

  • Not supported, no plan to support:
    DocIndexRetriever - Not an E2E example; part of AgentQnA.
    EdgeCraftRag - Arc GPU only; all functions are implemented in edgecraftrag-server, not using GenAIComps. Less likely to run in a k8s environment; defer support until more discussion/requirements.
    InstructionTuning/RerankingTuning - Not an E2E example; no compose file.
    MultimodalQnA - To be merged with VisualQnA.
    ProductivitySuite - To be merged with ERAG.
    Translation - To be merged with CodeTrans.
    VideoQnA - To be merged with VisualQnA.

@eero-t (Contributor)

eero-t commented Jan 15, 2025

>   • Already supported, will follow up with updates:
>     AgentQnA, AudioQnA, ChatQnA, CodeGen, CodeTrans, DocSum, FaqGen, VisualQnA
>     ...
>   • Not supported, no plan to support:
>     DocIndexRetriever - Not an E2E example; part of AgentQnA.
>     EdgeCraftRag - Arc GPU only; all functions are implemented in edgecraftrag-server, not using GenAIComps. Less likely to run in a k8s environment; defer support until more discussion/requirements.

While EdgeCraftRag may not be important, I think Intel GPU support is important.

(Gaudi is too expensive for normal devs, so if Intel GPU support is not available, some other manufacturer's GPUs will be used for acceleration, and that dev is lost to Intel.)

FYI: there's an old open PR for adding vLLM OpenVINO / GPU support for ChatQnA: #403

@vrantala

To be added to the list: a Helm chart with vLLM support for the DocSum service is required.

@joshuayao joshuayao changed the title [Feature] Helm Charts for all remaining GenAIExamples [Feature] Helm Charts for remaining GenAIExamples Jan 16, 2025
@eero-t (Contributor)

eero-t commented Jan 16, 2025

> To be added to the list: a Helm chart with vLLM support for the DocSum service is required.

That was merged a few hours ago: #649

ChatQnA vLLM support was merged last month: #610

However, ChatQnA currently uses vLLM only for the LLM. Embedding, reranking, and guardrails still use TEI / TGI. There was a PR to add vLLM embedding support, but a review comment suggested it might not be needed: opea-project/GenAIExamples#1237

And the Helm charts for all other applications are lacking vLLM support.
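
For context, backend selection in charts like these is typically done with an alternate values file passed at install time. A minimal sketch of what such an override might look like, assuming subchart toggles of this shape; the key names and model ID here are illustrative assumptions, not values confirmed by this thread:

```yaml
# Hypothetical vllm-values.yaml fragment for a ChatQnA-style chart:
# serve the LLM with vLLM while keeping TEI for embedding and reranking.
# All key names below are assumptions for illustration.
vllm:
  enabled: true
  LLM_MODEL_ID: Intel/neural-chat-7b-v3-3   # assumed default model
tgi:
  enabled: false      # disable the default TGI LLM backend
tei:
  enabled: true       # embedding still served by TEI
teirerank:
  enabled: true       # reranking still served by TEI
```

Such a file would be applied with something like `helm install chatqna ./chatqna -f vllm-values.yaml` (chart path also assumed).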

@yongfengdu (Collaborator)

AgentQnA will add vLLM support (LLM only): #715

@yongfengdu (Collaborator)

We added 2 new examples for 1.2: Txt2Img and SearchQnA.

The other 3 are candidates for the v1.3 release (may change according to GenAIExamples' plan):
AvatarChatBot - AudioQnA + wav2lip + animation; 2 more microservices required.
DBQnA - Uses an LLM to generate SQL. CPU only. More components required: postgres, text2sql-service.
GraphRAG - A variant of ChatQnA; neo4j graph database required; Gaudi only.

@yongfengdu (Collaborator)

It should be easy to add vLLM OpenVINO support to the current vllm helm chart with just a newly defined openvino-values.yaml.
The trouble is making sure the parameters are set correctly, and testing (we have no GPU environment).

> While EdgeCraftRag may not be important, I think Intel GPU support is important.
>
> (Gaudi is too expensive for normal devs, so if Intel GPU support is not available, some other manufacturer's GPUs will be used for acceleration, and that dev is lost to Intel.)
>
> FYI: there's an old open PR for adding vLLM OpenVINO / GPU support for ChatQnA: #403
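
The openvino-values.yaml approach mentioned above might look roughly like the sketch below. The image name, environment variable, and resource key are assumptions for illustration (not taken from a merged chart); `gpu.intel.com/i915` is the resource name exposed by the Intel GPU device plugin for Kubernetes:

```yaml
# Hypothetical openvino-values.yaml for the vllm helm chart.
image:
  repository: opea/vllm-openvino   # assumed OpenVINO-enabled vLLM image
  tag: latest

# vLLM's OpenVINO backend is configured via environment variables.
extraEnv:
  - name: VLLM_OPENVINO_KVCACHE_SPACE   # GiB reserved for the KV cache
    value: "8"

resources:
  limits:
    gpu.intel.com/i915: 1   # request one Intel GPU from the device plugin
```

As noted, the hard part is less the values file itself than validating the parameters without a GPU test environment.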

@joshuayao joshuayao changed the title [Feature] Helm Charts for remaining GenAIExamples [Feature] Helm Charts for Txt2Img and SearchQnA. Jan 21, 2025
@github-project-automation github-project-automation bot moved this from In progress to Done in OPEA Jan 22, 2025
@mkbhanda (Collaborator)

Please note that AMD will help update multiple GenAIExamples to use vLLM in v1.3; to avoid duplicate work, let us create issues for each and ask them to self-assign half.
