chatbot-rag-app: adds Kubernetes manifest and instructions #396

Merged 7 commits on Mar 2, 2025

codefromthecrypt
Collaborator

Decided to action this so that we have a coherent experience between docker compose and k8s. This is as close as I could get it. If folks have feedback or a different direction, do tell!

Fixes #366

@codefromthecrypt
Collaborator Author

Note: everything we do runs back into this. It would be great to have a way to quickly initialize ELSER: not just installing it, but making first-time use work without several minutes of timeouts (#307).

@codefromthecrypt
Collaborator Author

I have work almost done to make this run on a "normal" local k8s, but wanted to solve the timeout first, so I'll push a commit after #397 is merged.

@codefromthecrypt
Collaborator Author

will bump this tomorrow or when an approver looks at #397

codefromthecrypt changed the base branch from main to recover-from-timeout on February 21, 2025 03:57
@codefromthecrypt
Collaborator Author

rebased and changed to non-host network k8s. will leave this in draft until #397 is merged as using not-yet-deployed images in k8s is a pain.

Base automatically changed from recover-from-timeout to main February 21, 2025 12:13
@codefromthecrypt
Collaborator Author

Waiting to get the Docker image smaller (#407) before marking this "ready for review", as I noticed my network lagging.

@codefromthecrypt
Collaborator Author

ok things work in general, but I'm not seeing traces in kibana. I have to put this down for a bit as I have other more urgent things to address.

k8s/README.md Outdated

Note: If you haven't checked out this repository, all you need is one file:
```bash
wget https://raw.githubusercontent.com/elastic/elasticsearch-labs/refs/heads/main/docker/docker-compose-elastic.yml
```

Collaborator

Think this is wrong file

Collaborator Author

yep

@codefromthecrypt
Collaborator Author

Due to ElasticON Singapore and Sydney, while I'm excited about this, I am not finishing it this weekend. Maybe Tuesday.

Signed-off-by: Adrian Cole <adrian.cole@elastic.co>
Signed-off-by: Adrian Cole <adrian.cole@elastic.co>
Signed-off-by: Adrian Cole <adrian.cole@elastic.co>
@codefromthecrypt
Collaborator Author

Hmm, getting GCP auth errors; will look into it.

Signed-off-by: Adrian Cole <adrian.cole@elastic.co>
@codefromthecrypt
Collaborator Author

GCP Vertex now works. I will look into why traces aren't showing up in Kibana.

@bshetti I can't hold this PR captive for all issues, as once this is in, it is easy to complete other topics. So let's leave the Elastic Cloud commentary for the next PR, #379. This one solves k8s as-is, and it has taken dozens of hours just for that!

Signed-off-by: Adrian Cole <adrian.cole@elastic.co>
Signed-off-by: Adrian Cole <adrian.cole@elastic.co>
codefromthecrypt marked this pull request as ready for review on March 2, 2025 03:20
@codefromthecrypt
Collaborator Author

Also verified the Kubernetes setup with pydantic-ai instead of chatbot-rag-app, and it works fine.

Screenshot 2025-03-02 at 11 19 51 AM

@codefromthecrypt
Collaborator Author

create-index (doesn't use an LLM, just elastic)

Screenshot 2025-03-02 at 11 24 05 AM

chat (proves vertex works)

Screenshot 2025-03-02 at 11 25 12 AM

@codefromthecrypt
Collaborator Author

In this case I followed the directions in the README on a completely wiped k8s (`colima delete; colima start --cpu 8 --memory 16 --network-address --dns 8.8.8.8 --dns 8.8.4.4 --kubernetes --k3s-arg '--disable=local-storage,traefik,metrics-server@server:*'`), so I'm very confident the GCP stuff works, as nothing was dirty. Thanks for the tips, folks!

Signed-off-by: Adrian Cole <adrian.cole@elastic.co>
@codefromthecrypt
Collaborator Author

OK, what I did was run the normal instructions, but with Azure OpenAI (so no secret). It worked fine.
Screenshot 2025-03-02 at 2 03 44 PM

Then I deleted the ConfigMap, edited in the Vertex settings and recreated it, added the secret as the README says, and applied. That worked fine too.

Screenshot 2025-03-02 at 2 05 54 PM
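
For readers following along, a minimal sketch of what the recreated ConfigMap might contain after the switch to Vertex. Only `LLM_TYPE=vertex` is taken from this PR; the ConfigMap name is hypothetical, and any further Vertex settings the app needs are left out rather than guessed.

```yaml
# Sketch only, not the manifest from this PR.
apiVersion: v1
kind: ConfigMap
metadata:
  name: chatbot-rag-app-env   # hypothetical name
data:
  LLM_TYPE: vertex            # the value the manifest comment refers to
  # other Vertex-specific settings would go here; none are assumed
```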

Thanks for the eagle eyes, @anuraaga. I think this one is finally ready to merge!

- name: gcloud-credentials
  secret:
    secretName: gcloud-credentials
    optional: true # only read when `LLM_TYPE=vertex`
Collaborator Author

This part allows the Vertex config to work without making the other configurations block on it. The `optional` flag applies indirectly to any mount that uses this volume, so no worries.
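
To illustrate the point about the mount, here is a minimal sketch of a Pod template fragment that pairs the optional secret volume with a mount that uses it. Only the volume entry comes from this PR. The container name, image, mount path, and file name are hypothetical, and pointing `GOOGLE_APPLICATION_CREDENTIALS` at the mounted file is an assumption about how the app locates its credentials.

```yaml
# Sketch only: fragment of a Pod template spec, not the manifest from this PR.
spec:
  containers:
    - name: chatbot-rag-app                           # hypothetical container name
      image: example.invalid/chatbot-rag-app:latest   # placeholder image
      env:
        - name: GOOGLE_APPLICATION_CREDENTIALS        # assumption: app uses the standard GCP variable
          value: /etc/gcloud/credentials.json         # hypothetical path and key name
      volumeMounts:
        - name: gcloud-credentials                    # refers to the volume below
          mountPath: /etc/gcloud                      # hypothetical mount path
          readOnly: true
  volumes:
    - name: gcloud-credentials
      secret:
        secretName: gcloud-credentials
        optional: true   # Pod still starts when the Secret does not exist
```

With `optional: true`, Kubernetes starts the Pod even when the `gcloud-credentials` Secret is missing; the mounted directory is then empty, which is harmless for the non-Vertex configurations.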

codefromthecrypt merged commit 72835b0 into main on Mar 2, 2025
4 checks passed
codefromthecrypt deleted the k8s-chatbot-rag-app branch on March 2, 2025 06:17
Development

Successfully merging this pull request may close these issues.

chatbot-rag-app: add instructions for k8s deployment
4 participants