fix: sending traces to Jaeger with minikube #11

Closed
wants to merge 1 commit into from

helio-frota
Contributor

now getting the following:

received message with invalid compression flag: 60 (valid flags are 0 and 1)

2024-12-18T20:33:22.232029Z  INFO actix_web::middleware::logger: ::ffff:10.244.0.1 "GET /health/ready HTTP/1.1" 200 34 "-" "kube-probe/1.31" 0.000054
OpenTelemetry trace error occurred. Exporter otlp encountered the following error(s): the grpc server returns error (Internal error): , detailed error message: protocol error: received message with invalid compression flag: 60 (valid flags are 0 and 1) while receiving response with status: 200 OK
2024-12-18T20:33:32.232518Z  INFO actix_web::middleware::logger: ::ffff:10.244.0.1 "GET /health/live HTTP/1.1" 200 2 "-" "kube-probe/1.31" 0.000101
2024-12-18T20:33:32.232969Z  INFO actix_web::middleware::logger: ::ffff:10.244.0.1 "GET /health/ready HTTP/1.1" 200 34 "-" "kube-probe/1.31" 0.000075
2024-12-18T20:33:42.232622Z  INFO actix_web::middleware::logger: ::ffff:10.244.0.1 "GET /health/ready HTTP/1.1" 200 34 "-" "kube-probe/1.31" 0.000101
2024-12-18T20:33:42.233321Z  INFO actix_web::middleware::logger: ::ffff:10.244.0.1 "GET /health/live HTTP/1.1" 200 2 "-" "kube-probe/1.31" 0.000130
OpenTelemetry trace error occurred. Exporter otlp encountered the following error(s): the grpc server returns error (Internal error): , detailed error message: protocol error: received message with invalid compression flag: 60 (valid flags are 0 and 1) while receiving response with status: 200 OK

@helio-frota
Contributor Author

converted to draft...

2024-12-18_21-27

➜  ~ kubectl port-forward svc/infrastructure-jaeger-collector 4317:4317 -n trustify
Forwarding from 127.0.0.1:4317 -> 4317
Forwarding from [::1]:4317 -> 4317
Handling connection for 4317

Good part -> Jaeger all-in-one inside minikube is working

  • sending traces from trustify-locally to -> jaeger-minikube - OK
  • sending traces from trustify-minikube to -> jaeger-minikube - Error

Bad part -> now I'm not 100% sure if we should merge this pr..

@@ -44,6 +44,8 @@ Arguments (dict):
  value: parentbased_traceidratio
- name: OTEL_TRACES_SAMPLER_ARG
  value: "0.1"
- name: OTEL_EXPORTER_OTLP_ENDPOINT
  value: http://jaeger{{ .root.Values.appDomain }}

Contributor

Ah great!

Can we introduce a new define for this, maybe in the endpoints, and allow overriding this default from the tracing field/object in the values file?

Contributor Author

@helio-frota helio-frota Dec 19, 2024

🤷

I have no idea what you are talking about 😁

With this enabled, trustify-minikube can apparently send to jaeger-minikube. As I understand it, 'localhost' doesn't point at the collector inside that environment [??]. But we end up with "received message with invalid compression flag: 60 (valid flags are 0 and 1)", which apparently is caused/handled by tonic https://github.com/helio-frota/otel-actix-example/blob/main/src/otel.rs#L27 , and then:

Contributor

@ctron ctron Dec 20, 2024

Ok, my proposal is, instead of hardcoding http://jaeger{{ .root.Values.appDomain }}, we'd need some way to provide a link to the jaeger Service (cluster internal) address, not necessarily the Ingress (cluster external) address. Technically ingress works too, but that might mean all traffic goes through the external load balancer. Which is unnecessary.

If the jaeger instance is in the same namespace, something like http://jaeger (where jaeger is the name of the Service object) would work. If it is in a different k8s namespace, then we need something like http://<jaeger>.<namespace>.svc.cluster.local (where jaeger is the Service name, and namespace the namespace).
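
To make the two naming forms above concrete, here is a minimal Python sketch that builds the cluster-internal URL for a Service. The helper name `service_url` and its parameters are hypothetical and not part of the chart; `cluster.local` is assumed to be the cluster domain (the Kubernetes default), and 4317 is the OTLP gRPC port that appears in the Service listings in this thread.

```python
from typing import Optional


def service_url(
    service: str,
    namespace: Optional[str] = None,
    port: Optional[int] = None,
    cluster_domain: str = "cluster.local",  # assumption: default cluster domain
) -> str:
    """Build an http:// URL for a Kubernetes Service (hypothetical helper).

    Without a namespace, the short Service name is used, which only resolves
    from pods in the same namespace. With a namespace, the fully qualified
    <service>.<namespace>.svc.<cluster_domain> form is used.
    """
    if namespace is None:
        host = service
    else:
        host = f"{service}.{namespace}.svc.{cluster_domain}"
    if port is not None:
        host = f"{host}:{port}"
    return f"http://{host}"


if __name__ == "__main__":
    # Same namespace: short name is enough.
    print(service_url("infrastructure-jaeger-collector", port=4317))
    # Different namespace: fully qualified name.
    print(service_url("infrastructure-jaeger-collector", "trustify", 4317))
```

The short form is what a pod in the same namespace would use; the long form should work from any namespace in the cluster, provided the cluster domain assumption holds.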

However, if the instance is something completely different, we should support that as well. Meaning, allow the user to provide the full endpoint and just accept it as is. If we can provide a static default in the values files, helm already provides us with the override. So we should:

  • Define a reasonable default in the values file, based on the setup of the infrastructure helm chart. e.g. in .tracing.jaegerEndpoint.
  • Allow the user to override that value through helm
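
The default-plus-override behaviour in the two points above can be sketched as a recursive merge. This is a simplified Python illustration of layering user-supplied values over chart defaults, not Helm's actual implementation. The `tracing.jaegerEndpoint` key comes from the proposal above; the default URL, the `enabled` key, and the override URL are assumptions for illustration only.

```python
def merge_values(defaults: dict, overrides: dict) -> dict:
    """Return a new dict where overrides win over defaults, merging nested dicts."""
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = merge_values(merged[key], value)
        else:
            # Scalars and mismatched types: the override replaces the default.
            merged[key] = value
    return merged


# Hypothetical chart defaults, matching the infrastructure chart's Service name.
CHART_DEFAULTS = {
    "tracing": {
        "enabled": False,
        "jaegerEndpoint": "http://infrastructure-jaeger-collector:4317",
    }
}

if __name__ == "__main__":
    # No override: the default endpoint is used.
    print(merge_values(CHART_DEFAULTS, {})["tracing"]["jaegerEndpoint"])
    # User override with a completely different (made-up) endpoint.
    user = {"tracing": {"jaegerEndpoint": "http://collector.example.com:4317"}}
    print(merge_values(CHART_DEFAULTS, user)["tracing"])
```

The point of the design is that templates only ever read the one merged value, so the endpoint is not hardcoded anywhere else.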

In order to not hardcode this in every place, we should:

Contributor Author

You're right, the pods are in the same namespace (trustify), and we can't hardcode this http://jaeger
Thanks for the clarification 👍

@helio-frota
Contributor Author

helio-frota commented Dec 19, 2024

When using this name infrastructure-jaeger-collector:

kubectl get svc -n trustify
NAME                               TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                                                    AGE
infrastructure-jaeger-agent        ClusterIP   None            <none>        5775/UDP,5778/TCP,6831/UDP,6832/UDP                        46m
infrastructure-jaeger-collector    ClusterIP   None            <none>        9411/TCP,14250/TCP,14267/TCP,14268/TCP,4317/TCP,4318/TCP   46m
infrastructure-jaeger-query        ClusterIP   None            <none>        16686/TCP,16685/TCP                                        46m
infrastructure-keycloak            ClusterIP   10.105.252.69   <none>        80/TCP                                                     46m
infrastructure-keycloak-headless   ClusterIP   None            <none>        8080/TCP                                                   46m
infrastructure-postgresql          ClusterIP   10.106.105.32   <none>        5432/TCP                                                   46m
infrastructure-postgresql-hl       ClusterIP   None            <none>        5432/TCP                                                   46m
server                             ClusterIP   10.98.51.199    <none>        80/TCP                                                     34m

- name: OTEL_EXPORTER_OTLP_TRACES_ENDPOINT
  value: http://infrastructure-jaeger-collector{{ .root.Values.appDomain }}

Got a different error message:

OpenTelemetry trace error occurred. 
Exporter otlp encountered the following error(s): the grpc server returns error (Internal error): ,
detailed error message: 
protocol error: received message with invalid compression flag: 60 (valid flags are 0 and 1) 
while receiving response with status: 404 Not Found

The previous error message:

OpenTelemetry trace error occurred. 
Exporter otlp encountered the following error(s): the grpc server returns error (Internal error): , 
detailed error message: 
protocol error: received message with invalid compression flag: 60 (valid flags are 0 and 1) 
while receiving response with status: 200 OK  < -----------------------------

And this is the original issue message:

OpenTelemetry trace error occurred. 
Exporter otlp encountered the following error(s): the grpc server returns error 
(The service is currently unavailable): ,
detailed error message: 
tcp connect error: Connection refused (os error 111)

The default is localhost:4317, which is unavailable inside minikube with the current configs. Once we configure an endpoint, we get 200 OK or 404 Not Found instead. If the error messages are not trolling us, then the next steps are probably:

a) merge this PR
b) start investigating the compression message
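
A quick way to separate the first failure mode (connection refused) from the other two is to check whether anything accepts TCP connections at the configured host and port. Below is a minimal Python sketch, assuming it is run from a pod in the same namespace as the application. `can_connect` is a hypothetical helper, and a successful connect only shows that a listener exists, not that it speaks OTLP over gRPC.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and DNS resolution failures.
        return False


if __name__ == "__main__":
    # localhost:4317 is the exporter default mentioned above; the Service
    # name is the one from the `kubectl get svc` listing in this thread.
    for host in ("localhost", "infrastructure-jaeger-collector"):
        print(host, can_connect(host, 4317))
```

If the Service name connects but the exporter still fails, the problem is more likely at the protocol level than at the network level.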

@helio-frota
Contributor Author

helio-frota commented Dec 19, 2024

Apparently we can rule out the ubi9 native dependencies as the cause, since this example works:
https://github.com/helio-frota/otel-actix-example?tab=readme-ov-file#otel-actix-example

@helio-frota
Contributor Author

helio-frota commented Dec 23, 2024

progress, able to reproduce the error https://gist.github.com/helio-frota/15caa779a349df7777d1c9dfbb934742

update:

Actually, none of these worked https://github.com/helio-frota/otel-actix-example/blob/main/src/otel.rs#L26-L28 when I deployed on minikube as part of the trustify namespace...

And I got no flag: 60 error, only the errors listed here https://gist.github.com/helio-frota/15caa779a349df7777d1c9dfbb934742

@helio-frota
Contributor Author

This comment starts to make sense to me: hyperium/tonic#1690

  • a) Flag 60 error following these steps
  • b) When we port-forward the jaeger-collector directly, the error is gone
    • port-forward svc/infra-jaeger-collector 4317:4317 -n trustify

Seems that a) is handled by nginx without TLS.
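
One possible reading of flag 60, offered as an assumption since the thread does not confirm it: every gRPC message starts with a 5-byte prefix (a 1-byte compression flag, then a 4-byte big-endian length), and 60 is the ASCII code of `<`. A client that receives an HTML or XML body where it expects a gRPC frame would therefore report 60 as the flag, which would fit a response coming from the nginx ingress rather than from the collector. A minimal Python sketch of that decoding follows; `parse_grpc_prefix` is a hypothetical helper and the HTML body is made up for illustration.

```python
import struct
from typing import Tuple


def parse_grpc_prefix(data: bytes) -> Tuple[int, int]:
    """Decode the 5-byte gRPC message prefix into (compression_flag, length)."""
    if len(data) < 5:
        raise ValueError("need at least 5 bytes for a gRPC message prefix")
    # ">BI": big-endian, one unsigned byte followed by one unsigned 32-bit int.
    flag, length = struct.unpack(">BI", data[:5])
    return flag, length


def starts_like_markup(data: bytes) -> bool:
    """True if the 'compression flag' byte is actually the character '<'."""
    flag, _ = parse_grpc_prefix(data)
    return flag == ord("<")


if __name__ == "__main__":
    # A well-formed, uncompressed gRPC frame carrying a 3-byte payload.
    good = b"\x00" + struct.pack(">I", 3) + b"abc"
    print(parse_grpc_prefix(good))  # (0, 3)

    # A made-up HTML error page, as an HTTP proxy might return.
    html = b"<html><body>404 Not Found</body></html>"
    print(parse_grpc_prefix(html)[0])  # 60
    print(starts_like_markup(html))  # True
```

Under this reading, the 200 OK and 404 Not Found statuses in the earlier error messages would be ordinary HTTP responses with a markup body, and the port-forward in b) avoids them by bypassing the ingress.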

@helio-frota
Contributor Author

helio-frota commented Jan 2, 2025

> Ok, my proposal is, instead of hardcoding http://jaeger{{ .root.Values.appDomain }}, we'd need some way to provide a link to the jaeger Service (cluster internal) address, not necessarily the Ingress (cluster external) address. Technically ingress works too, but that might mean all traffic goes through the external load balancer. Which is unnecessary.

this worked, no flag 60 error 👍

I installed my example with

  • helm install otel-actix ./charts/app --set otlpEndpoint="http://infra-jaeger-collector:4317"
    instead of
  • helm install otel-actix ./charts/app --set otlpEndpoint="http://jaeger.192.168.39.78.nip.io"

> we'd need some way to provide a link to the jaeger Service (cluster internal) address, not necessarily the Ingress (cluster external) address.

Based on the list below:

  • jaeger Service (cluster internal) address -> infra-jaeger-collector
  • not necessarily the Ingress (cluster external) address -> infra-jaeger-query

otel-actix-example git:(main) ✗ kubectl get ingress
NAME                 CLASS   HOSTS                          ADDRESS          PORTS   AGE
infra-jaeger-query   nginx   jaeger.192.168.39.237.nip.io   192.168.39.237   80      64s

otel-actix-example git:(main) ✗ kubectl get svc
NAME                     TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                                                    AGE
infra-jaeger-agent       ClusterIP   None         <none>        5775/UDP,5778/TCP,6831/UDP,6832/UDP                        110s
infra-jaeger-collector   ClusterIP   None         <none>        9411/TCP,14250/TCP,14267/TCP,14268/TCP,4317/TCP,4318/TCP   110s
infra-jaeger-query       ClusterIP   None         <none>        16686/TCP,16685/TCP                                        110s

@helio-frota
Contributor Author

helio-frota commented Jan 2, 2025

PR updated and it works with the extra argument:

--set-string otelCollector="http://infrastructure-jaeger-collector:4317"

2025-01-02_15-45

@helio-frota marked this pull request as ready for review January 3, 2025 10:05
@helio-frota changed the title from "fix: (partial) now it connects at least" to "fix: sending traces to collector with minikube" Jan 3, 2025
@helio-frota changed the title from "fix: sending traces to collector with minikube" to "fix: sending traces to Jaeger with minikube" Jan 7, 2025
@helio-frota marked this pull request as draft January 8, 2025 12:03
@helio-frota
Contributor Author

Converted to draft ... need to check this trustification/trustify#1181 first 👍

@helio-frota
Contributor Author

I'll close this ✌️

@helio-frota deleted the otel branch January 13, 2025 17:49