Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for listener level warnings #10458

Merged
merged 19 commits into from
Dec 13, 2024
Merged

Conversation

davidjumani
Copy link

@davidjumani davidjumani commented Dec 9, 2024

Description

Adds support for listener-level warnings. This way when a listener or its plugin returns an error, it can be checked if it is a configuration error that can be treated as a warning and processed accordingly.

API changes

Added the warnings field to the HttpListenerReport && TcpListenerReport

Context

This is introduced to resolve Upstream not found when configuring opentelemetry collector should be a warning, not an error
TLDR;
When the upstream is not found in a tracing collector, it throws an error instead of a warning (Invalid Destination)

Testing steps

Run the following steps :

kubectl apply -f- << EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: otel-agent-conf
  namespace: gloo-system
  labels:
    app: opentelemetry
    component: otel-agent-conf
data:
  otel-agent-config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    exporters:
      otlp:
        endpoint: "otel-collector.default:4317"
        tls:
          insecure: true
        sending_queue:
          num_consumers: 4
          queue_size: 100
        retry_on_failure:
          enabled: true
    processors:
      batch:
      memory_limiter:
        # 80% of maximum memory up to 2G
        limit_mib: 400
        # 25% of limit up to 2G
        spike_limit_mib: 100
        check_interval: 5s
    extensions:
      zpages: {}
      memory_ballast:
        # Memory Ballast size should be max 1/3 to 1/2 of memory.
        size_mib: 165
    service:
      extensions: [zpages, memory_ballast]
      pipelines:
        traces:
          receivers: [otlp]
          processors: [memory_limiter, batch]
          exporters: [otlp]
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: otel-agent
  namespace: gloo-system
  labels:
    app: opentelemetry
    component: otel-agent
spec:
  selector:
    matchLabels:
      app: opentelemetry
      component: otel-agent
  template:
    metadata:
      labels:
        app: opentelemetry
        component: otel-agent
    spec:
      containers:
      - command:
          - "/otelcol"
          - "--config=/conf/otel-agent-config.yaml"
        image: otel/opentelemetry-collector:0.68.0
        name: otel-agent
        resources:
          limits:
            cpu: 500m
            memory: 500Mi
          requests:
            cpu: 100m
            memory: 100Mi
        ports:
        - containerPort: 55679 # ZPages endpoint.
        - containerPort: 4317 # Default OpenTelemetry receiver port.
        - containerPort: 8888  # Metrics.
        volumeMounts:
        - name: otel-agent-config-vol
          mountPath: /conf
      volumes:
        - configMap:
            name: otel-agent-conf
            items:
              - key: otel-agent-config
                path: otel-agent-config.yaml
          name: otel-agent-config-vol
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: otel-collector-conf
  namespace: gloo-system
  labels:
    app: opentelemetry
    component: otel-collector-conf
data:
  otel-collector-config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    processors:
      batch:
      memory_limiter:
        # 80% of maximum memory up to 2G
        limit_mib: 1500
        # 25% of limit up to 2G
        spike_limit_mib: 512
        check_interval: 5s
    extensions:
      zpages: {}
      memory_ballast:
        # Memory Ballast size should be max 1/3 to 1/2 of memory.
        size_mib: 683
    exporters:
      logging:
        loglevel: debug
      zipkin:
        endpoint: "http://zipkin:9411/api/v2/spans"
        tls:
          insecure: true
    service:
      extensions: [zpages, memory_ballast]
      pipelines:
        metrics:
          receivers: [otlp]
          processors: [memory_limiter, batch]
          exporters: [logging]
        traces:
          receivers: [otlp]
          processors: [memory_limiter, batch]
          exporters: [logging, zipkin]
---
apiVersion: v1
kind: Service
metadata:
  name: otel-collector
  namespace: gloo-system
  labels:
    app: opentelemetry
    component: otel-collector
spec:
  ports:
  - name: otlp-grpc # Default endpoint for OpenTelemetry gRPC receiver.
    port: 4317
    protocol: TCP
    targetPort: 4317
  - name: otlp-http # Default endpoint for OpenTelemetry HTTP receiver.
    port: 4318
    protocol: TCP
    targetPort: 4318
  - name: metrics # Default endpoint for querying metrics.
    port: 8888
  selector:
    component: otel-collector
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: otel-collector
  namespace: gloo-system
  labels:
    app: opentelemetry
    component: otel-collector
spec:
  selector:
    matchLabels:
      app: opentelemetry
      component: otel-collector
  minReadySeconds: 5
  progressDeadlineSeconds: 120
  replicas: 1 #TODO - adjust this to your own requirements
  template:
    metadata:
      labels:
        app: opentelemetry
        component: otel-collector
    spec:
      containers:
      - command:
          - "/otelcol"
          - "--config=/conf/otel-collector-config.yaml"
        image: otel/opentelemetry-collector:0.68.0
        name: otel-collector
        resources:
          limits:
            cpu: 1
            memory: 2Gi
          requests:
            cpu: 200m
            memory: 400Mi
        ports: # Comment out ports for platforms as needed.
        - containerPort: 55679 # Default endpoint for ZPages.
        - containerPort: 4317 # Default endpoint for OpenTelemetry receiver.
        - containerPort: 14250 # Default endpoint for Jaeger gRPC receiver.
        - containerPort: 14268 # Default endpoint for Jaeger HTTP receiver.
        - containerPort: 9411 # Default endpoint for Zipkin receiver.
        - containerPort: 8888  # Default endpoint for querying metrics.
        volumeMounts:
        - name: otel-collector-config-vol
          mountPath: /conf
      volumes:
        - configMap:
            name: otel-collector-conf
            items:
              - key: otel-collector-config
                path: otel-collector-config.yaml
          name: otel-collector-config-vol
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: echo-server
  name: echo-server
  namespace: gloo-system
spec:
  selector:
    matchLabels:
      app: echo-server
  replicas: 1
  template:
    metadata:
      labels:
        app: echo-server
    spec:
      containers:
      # remember to change port to 80 and disable http2 if using httpbin
      # - image: kennethreitz/httpbin
      - image: jmalloc/echo-server
        name: echo-server
        env:
        - name: LOG_HTTP_HEADERS
          value: "true"
        - name: LOG_HTTP_BODY
          value: "true"
        ports:
        - containerPort: 8080
          name: grpc
---
apiVersion: v1
kind: Service
metadata:
  name: echo-server
  namespace: gloo-system
  labels:
    service: echo-server
spec:
  ports:
  - port: 8080
    # targetPort: 80
    protocol: TCP
  selector:
    app: echo-server
---
# gloo resources
apiVersion: gloo.solo.io/v1
kind: Upstream
metadata:
  name: echo-server
  namespace: gloo-system
spec:
  useHttp2: true
  static:
    hosts:
    - addr: echo-server
      port: 8080
---
apiVersion: gloo.solo.io/v1
kind: Upstream
metadata:
  name: "opentelemetry-collector"
  namespace: gloo-system
spec:
  # REQUIRED FOR OPENTELEMETRY COLLECTION
  useHttp2: true
  kube:
    # selector:
    serviceName: otel-collector
    serviceNamespace: gloo-system
    servicePort: 4317
---
apiVersion: gateway.solo.io/v1
kind: VirtualService
metadata:
  name: default
  namespace: gloo-system
spec:
  virtualHost:
    domains:
    - '*'
    options:
      stagedTransformations:
        regular:
          requestTransforms:
            - requestTransformation:
                transformationTemplate:
                  headers:
                    test_header:
                      text: '{{header("ABCD")}}'
                  spanTransformer:
                    name:
                      text: '{{header("Host")}}'
    routes:
    - matchers:
       - prefix: /route1
      routeAction:
        single:
          upstream:
            name: echo-server
            namespace: gloo-system
      options:
        autoHostRewrite: true
    - matchers:
       - prefix: /route2
      routeAction:
        single:
          upstream:
            name: echo-server
            namespace: gloo-system
      options:
        autoHostRewrite: true
    - matchers:
       - prefix: /
      routeAction:
        single:
          upstream:
            name: echo-server
            namespace: gloo-system
      options:
        autoHostRewrite: true
EOF

Now create a gateway with an invalid otel collector upstream :

kubectl apply -f- << EOF
apiVersion: gateway.solo.io/v1
kind: Gateway
metadata:
  labels:
    app: gloo
  name: gateway-proxy
  namespace: gloo-system
spec:
  bindAddress: '::'
  bindPort: 8080
  httpGateway:
    options:
      httpConnectionManagerSettings:
        tracing:
          openTelemetryConfig:
            collectorUpstreamRef:
              namespace: gloo-system
              name: opentelemetry-collectorv2
EOF

After this fix, the gateway should be accepted with a warning

kubectl -n gloo-system get gateway.gateway.solo.io/gateway-proxy -o yaml
apiVersion: gateway.solo.io/v1
kind: Gateway
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"gateway.solo.io/v1","kind":"Gateway","metadata":{"annotations":{},"labels":{"app":"gloo"},"name":"gateway-proxy","namespace":"gloo-system"},"spec":{"bindAddress":"::","bindPort":8080,"httpGateway":{"options":{"httpConnectionManagerSettings":{"tracing":{"openTelemetryConfig":{"collectorUpstreamRef":{"name":"opentelemetry-collectorv2","namespace":"gloo-system"}}}}}}}}
  creationTimestamp: "2024-12-09T13:40:55Z"
  generation: 7
  labels:
    app: gloo
  name: gateway-proxy
  namespace: gloo-system
  resourceVersion: "53410"
  uid: da348bab-0334-4065-9ec5-4c6f5ac5d7d1
spec:
  bindAddress: '::'
  bindPort: 8080
  httpGateway:
    options:
      httpConnectionManagerSettings:
        tracing:
          openTelemetryConfig:
            collectorUpstreamRef:
              name: opentelemetry-collectorv2
              namespace: gloo-system
status:
  statuses:   << See the warning here
    gloo-system:
      reason: "warning: \n  HttpListener Warning: InvalidDestinationWarning. Reason:
        *v1.Upstream { gloo-system.opentelemetry-collectorv2 } not found"
      reportedBy: gloo
      state: Warning

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works

@davidjumani davidjumani requested a review from a team as a code owner December 9, 2024 14:28
@davidjumani davidjumani changed the title Add support for listener level warnings [WIP] Add support for listener level warnings Dec 9, 2024
Copy link

github-actions bot commented Dec 9, 2024

Visit the preview URL for this PR (updated for commit 6baba8c):

https://gloo-edge--pr10458-allow-listener-warni-13gjo7fg.web.app

(expires Fri, 20 Dec 2024 12:10:43 GMT)

🔥 via Firebase Hosting GitHub Action 🌎

Sign: 77c2b86e287749579b7ff9cadb81e099042ef677

@davidjumani davidjumani changed the title [WIP] Add support for listener level warnings Add support for listener level warnings Dec 10, 2024
@solo-changelog-bot
Copy link

Issues linked to changelog:
kgateway-dev#10293

Copy link

@ashishb-solo ashishb-solo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me - agree with Nathan's comments regarding the function names, but happy to approve once that's addressed

@davidjumani
Copy link
Author

/kick

Step #1 - "run-tests": Step #7 - "run-tests": �[38;5;243m------------------------------�[0m
Step #1 - "run-tests": Step #7 - "run-tests": �[38;5;9m• [FAILED] [60.011 seconds]�[0m
Step #1 - "run-tests": Step #7 - "run-tests": �[0mDebug �[38;5;9m�[1m[It] should support the top level debug command and should populate the kube-state.log file�[0m
Step #1 - "run-tests": Step #7 - "run-tests": �[38;5;243m/workspace/gloo/projects/gloo/cli/pkg/cmd/debug/root_test.go:36�[0m
Step #1 - "run-tests": Step #7 - "run-tests": 
Step #1 - "run-tests": Step #7 - "run-tests":   �[38;5;243mTimeline >>�[0m
Step #1 - "run-tests": Step #7 - "run-tests":   ? This command will overwrite the "debug" directory, if present. Are you sure yo
Step #1 - "run-tests": Step #7 - "run-tests":   ? This command will overwrite the "debug" directory, if present. Are you sure yo
Step #1 - "run-tests": Step #7 - "run-tests":   u want to proceed? [y/N]:  y                                                    
Step #1 - "run-tests": Step #7 - "run-tests":   �[38;5;9m[FAILED]�[0m in [It] - /workspace/gloo/pkg/cliutil/testutil/testutil.go:55 �[38;5;243m@ 12/11/24 10:59:28.996�[0m
Step #1 - "run-tests": Step #7 - "run-tests": 
Step #1 - "run-tests": Step #7 - "run-tests":   �[1mAttempt #1 �[38;5;9mFailed�[0m�[1m.  Retrying ↺�[0m �[38;5;243m@ 12/11/24 10:59:28.996�[0m
Step #1 - "run-tests": Step #7 - "run-tests": 
Step #1 - "run-tests": Step #7 - "run-tests":   ? This command will overwrite the "debug" directory, if present. Are you sure yo
Step #1 - "run-tests": Step #7 - "run-tests":   ? This command will overwrite the "debug" directory, if present. Are you sure yo
Step #1 - "run-tests": Step #7 - "run-tests":   u want to proceed? [y/N]:  y                                                    
Step #1 - "run-tests": Step #7 - "run-tests":   �[38;5;9m[FAILED]�[0m in [It] - /workspace/gloo/pkg/cliutil/testutil/testutil.go:55 �[38;5;243m@ 12/11/24 10:59:48.999�[0m
Step #1 - "run-tests": Step #7 - "run-tests": 
Step #1 - "run-tests": Step #7 - "run-tests":   �[1mAttempt #2 �[38;5;9mFailed�[0m�[1m.  Retrying ↺�[0m �[38;5;243m@ 12/11/24 10:59:48.999�[0m
Step #1 - "run-tests": Step #7 - "run-tests": 
Step #1 - "run-tests": Step #7 - "run-tests":   ? This command will overwrite the "debug" directory, if present. Are you sure yo
Step #1 - "run-tests": Step #7 - "run-tests":   ? This command will overwrite the "debug" directory, if present. Are you sure yo
Step #1 - "run-tests": Step #7 - "run-tests":   u want to proceed? [y/N]:  y                                                    
Step #1 - "run-tests": Step #7 - "run-tests":   �[38;5;9m[FAILED]�[0m in [It] - /workspace/gloo/pkg/cliutil/testutil/testutil.go:55 �[38;5;243m@ 12/11/24 11:00:09.003�[0m
Step #1 - "run-tests": Step #7 - "run-tests":   �[38;5;243m<< Timeline�[0m
Step #1 - "run-tests": Step #7 - "run-tests": 
Step #1 - "run-tests": Step #7 - "run-tests":   �[38;5;9m[FAILED] test timed out�[0m
Step #1 - "run-tests": Step #7 - "run-tests":   �[38;5;9mIn �[1m[It]�[0m�[38;5;9m at: �[1m/workspace/gloo/pkg/cliutil/testutil/testutil.go:55�[0m �[38;5;243m@ 12/11/24 11:00:09.003�[0m
Step #1 - "run-tests": Step #7 - "run-tests": 
Step #1 - "run-tests": Step #7 - "run-tests":   �[38;5;9mFull Stack Trace�[0m
Step #1 - "run-tests": Step #7 - "run-tests":     github.com/solo-io/gloo/pkg/cliutil/testutil.ExpectInteractive(0x8b1f668, 0x8b1f670, 0xbed0560)
Step #1 - "run-tests": Step #7 - "run-tests":     	/workspace/gloo/pkg/cliutil/testutil/testutil.go:55 +0x67d
Step #1 - "run-tests": Step #7 - "run-tests":     github.com/solo-io/gloo/projects/gloo/cli/pkg/cmd/debug_test.init.func2.3()
Step #1 - "run-tests": Step #7 - "run-tests":     	/workspace/gloo/projects/gloo/cli/pkg/cmd/debug/root_test.go:37 +0x32
Step #1 - "run-tests": Step #7 - "run-tests": 
Step #1 - "run-tests": Step #7 - "run-tests":   There were �[1m�[38;5;9madditional failures�[0m detected.  To view them in detail run �[1mginkgo -vv�[0m
Step #1 - "run-tests": Step #7 - "run-tests": �[38;5;243m------------------------------�[0m
Step #1 - "run-tests": Step #7 - "run-tests": �[38;5;14mS�[0m�[38;5;14mS�[0m

@davidjumani
Copy link
Author

/kick same failure

Copy link

@ashishb-solo ashishb-solo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comments you added are extremely helpful, thanks

@davidjumani
Copy link
Author

/kick

Step #1 - "run-tests": Step #7 - "run-tests": �[38;5;9m• [FAILED] [0.004 seconds]�[0m
Step #1 - "run-tests": Step #7 - "run-tests": �[0mvalidation utils �[38;5;243mGetProxyWarning �[38;5;9m�[1m[It] aggregates the warnings at every level for aggregate listener�[0m
Step #1 - "run-tests": Step #7 - "run-tests": �[38;5;243m/workspace/gloo/projects/gloo/pkg/utils/validation/proxy_validation_test.go:445�[0m
Step #1 - "run-tests": Step #7 - "run-tests": 
Step #1 - "run-tests": Step #7 - "run-tests":   �[38;5;243mTimeline >>�[0m
Step #1 - "run-tests": Step #7 - "run-tests":   �[38;5;9m[FAILED]�[0m in [It] - /workspace/gloo/projects/gloo/pkg/utils/validation/proxy_validation_test.go:501 �[38;5;243m@ 12/11/24 13:55:46.356�[0m
Step #1 - "run-tests": Step #7 - "run-tests": 
Step #1 - "run-tests": Step #7 - "run-tests":   �[1mAttempt #1 �[38;5;9mFailed�[0m�[1m.  Retrying ↺�[0m �[38;5;243m@ 12/11/24 13:55:46.356�[0m
Step #1 - "run-tests": Step #7 - "run-tests": 
Step #1 - "run-tests": Step #7 - "run-tests":   �[38;5;9m[FAILED]�[0m in [It] - /workspace/gloo/projects/gloo/pkg/utils/validation/proxy_validation_test.go:501 �[38;5;243m@ 12/11/24 13:55:46.357�[0m
Step #1 - "run-tests": Step #7 - "run-tests": 
Step #1 - "run-tests": Step #7 - "run-tests":   �[1mAttempt #2 �[38;5;9mFailed�[0m�[1m.  Retrying ↺�[0m �[38;5;243m@ 12/11/24 13:55:46.358�[0m
Step #1 - "run-tests": Step #7 - "run-tests": 
Step #1 - "run-tests": Step #7 - "run-tests":   �[38;5;9m[FAILED]�[0m in [It] - /workspace/gloo/projects/gloo/pkg/utils/validation/proxy_validation_test.go:501 �[38;5;243m@ 12/11/24 13:55:46.358�[0m
Step #1 - "run-tests": Step #7 - "run-tests":   �[38;5;243m<< Timeline�[0m
Step #1 - "run-tests": Step #7 - "run-tests": 
Step #1 - "run-tests": Step #7 - "run-tests":   �[38;5;9m[FAILED] Expected
Step #1 - "run-tests": Step #7 - "run-tests":       <[]string | len:10, cap:10>: [
Step #1 - "run-tests": Step #7 - "run-tests":           "",
Step #1 - "run-tests": Step #7 - "run-tests":           "Listener Warning: SSLConfigWarning. Reason: invalid ssl config",
Step #1 - "run-tests": Step #7 - "run-tests":           "",
Step #1 - "run-tests": Step #7 - "run-tests":           "HttpListener Warning: InvalidDestinationWarning. Reason: testing invalid destination warning",
Step #1 - "run-tests": Step #7 - "run-tests":           "",
Step #1 - "run-tests": Step #7 - "run-tests":           "TcpListener Warning: UnknownWarning. Reason: unknown warning",
Step #1 - "run-tests": Step #7 - "run-tests":           "",
Step #1 - "run-tests": Step #7 - "run-tests":           "",
Step #1 - "run-tests": Step #7 - "run-tests":           "TcpHost Warning: InvalidDestinationWarning. Reason: testing invalid destination warning",
Step #1 - "run-tests": Step #7 - "run-tests":           "TcpHost Warning: InvalidDestinationWarning. Reason: testing invalid destination warning",
Step #1 - "run-tests": Step #7 - "run-tests":       ]
Step #1 - "run-tests": Step #7 - "run-tests":   to have length 5�[0m
Step #1 - "run-tests": Step #7 - "run-tests":   �[38;5;9mIn �[1m[It]�[0m�[38;5;9m at: �[1m/workspace/gloo/projects/gloo/pkg/utils/validation/proxy_validation_test.go:501�[0m �[38;5;243m@ 12/11/24 13:55:46.358�[0m
Step #1 - "run-tests": Step #7 - "run-tests": 
Step #1 - "run-tests": Step #7 - "run-tests":   �[38;5;9mFull Stack Trace�[0m
Step #1 - "run-tests": Step #7 - "run-tests":     github.com/solo-io/gloo/projects/gloo/pkg/utils/validation_test.init.func1.7.2()
Step #1 - "run-tests": Step #7 - "run-tests":     	/workspace/gloo/projects/gloo/pkg/utils/validation/proxy_validation_test.go:501 +0x10b5
Step #1 - "run-tests": Step #7 - "run-tests": 

@davidjumani
Copy link
Author

/kick

@davidjumani
Copy link
Author

/kick

unexpected status from POST request to https://quay.io/v2/auth: 502 Bad Gateway

Copy link

@sam-heilbron sam-heilbron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to add an explicit e2e test to demonstrate this behavior? We have a good suite of these tests (https://github.com/solo-io/gloo/tree/main/test/kubernetes/e2e/features/validation) so if we added one, I imagined it would go into this space

How does this functionality play with invalid route replacement: https://docs.solo.io/gloo-edge/latest/guides/traffic_management/configuration_validation/invalid_route_replacement/

@sam-heilbron
Copy link

/kick

Step #1 - "run-tests": Step #7 - "run-tests": �[38;5;9m• [FAILED] [0.004 seconds]�[0m
Step #1 - "run-tests": Step #7 - "run-tests": �[0mvalidation utils �[38;5;243mGetProxyWarning �[38;5;9m�[1m[It] aggregates the warnings at every level for aggregate listener�[0m
Step #1 - "run-tests": Step #7 - "run-tests": �[38;5;243m/workspace/gloo/projects/gloo/pkg/utils/validation/proxy_validation_test.go:445�[0m
Step #1 - "run-tests": Step #7 - "run-tests": 
Step #1 - "run-tests": Step #7 - "run-tests":   �[38;5;243mTimeline >>�[0m
Step #1 - "run-tests": Step #7 - "run-tests":   �[38;5;9m[FAILED]�[0m in [It] - /workspace/gloo/projects/gloo/pkg/utils/validation/proxy_validation_test.go:501 �[38;5;243m@ 12/11/24 13:55:46.356�[0m
Step #1 - "run-tests": Step #7 - "run-tests": 
Step #1 - "run-tests": Step #7 - "run-tests":   �[1mAttempt #1 �[38;5;9mFailed�[0m�[1m.  Retrying ↺�[0m �[38;5;243m@ 12/11/24 13:55:46.356�[0m
Step #1 - "run-tests": Step #7 - "run-tests": 
Step #1 - "run-tests": Step #7 - "run-tests":   �[38;5;9m[FAILED]�[0m in [It] - /workspace/gloo/projects/gloo/pkg/utils/validation/proxy_validation_test.go:501 �[38;5;243m@ 12/11/24 13:55:46.357�[0m
Step #1 - "run-tests": Step #7 - "run-tests": 
Step #1 - "run-tests": Step #7 - "run-tests":   �[1mAttempt #2 �[38;5;9mFailed�[0m�[1m.  Retrying ↺�[0m �[38;5;243m@ 12/11/24 13:55:46.358�[0m
Step #1 - "run-tests": Step #7 - "run-tests": 
Step #1 - "run-tests": Step #7 - "run-tests":   �[38;5;9m[FAILED]�[0m in [It] - /workspace/gloo/projects/gloo/pkg/utils/validation/proxy_validation_test.go:501 �[38;5;243m@ 12/11/24 13:55:46.358�[0m
Step #1 - "run-tests": Step #7 - "run-tests":   �[38;5;243m<< Timeline�[0m
Step #1 - "run-tests": Step #7 - "run-tests": 
Step #1 - "run-tests": Step #7 - "run-tests":   �[38;5;9m[FAILED] Expected
Step #1 - "run-tests": Step #7 - "run-tests":       <[]string | len:10, cap:10>: [
Step #1 - "run-tests": Step #7 - "run-tests":           "",
Step #1 - "run-tests": Step #7 - "run-tests":           "Listener Warning: SSLConfigWarning. Reason: invalid ssl config",
Step #1 - "run-tests": Step #7 - "run-tests":           "",
Step #1 - "run-tests": Step #7 - "run-tests":           "HttpListener Warning: InvalidDestinationWarning. Reason: testing invalid destination warning",
Step #1 - "run-tests": Step #7 - "run-tests":           "",
Step #1 - "run-tests": Step #7 - "run-tests":           "TcpListener Warning: UnknownWarning. Reason: unknown warning",
Step #1 - "run-tests": Step #7 - "run-tests":           "",
Step #1 - "run-tests": Step #7 - "run-tests":           "",
Step #1 - "run-tests": Step #7 - "run-tests":           "TcpHost Warning: InvalidDestinationWarning. Reason: testing invalid destination warning",
Step #1 - "run-tests": Step #7 - "run-tests":           "TcpHost Warning: InvalidDestinationWarning. Reason: testing invalid destination warning",
Step #1 - "run-tests": Step #7 - "run-tests":       ]
Step #1 - "run-tests": Step #7 - "run-tests":   to have length 5�[0m
Step #1 - "run-tests": Step #7 - "run-tests":   �[38;5;9mIn �[1m[It]�[0m�[38;5;9m at: �[1m/workspace/gloo/projects/gloo/pkg/utils/validation/proxy_validation_test.go:501�[0m �[38;5;243m@ 12/11/24 13:55:46.358�[0m
Step #1 - "run-tests": Step #7 - "run-tests": 
Step #1 - "run-tests": Step #7 - "run-tests":   �[38;5;9mFull Stack Trace�[0m
Step #1 - "run-tests": Step #7 - "run-tests":     github.com/solo-io/gloo/projects/gloo/pkg/utils/validation_test.init.func1.7.2()
Step #1 - "run-tests": Step #7 - "run-tests":     	/workspace/gloo/projects/gloo/pkg/utils/validation/proxy_validation_test.go:501 +0x10b5
Step #1 - "run-tests": Step #7 - "run-tests": 

It scares me that we are kicking a test that theoretically is validating the behavior we are adding. I may be missing some context, but I think if we added plakes, or changes behavior and are seeing flakes, we should do our best to address them

@sam-heilbron
Copy link

Step #1 - "run-tests": Step #7 - "run-tests": �[38;5;243m------------------------------�[0m
Step #1 - "run-tests": Step #7 - "run-tests": �[38;5;9m• [FAILED] [60.011 seconds]�[0m
Step #1 - "run-tests": Step #7 - "run-tests": �[0mDebug �[38;5;9m�[1m[It] should support the top level debug command and should populate the kube-state.log file�[0m
Step #1 - "run-tests": Step #7 - "run-tests": �[38;5;243m/workspace/gloo/projects/gloo/cli/pkg/cmd/debug/root_test.go:36�[0m
Step #1 - "run-tests": Step #7 - "run-tests": 
Step #1 - "run-tests": Step #7 - "run-tests":   �[38;5;243mTimeline >>�[0m
Step #1 - "run-tests": Step #7 - "run-tests":   ? This command will overwrite the "debug" directory, if present. Are you sure yo
Step #1 - "run-tests": Step #7 - "run-tests":   ? This command will overwrite the "debug" directory, if present. Are you sure  

Kicking tests like this is a way to enable test flakes to continue to permeate the codebase. Let's identify issues and assignees and understand how recently code merged that could have affected this.

Based on the error, I see kgateway-dev#10400 as the relevant issue. I think it's worth following up with the people involved there to let them know how frequently it's happening

Copy link

@sam-heilbron sam-heilbron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work!

@davidjumani
Copy link
Author

Kicking tests like this is a way to enable test flakes to continue to permeate the codebase. Let's identify issues and assignees and understand how recently code merged that could have affected this.

Based on the error, I see k8sgateway#10400 as the relevant issue. I think it's worth following up with the people involved there to let them know how frequently it's happening

Thanks Sam. I planned to update this issue once the PR was merged since this was a recurring failure. I've disabled the tests for now and commented the same on the issue

@davidjumani
Copy link
Author

Do we want to add an explicit e2e test to demonstrate this behavior? We have a good suite of these tests (https://github.com/solo-io/gloo/tree/main/test/kubernetes/e2e/features/validation) so if we added one, I imagined it would go into this space

I didn't think it was necessary since unit tests have been added, however I've added e2e tests as well

How does this functionality play with invalid route replacement: https://docs.solo.io/gloo-edge/latest/guides/traffic_management/configuration_validation/invalid_route_replacement/

This should have no effect since it only replaces invalid routes but here the warnings are just reported on the listener

Copy link

@ashishb-solo ashishb-solo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that the new test doesn't add a huge amount of value, but since it's done, let's move on to other things

@soloio-bulldozer soloio-bulldozer bot merged commit 31379b9 into main Dec 13, 2024
20 checks passed
@soloio-bulldozer soloio-bulldozer bot deleted the allow-listener-warnings branch December 13, 2024 15:30
davidjumani added a commit that referenced this pull request Dec 16, 2024
Co-authored-by: changelog-bot <changelog-bot>
Co-authored-by: Sam Heilbron <SamHeilbron@gmail.com>
davidjumani added a commit that referenced this pull request Dec 16, 2024
Co-authored-by: changelog-bot <changelog-bot>
Co-authored-by: Sam Heilbron <SamHeilbron@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Upstream not found when configuring opentelemetry collector should be a warning, not an error
4 participants