Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

e2e/features/tracing: reduce flakiness of entire suite, capture technical debt #10409

Merged
merged 9 commits into from
Nov 27, 2024
17 changes: 17 additions & 0 deletions changelog/v1.18.0-rc3/span-name-flake.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
changelog:
- type: NON_USER_FACING
issueLink: https://github.com/solo-io/gloo/issues/10365
resolvesIssue: false
description: >-
Remove the chance of a flake occurring due to an UpstreamNotFound error
- type: NON_USER_FACING
issueLink: https://github.com/k8sgateway/k8sgateway/issues/10327
resolvesIssue: false
description: >-
Remove the chance of a flake occurring due to an UpstreamNotFound error
- type: NON_USER_FACING
issueLink: https://github.com/k8sgateway/k8sgateway/issues/10293
resolvesIssue: false
description: >-
Mask a product UX issue by enabling some retry logic in a test. The attached
issue should resolve the bad UX -and- remove the retry logic introduced here.
29 changes: 28 additions & 1 deletion test/kubernetes/e2e/debugging.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,34 @@ The entry point for an e2e test is a Go test function of the form `func TestXyz(

Each feature suite is invoked as a subtest of the top level suite. The subtests use [testify](https://github.com/stretchr/testify) to structure the tests in the feature's test suite and make use of the library's assertions.

## Workflows
## Step 1: Setting Up A Cluster
### Using a previously released version
It is possible to run these tests against a previously released version of Gloo Gateway. This is useful for testing a release candidate, or a nightly build.

There is no setup required for this option, as the test suite will download the helm chart archive and `glooctl` binary from the specified release. You will use the `RELEASED_VERSION` environment variable when running the tests. See the [variable definition](/test/testutils/env.go) for more details.

### Using a locally built version
For these tests to run, we require the following conditions:
- Gloo Gateway Helm chart archive is present in the `_test` folder,
- `glooctl` is built in the `_output` folder
- A KinD cluster is set up and loaded with the images to be installed by the helm chart

[ci/kind/setup-kind.sh](/ci/kind/setup-kind.sh) gets run in CI to setup the test environment for the above requirements.
It accepts a number of environment variables, to control the creation of a kind cluster and deployment of Gloo resources to that kind cluster. Please refer to the script itself to see what variables are available.

Example:
```bash
CLUSTER_NAME=solo-test-cluster CLUSTER_NODE_VERSION=v1.30.0 VERSION=v1.0.0-solo-test ci/kind/setup-kind.sh
```

## Step 2: Running Tests
_To run the regression tests, your kubeconfig file must point to a running Kubernetes cluster:_
```
kubectl config current-context`
```
_should run `kind-<CLUSTER_NAME>`_

> Note: If you are running tests against a previously released version, you must set RELEASED_VERSION when invoking the tests

### Running a single feature's suite

Expand Down
24 changes: 22 additions & 2 deletions test/kubernetes/e2e/features/tracing/suite.go
Original file line number Diff line number Diff line change
Expand Up @@ -98,8 +98,19 @@ func (s *testingSuite) BeforeTest(string, string) {
otelcolSelector,
)

err = s.testInstallation.Actions.Kubectl().ApplyFile(s.ctx, tracingConfigManifest)
s.NoError(err, "can apply gloo tracing resources")
// Technical Debt!!
// https://github.com/k8sgateway/k8sgateway/issues/10293
// There is a bug in the Control Plane that results in an Error reported on the status
// when the Upstream of the Tracing Collector is not found. This results in the VirtualService
// that references that Upstream being rejected. What should occur is a Warning is reported,
// and the resource is accepted since validation.allowWarnings=true is set.
// We have plans to fix this in the code itself. But for a short-term solution, to reduce the
// noise in CI/CD of this test flaking, we perform some simple retry logic here.
s.EventuallyWithT(func(c *assert.CollectT) {
err = s.testInstallation.Actions.Kubectl().ApplyFile(s.ctx, tracingConfigManifest)
assert.NoError(c, err, "can apply gloo tracing resources")
}, time.Second*5, time.Second*1, "can apply tracing resources")

// accept the upstream
// Upstreams no longer report status if they have not been translated at all to avoid conflicting with
// other syncers that have translated them, so we can only detect that the objects exist here
Expand All @@ -109,6 +120,7 @@ func (s *testingSuite) BeforeTest(string, string) {
otelcolUpstream.Namespace, otelcolUpstream.Name, clients.ReadOpts{Ctx: s.ctx})
},
)

// accept the virtual service
s.testInstallation.Assertions.EventuallyResourceStatusMatchesState(
func() (resources.InputResource, error) {
Expand Down Expand Up @@ -156,6 +168,10 @@ func (s *testingSuite) TestSpanNameTransformationsWithoutRouteDecorator() {
curl.WithHostHeader(testHostname),
curl.WithPort(gatewayProxyPort),
curl.WithPath(pathWithoutRouteDescriptor),
// We are asserting that a request is consistent. To prevent flakes with that assertion,
// we should have some basic retries built into the request
curl.WithRetryConnectionRefused(true),
curl.WithRetries(3, 0, 10),
curl.Silent(),
},
&matchers.HttpResponse{
Expand Down Expand Up @@ -183,6 +199,10 @@ func (s *testingSuite) TestSpanNameTransformationsWithRouteDecorator() {
curl.WithHostHeader("example.com"),
curl.WithPort(gatewayProxyPort),
curl.WithPath(pathWithRouteDescriptor),
// We are asserting that a request is consistent. To prevent flakes with that assertion,
// we should have some basic retries built into the request
curl.WithRetryConnectionRefused(true),
curl.WithRetries(3, 0, 10),
curl.Silent(),
},
&matchers.HttpResponse{
Expand Down
Loading