Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid invalid retries on multiple replicas when querying #17370

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

kkewwei
Copy link
Contributor

@kkewwei kkewwei commented Feb 16, 2025

Description

Related Issues

Resolves #17361

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Copy link
Contributor

❌ Gradle check result for f235073: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@kkewwei
Copy link
Contributor Author

kkewwei commented Feb 16, 2025

❌ Gradle check result for f235073: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

MetadataCreateIndexServiceTests.testCreateIndexWithContextDisabled #17291

@kkewwei
Copy link
Contributor Author

kkewwei commented Feb 18, 2025

@msfroh, please have a look in your spare time. Thank you very much.

Copy link
Collaborator

@msfroh msfroh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @kkewwei ! This is a nice improvement.

I left a couple of minor cleanup comments. Overall, it looks good.

@kkewwei kkewwei force-pushed the avoid_invalid_query branch 3 times, most recently from 589f4d4 to c34cedc Compare February 19, 2025 01:21
Copy link
Contributor

❌ Gradle check result for c34cedc: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@kkewwei
Copy link
Contributor Author

kkewwei commented Feb 19, 2025

❌ Gradle check result for c34cedc: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

The flaky test may be related to #17357, I'm still finding the reason.

@kkewwei
Copy link
Contributor Author

kkewwei commented Feb 20, 2025

❌ Gradle check result for c34cedc: null
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

The flaky test may be related to #17357, I'm still finding the reason.

PendingTasksBlocksIT.testPendingTasksWithClusterNotRecoveredBlock #17369

Copy link
Contributor

❌ Gradle check result for 17a63ef: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@kkewwei kkewwei force-pushed the avoid_invalid_query branch from 17a63ef to ea2e674 Compare February 20, 2025 14:23
return (OpenSearchException.status(e).getStatus() / 100 != 4) && (e.getCause() instanceof TaskCancelledException == false)
// There exists a scenario where a primary shard (0 replicas) relocates and is in POST_RECOVERY on the
// target node but already deleted on the source node. Search request should still work.
|| (e.getCause() instanceof IndexNotFoundException);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@msfroh please help check whether is ok or not:
IndexNotFoundException is supported for testSearchWithRelocationAndSlowClusterStateProcessing
image

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would 429 circuit breaker exceptions be considered non-retriable?

@kkewwei
Copy link
Contributor Author

kkewwei commented Feb 20, 2025

❌ Gradle check result for 17a63ef: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

RemotePrimaryRelocationIT.testMixedModeRelocation_FailInFinalize #17364
ResourceAwareTasksTests.testTaskResourceTrackingDuringTaskCancellation #14293

@kkewwei kkewwei force-pushed the avoid_invalid_query branch from ea2e674 to fcfb2ba Compare February 20, 2025 14:42
Copy link
Contributor

❌ Gradle check result for fcfb2ba: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@kkewwei kkewwei force-pushed the avoid_invalid_query branch from fcfb2ba to e965796 Compare February 20, 2025 16:02
Copy link
Contributor

❌ Gradle check result for e965796: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: kkewwei <kewei.11@bytedance.com>
Signed-off-by: kkewwei <kkewwei@163.com>
@kkewwei kkewwei force-pushed the avoid_invalid_query branch from e965796 to d278191 Compare February 21, 2025 07:58
Copy link
Contributor

❌ Gradle check result for d278191: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Cluster Manager enhancement Enhancement or improvement to existing feature or request Search Search query, autocomplete ...etc
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

[Feature Request] Avoid invalid retries on multiple replicas when querying
3 participants