Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fetch all resources before watching them individually in KubernetesEndpointGroup #6154

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

ikhoon
Copy link
Contributor

@ikhoon ikhoon commented Mar 11, 2025

Motivation:

If whenReady() in KubernetesEndpointGroup is complete with partial endpoints, the nodes could receive heavy loads unexpectedly. To prevent this, it would be better to fetch all resources using GET methods at the beginning instead of watching, and then use WATCH incrementally to update the information.

Modifications:

  • Add start() method to initialize the resources at the beginning.
  • Remove the debounce logic, which is no longer necessary.
  • Add failInit() in DynamicEndpointGroup to notify an initialization failure.
  • Breaking) KubernetesEndpointGroup now requires get verbs in the RBAC rules to fetch endpoints.

Result:

KubernetesEndpointGroup.whenReady() is now complete after collecting all endpoints.

…ndpointGroup`

Motivation:

If `whenReady()` in `KubernetesEndpointGroup` completes with partial
endpoints, the nodes could receive heavy loads. To prevent this
, it would be better to fetch all resources using `GET` methods at the
beginning instead of watching, and then use `WATCh` incrementally to
update the information.

Modifications:

- Add `start()` method to initialize the resources at the beginning.
- Remove the debounce logic which is no longer necessary.
- Add `failInit()` in `DyanmicEndpointGroup` to notify an initialization
  failure.
- Breaking) `KubernetesEndpointGroup` now requires `get` verbs in the
  RBAC rules to fetch endpoints.

Result:

`KubernetesEndpointGroup.whenReady()` now completes after collecting all
endpoints.
@ikhoon ikhoon added the defect label Mar 11, 2025
@ikhoon ikhoon added this to the 1.32.1 milestone Mar 11, 2025
Copy link
Contributor

@minwoox minwoox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 👍 👍

@@ -251,8 +239,54 @@ public static KubernetesEndpointGroupBuilder builder(Config kubeConfig) {
this.portName = portName;
this.nodeAddressFilter = nodeAddressFilter;
this.autoClose = autoClose;
watchJob(this::watchNode);
watchJob(this::start);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename to safeRun?

Suggested change
watchJob(this::start);
safeRun(this::start);

Copy link
Contributor Author

@ikhoon ikhoon Mar 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

safeRun sounds like the job is executed in the current thread. Let me rename it to executeJob.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds good. 👍

@minwoox
Copy link
Contributor

minwoox commented Mar 11, 2025

Please also fix the test failure. 😉

@minwoox minwoox self-requested a review March 13, 2025 00:58
newServiceWatch.close();
return;
initial = false;
// Repeat until all `start()` requests are handled.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't it just to this only once when start() method is called concurrently? I'm not sure why it needs to handle all of them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants