
xds/cdsbalancer: changed the setupManagementServer helper to take listener and OnStreamReq as parameters #8467

Merged
eshitachandwani merged 12 commits into grpc:master from eshitachandwani:eshitachandwani-patch-2
Aug 18, 2025

@eshitachandwani eshitachandwani commented Jul 22, 2025

RELEASE NOTES: N/A

Fixes: #8462

The main issue was that resource requests were being dropped. The test used a non-blocking send for requested resources together with a channel buffer size of just one, so a resource request update was dropped whenever the receiver was not ready to receive at that exact moment.
Fix:
Changed setupManagementServer to take the listener and the OnStreamReq function as parameters, and added a blocking send in the TestWatcher whenever a cluster resource is requested.

@eshitachandwani eshitachandwani added this to the 1.75 Release milestone Jul 22, 2025
@eshitachandwani eshitachandwani added the Type: Testing and Area: xDS (includes everything xDS related, including LB policies used with xDS) labels Jul 22, 2025
@eshitachandwani eshitachandwani requested a review from easwars July 22, 2025 07:06
codecov bot commented Jul 22, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 81.97%. Comparing base (62ec29f) to head (97b05ce).
⚠️ Report is 13 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8467      +/-   ##
==========================================
- Coverage   82.36%   81.97%   -0.39%     
==========================================
  Files         413      413              
  Lines       40532    40518      -14     
==========================================
- Hits        33383    33216     -167     
- Misses       5781     5941     +160     
+ Partials     1368     1361       -7     

see 47 files with indirect coverage changes


@eshitachandwani
Member Author

After an offline discussion, we decided to make this a blocking send and to add new helper functions that do not write to the channel on resource requests, for use by tests that do not need to verify resource requests.

@arjan-bal arjan-bal requested review from arjan-bal and removed request for arjan-bal July 24, 2025 11:50
Contributor

@arjan-bal arjan-bal left a comment

Dismissing my previous review.

@arjan-bal arjan-bal requested review from arjan-bal and removed request for arjan-bal July 24, 2025 11:53
@arjan-bal arjan-bal dismissed their stale review July 24, 2025 12:00

Changes requested by main reviewer. Will need a re-review

@arjan-bal arjan-bal removed their request for review July 24, 2025 12:00
Contributor

@easwars easwars left a comment

Also, can you please verify that, with your changes, all tests in the cds balancer package run without flakes on forge? There is a way to directly import changes from a PR. Let me know if you cannot figure it out. Thanks.

// - Creates a manual resolver that configures the cds LB policy as the
// top-level policy, and pushes an initial configuration to it
// - Creates a gRPC channel with the above manual resolver
// - Executes OnStreamRequest callback provided by the caller
Contributor

This doesn't make sense. This should be part of the first bullet point. Something like:

  • Spins up an xDS management server and passes it the onStreamRequest callback that will get invoked every time a request is received on the ADS stream.

Member Author

Done.

Contributor

We don't want the bullet point "// - Executes OnStreamRequest callback provided by the caller", because we don't directly call it. We only pass it to the function that creates the management server.

case 0:
	select {
	case cdsResourceCanceledCh <- struct{}{}:
	default:
Contributor

Do we still need this default case here?

Member Author

changed.

if len(req.GetResourceNames()) == 0 {
	select {
	case cdsResourceCanceledCh <- struct{}{}:
	default:
Contributor

And here?

Member Author

changed.

@easwars easwars assigned eshitachandwani and unassigned easwars Aug 13, 2025
Contributor

@easwars easwars left a comment

LGTM, modulo one minor comment.


@eshitachandwani eshitachandwani merged commit 9ac0ec8 into grpc:master Aug 18, 2025
15 checks passed
@eshitachandwani eshitachandwani changed the title from "xds/cdsbalancer: increase buffer size of requested resource channel in test" to "xds/cdsbalancer: changed the setupManagementServer helper to take listener and OnStreamReq as parameter" Aug 19, 2025
@eshitachandwani eshitachandwani deleted the eshitachandwani-patch-2 branch August 22, 2025 09:51
dimpavloff pushed a commit to dimpavloff/grpc-go that referenced this pull request Aug 22, 2025
…n test (grpc#8467)

RELEASE NOTES: N/A

Fixes: grpc#8462

The main issue was that resource requests were being dropped. The test used a
[non-blocking
send](https://github.com/grpc/grpc-go/blob/a5e7cd6d4c2c31b1e6649789c2ddc9a82ad6b5fa/xds/internal/balancer/cdsbalancer/cdsbalancer_test.go#L222C5-L227C6)
for requested resources together with a channel buffer size of just
[one](https://github.com/grpc/grpc-go/blob/a5e7cd6d4c2c31b1e6649789c2ddc9a82ad6b5fa/xds/internal/balancer/cdsbalancer/cdsbalancer_test.go#L210),
so a resource request update was dropped whenever the
receiver was not ready to receive at that exact moment.
Fix:
Changed `setupManagementServer` to take the `listener` and the `OnStreamReq`
function as parameters, and added a blocking send in the `TestWatcher`
whenever a cluster resource is requested.
eshitachandwani added a commit to eshitachandwani/grpc-go that referenced this pull request Aug 26, 2025
…n test (grpc#8467)

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 19, 2026

Labels

Area: xDS (includes everything xDS related, including LB policies used with xDS), Type: Testing

Development

Successfully merging this pull request may close these issues.

Flaky test: Test/Watchers

4 participants