xdsclient: create LRSClient at time of initialisation #8483
eshitachandwani merged 12 commits into grpc:master
Codecov Report
❌ Patch coverage is
Additional details and impacted files

    @@            Coverage Diff             @@
    ##           master    #8483      +/-   ##
    ==========================================
    + Coverage   81.86%   82.04%   +0.17%
    ==========================================
      Files         412      412
      Lines       40518    40465      -53
    ==========================================
    + Hits        33172    33200      +28
    + Misses       5953     5887      -66
    + Partials     1393     1378      -15
        TransportBuilder: gConfig.TransportBuilder,
    })
    if err != nil {
        return nil, err
Now that we have moved it out of report load, we might have to think a bit more here. Should an error in LRS client creation be fatal? Not everyone is going to use the internal xdsclient for load reporting.
In my opinion, we should keep the same behaviour as before the xDS client migration changes. If certain users need to ignore LRS client creation failures, we can create a new issue to discuss whether the behaviour change makes sense.
I think it makes sense to retain the behavior of making LRS client creation failures be fatal. But we might want to change one minor thing in lrsclient.New. Currently it fails if node ID is empty in the configuration. We recently removed that check for the xDS client creation. I'm guessing other languages might not treat this as fatal for LRS creation.
@eshitachandwani : Could you please check what the other languages do and if required remove the check for empty node ID in lrsclient.New. Thanks.
If I understand correctly, Java checks for non-null here
That check is for the whole node proto or struct. Not just the node ID field.
Sorry, that was a mistake. I think this is where they check that the node ID is not null in Java:
https://github.com/grpc/grpc-java/blob/f50726d32e216746642513add28e086094ce5506/xds/src/main/java/io/grpc/xds/client/EnvoyProtoData.java#L79
That is a private constructor. It is called from the builder below, which sets the id to an empty string by default. In Java, empty and null strings are different.
Ahh! Okay! Got it!
    for i := 0; i < numGoroutines; i++ {
        go func() {
            defer wg.Done()
            _, cancelStore := client.ReportLoad(serverConfig)
Would it make sense to have a loop here as well?
I'm not sure I understand. A loop for what?
A loop inside the goroutine to start reporting load and subsequently cancel it. What I'm asking for is:

    for i := 0; i < numGoroutines; i++ {
        go func() {
            defer wg.Done()
            for j := 0; j < 100; j++ {
                _, cancelStore := client.ReportLoad(serverConfig)
                cancelStore(ctx)
            }
        }()
    }
You mean several ReportLoad() calls from one goroutine? Is this to better reproduce the real-life case, or to increase the chances of catching the race?
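For reference, the nested-loop shape suggested above can be tried in isolation. This is only a sketch: `fakeClient`, `reportLoad`, and `stress` are hypothetical stand-ins, not the real xDS client or its `ReportLoad` signature (the real call takes a server config and its cancel function takes a context). Running it with `go run -race` is what gives the repeated start/cancel cycles a chance to surface a data race.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// fakeClient is a hypothetical stand-in for the client under test.
type fakeClient struct {
	active atomic.Int64 // load reports started but not yet cancelled
	total  atomic.Int64 // load reports started overall
}

// reportLoad mimics the shape of a ReportLoad call: it starts a report
// and returns a function that cancels it. The cancel is idempotent.
func (c *fakeClient) reportLoad() (cancel func()) {
	c.active.Add(1)
	c.total.Add(1)
	var once sync.Once
	return func() { once.Do(func() { c.active.Add(-1) }) }
}

// stress runs numGoroutines goroutines, each performing `iterations`
// report-then-cancel cycles, and waits for all of them to finish.
func stress(c *fakeClient, numGoroutines, iterations int) {
	var wg sync.WaitGroup
	wg.Add(numGoroutines)
	for i := 0; i < numGoroutines; i++ {
		go func() {
			defer wg.Done()
			for j := 0; j < iterations; j++ {
				cancel := c.reportLoad()
				cancel()
			}
		}()
	}
	wg.Wait()
}

func main() {
	c := &fakeClient{}
	stress(c, 10, 100)
	// Every started report should have been cancelled by now.
	fmt.Println(c.total.Load(), c.active.Load())
}
```

The inner loop does not model real usage more closely; it mainly multiplies the number of interleavings each goroutine contributes, which raises the odds that the race detector sees a conflicting access.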
    case config.Node.ID == "":
        return nil, errors.New("lrsclient: node ID in node is empty")
I believe reverting this behaviour change should also be mentioned in the release notes.
The release notes should also mention the bug that is fixed by this PR.
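To make the behaviour change in this thread concrete, here is a rough sketch of a constructor-side validation that no longer rejects an empty node ID. The `config`, `node`, and `validate` names are hypothetical and the remaining check is only illustrative; this is not the actual `lrsclient.New` code.

```go
package main

import (
	"errors"
	"fmt"
)

// node and config are hypothetical, simplified stand-ins for the
// configuration passed when creating an LRS client.
type node struct {
	ID string
}

type config struct {
	Node             node
	TransportBuilder any // hypothetical placeholder for a transport builder
}

// validate keeps a required-field check but deliberately has no case
// for an empty node ID, so such configurations are accepted.
func validate(c config) error {
	switch {
	case c.TransportBuilder == nil:
		return errors.New("transport builder is nil")
	}
	return nil
}

func main() {
	// Empty node ID, transport builder set: accepted.
	fmt.Println(validate(config{TransportBuilder: struct{}{}}))
	// Missing transport builder: still rejected.
	fmt.Println(validate(config{Node: node{ID: "n1"}}))
}
```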
Fixes: grpc#8474

The race is in the [ReportLoad](https://github.com/grpc/grpc-go/blob/9186ebd774370e3b3232d1b202914ff8fc2c56d6/xds/internal/xdsclient/clientimpl_loadreport.go#L35C2-L44C21) function of clientImpl. The implementation was recently changed as part of the [xds client migration](grpc@082a927). The [comment](https://github.com/grpc/grpc-go/blob/85240a5b02defe7b653ccba66866b4370c982b6a/xds/internal/xdsclient/clientimpl.go#L86C2-L87C16) says that `lrsclient.LRSClient` should be initialized only at creation time, but that was not the case. It was being initialized at the time of calling the `ReportLoad` function.

RELEASE NOTES:
- lrsclient:
  - Fix a race condition where the `LRSClient` was not initialized at creation time but was instead initialized at the time of calling the `ReportLoad` function.
  - Creating an `LRSClient` no longer requires a node ID.
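The fix described above can be illustrated with a small sketch. All names here (`lrsClient`, `newLRSClient`, `client`, `newClient`, `reportLoad`) are hypothetical stand-ins rather than the real grpc-go types. The point is only the pattern: the LRS client field is written once in the constructor, creation failure is returned as a fatal error, and the load-reporting path only reads the field, so concurrent callers never race on its initialization.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// lrsClient is a hypothetical stand-in for an LRS client.
type lrsClient struct{}

// newLRSClient is a hypothetical constructor that can be made to fail.
func newLRSClient(fail bool) (*lrsClient, error) {
	if fail {
		return nil, errors.New("lrs client creation failed")
	}
	return &lrsClient{}, nil
}

// client is a hypothetical stand-in for the xDS client implementation.
type client struct {
	lrs *lrsClient // set once in newClient, read-only afterwards
}

// newClient creates the LRS client eagerly. A creation failure is
// treated as fatal for the whole client, as discussed in the review.
func newClient(fail bool) (*client, error) {
	lrs, err := newLRSClient(fail)
	if err != nil {
		return nil, err
	}
	return &client{lrs: lrs}, nil
}

// reportLoad only reads c.lrs, so concurrent calls need no lock.
func (c *client) reportLoad() *lrsClient {
	return c.lrs
}

func main() {
	c, err := newClient(false)
	if err != nil {
		panic(err)
	}
	seen := make([]*lrsClient, 8)
	var wg sync.WaitGroup
	for i := range seen {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			seen[i] = c.reportLoad()
		}(i)
	}
	wg.Wait()
	for _, s := range seen {
		if s != c.lrs {
			panic("goroutine observed a different LRS client")
		}
	}
	fmt.Println("all goroutines observed the same LRS client")
}
```

A lazily initialized field would instead need a mutex or `sync.Once` around the first write; moving the write into the constructor avoids that synchronization entirely.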
Original PRs: #8476, #8483
Related issues: #8473, #8474

RELEASE NOTES:
- xds: Revert to allowing empty node ID in xDS bootstrap configuration
- lrsclient:
  - Fix a race condition where the LRSClient was not initialized at creation time but it was being initialized at the time of calling the ReportLoad function.
  - Creating an LRSClient no longer requires a node ID.

---------

Co-authored-by: Sotiris Nanopoulos <sotiris.nanopoulos@reddit.com>