fallback to GetAddrInfoW if GetAddrInfoExW fails#45816
Conversation
|
Tagging subscribers to this area: @dotnet/ncl Issue DetailsWhile root cause why GetAddrInfoExW() is failing from impersonificated context is still investigated this change should make Async DNS working again on Windows when running from impersonificated context. We do not have infrastructure for automated authenticatiobn tests but I run the repro code manually on my Windows 10 and it works now eg. both explicit DNS lookup and HttpClient Async fetch works.
|
|
Also, is it possible to add a test for these changes? |
scalablecory
left a comment
There was a problem hiding this comment.
have we talked to Windows team to see if there's a way we can keep using async here?
no, not really. This was part of the Enterprise-auth project but it got de-prioritized. |
Co-authored-by: Cory Nelson <phrosty@gmail.com>
…olutionPal.Windows.cs Co-authored-by: Jan Kotas <jkotas@microsoft.com>
|
/azp run runtime-libraries-coreclr outerloop |
|
Azure Pipelines successfully started running 1 pipeline(s). |
| return result; | ||
| if (NameResolutionPal.TryGetAddrInfoAsync(hostName, justAddresses, addressFamily, cancellationToken, out t)) | ||
| { | ||
| t.ContinueWith((resultTask) => { |
There was a problem hiding this comment.
We should be returning the Task returned by ContinueWith, or the ActivityId following the Dns resolution wouldn't reflect the Stop event.
This was the issue with the original Telemetry instrumentation of Dns, pointed out by @ManickaP in #41670 (comment) (and fixed in #42188).
There was a problem hiding this comment.
Yep, this will break the telemetry once again.
There was a problem hiding this comment.
I had different change where telemetry test was failing. With this on, telemetry tests are still passing. The log still be edited but I'm not sire about the ID. If we return the ContinueWith, we will need to copy result (or failure). I will look into it.
There was a problem hiding this comment.
The tests will only validate that the Start/Stop are logged with the same ActivityId.
The issue here appears when you have another Start/Stop pair following the name resolution:
- HttpRequestStart starts with ID
/1/ - NameResolutionStart starts with nested ID
/1/1/ - NameResolutionStop correctly sees
/1/1/and updates it to/1/. - SocketConnectStart starts on the same async context as
NameResolutionStart, but it never observes the change of the ID back to/1/, so it starts with a nested ID/1/1/1
It will appear as if the Socket connect is an operation nested under the NameResolution Start/Stop, when in reality it's only following it.
There was a problem hiding this comment.
I re-factor the code and the activity ID is right as far as I can tell as well as running impersonificated still works.
I went back to the await path as the ContinueWith was getting complicated more than I wanted. The only downside is that we will miss time of the call it self.
I feel it probably does not matter comparing duration of the actual resolution.
We could probably start the timer upfront but I'm not sure if that is worth of the complication.
There was a problem hiding this comment.
after chatting with @MihaZupan , I decided to fix the stopwatch as well.
This should be ready for another review.
| // instead of calling the synchronous version in the ThreadPool. | ||
| // If the OS supports it and 'hostName' is not an IP Address, resolve the name asynchronously | ||
| // instead of calling the synchronous version in the ThreadPool. | ||
| // If it fails, we will fall-back to ThreadPool as well. |
There was a problem hiding this comment.
| // If it fails, we will fall-back to ThreadPool as well. | |
| // If it fails, we will fall back to ThreadPool as well. |
| { | ||
| // WSATRY_AGAIN indicates possible problem with reachability according to docs. | ||
| // However, if servers are really unreachable, we would still get IOPending here | ||
| // and final result would be posted via overlapped IO. |
There was a problem hiding this comment.
I'm not understanding the above comment.
There was a problem hiding this comment.
WSATRY_AGAIN is valued error code for failures other than we see in the impersonification case. What I'm trying to say here is based on my testing and understanding, the other WSATRY_AGAIN would not be synchronous failure.
In such cases, we should return transient failure to caller rather than re-trying on thread-pool IMHO.
| { | ||
| result = await ((Task<T>)NameResolutionPal.GetAddrInfoAsync(hostName, justAddresses, addressFamily, cancellationToken)).ConfigureAwait(false); | ||
| return result; | ||
| //return new Tuple<bool, Task>(true, (Task)InternalGetAddrInfoWithTelemetryAsync(task, hostName)); |
|
System.Speech.Tests test failure seems unrelated. |
|
/backport to release/5.0 |
|
Started backporting to release/5.0: https://github.com/dotnet/runtime/actions/runs/479577114 |
|
@wfurt backporting to release/5.0 failed, the patch most likely resulted in conflicts: $ git am --3way --ignore-whitespace --keep-non-patch changes.patch
Applying: fallback to GetAddrInfoW if GetAddrInfoExW fails
error: sha1 information is lacking or useless (src/libraries/System.Net.NameResolution/src/System/Net/Dns.cs).
error: could not build fake ancestor
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 fallback to GetAddrInfoW if GetAddrInfoExW fails
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".
Error: The process '/usr/bin/git' failed with exit code 128Please backport manually! |
While root cause why GetAddrInfoExW() is failing from impersonificated context is still investigated this change should make Async DNS working again on Windows when running from impersonificated context.
We do not have infrastructure for automated authenticatiobn tests but I run the repro code manually on my Windows 10 and it works now eg. both explicit DNS lookup and HttpClient Async fetch works.
fixes #29935