Replace MD5-based deterministic GUIDs with XxHash128#7723
Conversation
…hm and add backward compatibility warning
…generation since we do not need to be backward compatible there
…nd future removal
…tion and improve memory management
…age, handle empty input efficiently with precomputed GUID, and improve overall performance.
…istency across multiple scopes
…GetMaxByteCount for buffer allocation
…nd future removal; add AppContext switch remarks for backward compatibility
c1baf2f to
b54f2f6
Compare
@danielmarbach why not be more explicit with the key its implementation and intend like:
As HostID is not really used by customers and only by our platform the concern that a user did not even consciously opted into this new behavior is not a big risk. |
|
Unfortunately, this is not private to the platform. We have guidance on how to set it over the APIs too, which by default "makes it public". We have looked at this in detail, and the consequence of just flipping it means once you upgrade, you effectively have doubled your number of endpoints. That surprising element plus the support cases that might create in us first thinking something in ServiceControl might be broken is enough reason to make this a more "cautious" change. |
Co-authored-by: David Boike <david.boike@gmail.com>
Introduces a new
DeterministicGuidimplementation based on XxHash128 (non-cryptographic, RFC 9562 version 8 GUIDs) and migrates host ID generation and the learning saga persister away from the legacy MD5-based algorithm.Why move away from MD5?
FIPSAlgorithmIsSuitedExceptionon systems with FIPS policy enabled (seeHostingComponent.Settings.cs). Using a cryptographic hash for something that doesn't require cryptographic properties is incorrect.Alternatives considered
DeterministicGuidinto each package that needs it. Rejected because it creates maintenance burden and divergent implementations.DeterministicGuidpublic in NServiceBus.Core but leaveLegacyDeterministicGuidforever: Rejected because keeping MD5 around indefinitely means carrying a FIPS-incompatible, unnecessarily expensive code path that shouldn't be used by anyone.DeterministicGuidin core, migrate callers where IDs are local/ephemeral (learning persister), use an AppContext switch for host IDs where backward compatibility matters, and mark the legacy path for removal in v12.New
DeterministicGuidAPIThe
NServiceBus.Utils.DeterministicGuidclass provides several overloads:Create(string)/Create(ReadOnlySpan<char>)— hash the UTF-8 encoding of the stringCreate(ReadOnlySpan<byte>)— hash raw binary data directlyCreate(params ReadOnlySpan<string>)— hash multiple values with length-prefix framing to prevent concatenation ambiguity (e.g.,Create("ab", "c")≠Create("a", "bc"))All overloads produce deterministic version 8 GUIDs. The same input always yields the same GUID.
Performance optimizations in
Create(params ReadOnlySpan<string>)GetMaxByteCount(O(1)) for buffer sizing instead ofGetByteCount(O(n)), enabling oneArrayPool.Rentfor the entire operation instead of per-value rent/return cycles.GetBytesdirectly into the contiguous buffer — no separateGetByteCountscan.XxHash128.Appendcalls from two per value to one.Create()with no values returns a precomputedstatic readonly Guidinstead of allocating a hash state.Breaking change: Learning Saga Persister IDs
The
LearningSagaIdGeneratornow usesDeterministicGuidinstead ofLegacyDeterministicGuid(MD5). This means saga IDs in the learning persister will change. Existing learning persister saga state files will not be found after upgrading.This is an acceptable breaking change because:
Host ID changes (opt-in via AppContext switch)
Host ID generation (
HostingComponent.Settings.GenerateHostId) now defaults to the legacy MD5 algorithm for backward compatibility. Endpoint host IDs will remain unchanged unless the AppContext switchNServiceBus.Core.Hosting.UseV2DeterministicGuidis explicitly set totrue.A
PreObsoleteannotation has been added to the legacy switch andLegacyDeterministicGuidclass, documenting that in v11 the new algorithm becomes the default and in v12 both will be removed.Technically using
PreObsoleteis slightly awkward because we will never be turning this into a real obsolete but it seemed still be the most pragmatic choice because we have explicit processes that are supposed to look forPreObsoleteattributes.