Decode legacy fixed-range TcfCaV1 (Canada) strings (backwards compatibility) #107
Open
chuff wants to merge 2 commits into
Per the GPP Consent String Specification, the Canada section encodes several fields as OptimizedRange / N-ArrayOfRanges, both of which use Fibonacci coding. They were using fixed-integer encoders.

- VendorExpressConsent, VendorImpliedConsent, DisclosedVendors (OptimizedRange): EncodableOptimizedFixedRange -> EncodableOptimizedFibonacciRange (new), backed by a new OptimizedFibonacciRangeEncoder.
- PubRestrictions (N-ArrayOfRanges(6,2), whose ids are an OptimizedRange): EncodableArrayOfFixedIntegerRanges -> EncodableArrayOfOptimizedFibonacciRanges (new). The fixed variant is left untouched for TCF EU.

Output is byte-identical to the master-based fix (PR IABTechLab#105): updated test vectors match (e.g. vendor section ...BhADVqxGAD0AILVgAA, PubRestrictions ...CCgAS7o).

This PR is the encoding fix only; backwards-compatible decoding of pre-fix (fixed-range) strings follows in a separate PR stacked on this one.

mvn test: 344 tests, 0 failures; spotless:check clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
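For readers unfamiliar with Fibonacci coding, the following is a minimal sketch of the textbook scheme: a positive integer is written as a sum of non-consecutive Fibonacci numbers (1, 2, 3, 5, 8, ...), one bit per Fibonacci number starting from the smallest, followed by a terminating 1 so that every code word ends in "11". The class and method names are hypothetical and this is not the repository's encoder; bit order and integration with the range datatypes are assumptions here.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: textbook Fibonacci coding, not the library's implementation.
class FibonacciCodingSketch {

  /** Encodes a positive integer as a '0'/'1' string that ends in "11". */
  static String encode(int value) {
    if (value < 1) {
      throw new IllegalArgumentException("Fibonacci coding needs value >= 1");
    }
    // Collect the Fibonacci numbers 1, 2, 3, 5, 8, ... that do not exceed the value.
    List<Long> fib = new ArrayList<>();
    long a = 1, b = 2;
    while (a <= value) {
      fib.add(a);
      long next = a + b;
      a = b;
      b = next;
    }
    // Greedily subtract from the largest Fibonacci number down (Zeckendorf representation).
    char[] bits = new char[fib.size() + 1];
    long remaining = value;
    for (int i = fib.size() - 1; i >= 0; i--) {
      if (fib.get(i) <= remaining) {
        bits[i] = '1';
        remaining -= fib.get(i);
      } else {
        bits[i] = '0';
      }
    }
    bits[fib.size()] = '1'; // terminator: produces the closing "11"
    return new String(bits);
  }

  /** Decodes one complete code word (including its terminating bit). */
  static int decode(String codeWord) {
    if (!codeWord.endsWith("11")) {
      throw new IllegalArgumentException("code word must end in 11");
    }
    long a = 1, b = 2, total = 0;
    // Sum the Fibonacci numbers selected by every bit except the terminator.
    for (int i = 0; i < codeWord.length() - 1; i++) {
      if (codeWord.charAt(i) == '1') {
        total += a;
      }
      long next = a + b;
      a = b;
      b = next;
    }
    return (int) total;
  }
}
```

Under this scheme `encode(4)` yields `1011` (1 + 3, then the terminator), and small values take far fewer bits than a fixed-width integer, which is why the two layouts are not interchangeable on the wire.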
Strings produced by the previous encoder used fixed-integer ranges for the Canada OptimizedRange / N-ArrayOfRanges fields. Decode them by trying the spec-compliant (Fibonacci) interpretation first and, when the consumed bits do not re-encode to it, re-reading them under the legacy interpretation:

- OptimizedFibonacciRangeEncoder.decode (VendorExpressConsent, VendorImpliedConsent, DisclosedVendors): in the range branch, fall back from a Fibonacci range to a fixed-integer range.
- EncodableArrayOfOptimizedFibonacciRanges.decode (PubRestrictions): fall back from OptimizedRange ids to a plain fixed-integer range (the previous EncodableArrayOfFixedIntegerRanges layout).

Adds BitString get/setReadIndex to mark/reset the read cursor for the round-trip check. Empty / bitfield-form data is unambiguous and unaffected.

Tests decode real pre-fix strings: a fixed-range PubRestrictions string and a full real-world string whose vendor lists use fixed-integer ranges.

Stacked on the encoding-fix PR; review/merge that first.

mvn test: 346 tests, 0 failures; spotless:check clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
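The mark / decode / re-encode / reset pattern described above can be sketched roughly as follows. Everything here is a hypothetical stand-in: `BitCursor` imitates only the get/setReadIndex idea, and the two toy formats (a width-prefixed "current" integer and a fixed 8-bit "legacy" integer) are assumptions chosen to keep the example short. They are not the Fibonacci or fixed-range layouts changed in this PR.

```java
// Illustrative only: round-trip disambiguation over a toy format, not the library's code.
class RoundTripFallbackSketch {

  /** Hypothetical stand-in for a bit string with a movable read cursor. */
  static final class BitCursor {
    private final String bits;
    private int readIndex = 0;

    BitCursor(String bits) {
      this.bits = bits;
    }

    int getReadIndex() {
      return readIndex;
    }

    void setReadIndex(int index) {
      readIndex = index;
    }

    /** Reads `width` bits as an unsigned integer and advances the cursor. */
    int readInt(int width) {
      if (readIndex + width > bits.length()) {
        throw new IllegalStateException("out of bits");
      }
      int value =
          width == 0 ? 0 : Integer.parseInt(bits.substring(readIndex, readIndex + width), 2);
      readIndex += width;
      return value;
    }

    String slice(int from, int to) {
      return bits.substring(from, to);
    }
  }

  /** Toy "current" format: 3-bit width prefix, then the value in that many bits. */
  static int decodeCurrent(BitCursor in) {
    int width = in.readInt(3);
    return in.readInt(width);
  }

  /** Canonical encoder for the toy "current" format (no leading zeros in the payload). */
  static String encodeCurrent(int value) {
    String payload = value == 0 ? "" : Integer.toBinaryString(value);
    String width = Integer.toBinaryString(payload.length());
    return ("000" + width).substring(width.length()) + payload;
  }

  /** Toy "legacy" format: a plain 8-bit integer. */
  static int decodeLegacy(BitCursor in) {
    return in.readInt(8);
  }

  /** Tries the current format first; falls back to legacy if the round trip fails. */
  static int decodeWithFallback(BitCursor in) {
    int mark = in.getReadIndex(); // remember where this field starts
    try {
      int value = decodeCurrent(in);
      String consumed = in.slice(mark, in.getReadIndex());
      if (consumed.equals(encodeCurrent(value))) {
        return value; // the consumed bits are exactly what the current encoder would emit
      }
    } catch (IllegalStateException outOfBits) {
      // Ran off the end under the current interpretation: treat as a failed round trip.
    }
    in.setReadIndex(mark); // rewind and re-read the same bits the legacy way
    return decodeLegacy(in);
  }
}
```

In this toy, `011101` decodes as 5 on the first attempt, while the legacy byte `01100101` fails the re-encode comparison and is re-read as 101. The check is only as strong as the format's canonical form: a legacy input whose leading bits happen to be a canonical current-format value (for example `00000101`, which reads as 0 here) would not trigger the fallback.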
Summary
Follow-up to #106 (the Canada OptimizedRange → Fibonacci encoding fix). This makes the decoder backwards compatible: strings produced by the previous encoder (fixed-integer ranges) still decode correctly.
Approach
Round-trip disambiguation at the datatype level — decode the spec-compliant (Fibonacci) way first and, if the consumed bits don't re-encode to it, reset the read cursor and re-read under the legacy interpretation:
- OptimizedFibonacciRangeEncoder.decode (VendorExpressConsent, VendorImpliedConsent, DisclosedVendors): in the range branch, fall back from a Fibonacci range to a fixed-integer range.
- EncodableArrayOfOptimizedFibonacciRanges.decode (PubRestrictions): the whole ids structure changed (plain fixed range → OptimizedRange), so fall back from OptimizedRange ids to a plain fixed-integer range (the previous EncodableArrayOfFixedIntegerRanges layout).

Adds BitString#getReadIndex / setReadIndex to mark/reset the read cursor for the round-trip check. Empty / bitfield-form data is unambiguous and takes the fast path unchanged.

Test plan

- mvn test: 346 tests, 0 failures
- mvn spotless:check: clean
- Tests decode real pre-fix strings: a fixed-range PubRestrictions string, and a full real-world string whose vendor lists use fixed-integer ranges (14 express + 46 implied vendors)

🤖 Generated with Claude Code