Swift: Extend swift/weak-sensitive-data-hashing, swift/weak-password-hashing sinks#21898
Swift: Extend swift/weak-sensitive-data-hashing, swift/weak-password-hashing sinks#21898geoffw0 wants to merge 10 commits into
Conversation
…est stubs (this is needed for new cases to work as intended).
…that are calls through a metatype.
There was a problem hiding this comment.
Pull request overview
This PR extends the Swift security models for weak hashing to correctly treat CryptoKit hashing calls as sinks even when the call is qualified by a metatype (for example Insecure.MD5.Type / Insecure.MD5.self), and updates the Swift query tests and expected results accordingly.
Changes:
- Added additional CryptoKit sink rows for
hash(bufferPointer:)across relevant weak-hashing queries. - Introduced metatype-qualified sink matching for weak sensitive-data hashing and weak password hashing.
- Updated CWE-328 Swift tests and
.expectedoutputs; added a change note documenting the analysis-impacting change.
Show a summary per file
| File | Description |
|---|---|
| swift/ql/lib/codeql/swift/security/WeakSensitiveDataHashingExtensions.qll | Adds CryptoKit hash(bufferPointer:) sinks and a metatype-qualified sink class for MD5/SHA1. |
| swift/ql/lib/codeql/swift/security/WeakPasswordHashingExtensions.qll | Adds CryptoKit hash(bufferPointer:) sinks and a metatype-qualified sink class for SHA256/SHA384/SHA512. |
| swift/ql/test/query-tests/Security/CWE-328/testCryptoKit.swift | Makes stubs more CryptoKit-like, adds new test cases for metatype-qualified calls, and fixes a test function name typo. |
| swift/ql/test/query-tests/Security/CWE-328/WeakSensitiveDataHashing.expected | Updates expected results to reflect new sink coverage and added test cases. |
| swift/ql/test/query-tests/Security/CWE-328/WeakPasswordHashing.expected | Updates expected results to reflect new sink coverage and added test cases. |
| swift/ql/src/change-notes/2026-05-26-hashing-sinks.md | Documents that the queries may produce additional results after improved sink recognition. |
Copilot's findings
- Files reviewed: 6/6 changed files
- Comments generated: 3
| @@ -76,3 +78,21 @@ private class DefaultWeakSenitiveDataHashingSink extends WeakSensitiveDataHashin | |||
|
|
|||
| let value1 = Data(cardNumber.utf8); | ||
| let _digest1 = Insecure.MD5.hash(data: value1); // BAD | ||
|
|
||
| let value2 = Data(cardNumber.utf8); | ||
| let hasher2 = Insecure.MD5.self; // metatype | ||
| let _digest2 = hasher2.hash(data: value2); // BAD | ||
|
|
||
| let value3 = Data(cardNumber.utf8); | ||
| let _digest3 = (Insecure.MD5.self).hash(data: value3); // BAD | ||
|
|
||
| let value4 = Data(cardNumber.utf8); | ||
| testReceiver1(value: value4); | ||
|
|
||
| let value5 = Data(cardNumber.utf8); | ||
| testReceiver2(hasher: Insecure.MD5.self, value: value5); | ||
|
|
||
| let value6 = Data(cardNumber.utf8); | ||
| testReceiver3(hasher: Insecure.MD5.self, value: value6); |
There was a problem hiding this comment.
I see what you mean, but consistency is not a goal for tests like these. We actually want as much variation as we can to maximize test coverage (though it is unlikely the semicolons will make any difference).
|
Is there actually a bug in MaD here, and should we be using the predicate I added here: #21831? |
|
LGTM if DCA is happy. You might want to address at least the two typos Copilot spotted, and fix the formatting of course. |
Yes, this can probably be described as either a bug or a missing feature, possibly in
What do you think
Concerned addressed, or in one case, answered. |
Extend sinks for
swift/weak-sensitive-data-hashingandswift/weak-password-hashing, specifically the sinks that modelCryptoKit. It turns out many calls to these functions are (implicitly) made with a metatype as the qualifier (e.g.Insecure.MD5.Typerather thanInsecure.MD5), which the models-as-data sinks don't match (reliably?). The new sinks added here accept calls that have the metatype as qualifier.This was seen in the wild but I had to make the tests more realistic before it affected tests - see commit-by-commit, where you can see many results lost when I made the tests closer to reality, then gained back with the new models. On the real world database we simply gain a result that was previously missed.
MRVA also shows a handful of new results (for
swift/weak-sensitive-data-hashing); I'll start a DCA run shortly as well.It seems likely we miss other sources / sinks besides these ones due to similar patterns involving metatypes. There's an opportunity to improve modelling here - but a better solution would be to resolve metatypes for models-as-data, so that these new models are not required. This is left as future work.