do not dedupe alternate schema source expresions#478
Conversation
There was a problem hiding this comment.
Hey! Thanks for the contribution. I'm unable to reproduce this behaviour. I set up a standard Rails 7 app with a scaffolded resource named posts and added the following SecureHeaders initializer:
SecureHeaders::Configuration.default do |config|
config.csp = {
default_src: %w(* 'unsafe-inline' blob:),
script_src: %w('self' blob: 'unsafe-inline' data:),
connect_src: %w('self' *.example.com wss://*.example.com),
}
endAnd the following inline script in the resources #index page:
<script>
fetch('https://example.com');
</script>
<script>
webSocket = new WebSocket('wss://example.com');
</script>The CSP returned from Secure Headers appears to be:
Content-Security-Policy: default-src * 'unsafe-inline' blob:; block-all-mixed-content; connect-src 'self' *.example.com wss://*.example.com; script-src 'self' blob: 'unsafe-inline' data:And the WebSocket/fetch appear to succeed (insofar as they can on example.com).
Copying the test from spec/lib/secure_headers/headers/content_security_policy_spec.rb to master passes. Could you provide a failing test case? Happy to hop on a call and discuss if that's easier!
|
On Slack, Keith and I determined that the bug occurs only with a combination of a non-wildcard |
|
Thanks for discussing over Slack @JackMc! I've addressed both the badly written test and the precomputation of schemes. |
|
@keithamus @JackMc are we cool to merge this change? |
|
I don't see any problem with it! |
|
I don't have merge access. Could you click the button? |
This PR removes `dedup_source_list` and replaces it with a simple `.uniq` call. This resolves #491, which is only the latest in a series of ongoing issues with source expression deduplication. `secure_headers` has had this feature [since 2015](32bb3f5) that [deduplicates redundant URL source expressions](https://github.com/github/secure_headers/blob/494b75ff927464ed8d1c43e98e41fe4d15ce2bdf/lib/secure_headers/headers/content_security_policy.rb#L157-L170). For example, if `*.github.com` is listed as a source expression for a given [directive](https://w3c.github.io/webappsec-csp/#framework-directives), then the addition of `example.github.com` would have no effect, and so the latter can be safely removed by `secure_headers` to save bytes. Unfortunately, this implementation has had various bugs due to the use of "impedance mismatched" APIs like [`URI`](https://docs.ruby-lang.org/en/2.1.0/URI.html)[^1] and [`File.fnmatch`](https://apidock.com/ruby/v2_5_5/File/fnmatch/class)[^2]. For example, it made incorrect assumptions about source expression schemes, leading to the following series of events: [^1]: Which allows wildcards in domains but not for ports, as it is not designed to parse URL source expressions. [^2]: Which has general glob matching that is not designed for URL source expressions either. - 2017-03: A [bug was reported and confirmed](#317) - 2022-04: The bug was finally [fixed by `@keithamus` (a Hubber) in 2022](#478) due to our use of web sockets. - 2022-06: This fix in turn triggered a [new bug](#491) with source expressions like `data:`. - 2022-06: An external contributor [submitted a fix for the bew bug](#490), but this still doesn't address some of the "fast and loose" semantic issues of the underlying implementation. - 2022-08: `@lgarron` [drafted a new implementation](#498) that semantically parses and compares source expressions based on the specification for source expressions. - This implementation already proved to have some value in early testing, as its stricter validation caught an issue in `github.com`'s CSP. However, it would take additional work to make this implementation fully aware of CSP syntax (e.g. not allowing URL source expressions in a source directive when only special keywords are allowed, and vice-versa), and it relies on a new regex-based implementation of source expression parsing that may very well lead to more subtle bugs. In effect, this is a half feature whose maintenance cost has outweighed its functionality: - The relevant code has suffered from continued bugs, described as above. - Deduplication is purely a "nice-to-have" — it is not necessary for the security or correct functionality of `secure_headers`. - It was [introduced by `@oreoshake` (the then-maintainer) without explanation in 2015](32bb3f5), never "officially" documented. We have no concrete data on whether it has any performance impact on any real apps — for all we know, uncached deduplication calculations might even cost more than the saved header bytes. - Further, in response to the first relevant bug, `@oreoshake` himself [said](#317 (comment)): > I've never been a fan of the deduplication based on `*` anyways. Maybe we should just rip that out. > Like people trying to save a few bytes can optimize elsewhere. So this PR completely removes the functionality. If we learn of a use case where this was very important (and the app somehow can't preprocess the list before passing it to `secure_headers`), we can always resume consideration of one of: - #490 - #498
All PRs:
Documentation updatedN/AAdding a new header
Generally, adding a new header is always OK.
Adding a new CSP directive
This correct a bug in the
dedup_source_listfunction, which causes sources with different protocols to be removed, which breaks possible CSP as headers as the spec only allows for current protocol for schemaless source expressions.Consider the following
connect-src: *.example.com wss://*.example.com. This effectively allows for http and websocket connections. The current version ofsecure_headerswill minify this to beconnect-src: *.example.comwhich only allows for http connections, disallowing websocket connections.The fix here is to check that the schemes match for wildcard sources, using
URI(source).scheme. For a scheme-less URI like*.example.com, the URI will come back with aschemeofnil, which is fine. There is a step beforededup_source_listwhich strips known schemes likehttp/https.