Skip to content

do not dedupe alternate schema source expresions #478

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

keithamus
Copy link
Contributor

All PRs:

  • Has tests
  • Documentation updated N/A

Adding a new header

Generally, adding a new header is always OK.

  • Is the header supported by any user agent? If so, which? N/A
  • What does it do? N/A
  • What are the valid values for the header? N/A
  • Where does the specification live? N/A

Adding a new CSP directive

  • Is the directive supported by any user agent? If so, which? N/A
  • What does it do? N/A
  • What are the valid values for the directive? N/A

This correct a bug in the dedup_source_list function, which causes sources with different protocols to be removed, which breaks possible CSP as headers as the spec only allows for current protocol for schemaless source expressions.

Consider the following connect-src: *.example.com wss://*.example.com. This effectively allows for http and websocket connections. The current version of secure_headers will minify this to be connect-src: *.example.com which only allows for http connections, disallowing websocket connections.

The fix here is to check that the schemes match for wildcard sources, using URI(source).scheme. For a scheme-less URI like *.example.com, the URI will come back with a scheme of nil, which is fine. There is a step before dedup_source_list which strips known schemes like http/https.

Copy link
Contributor

@JackMc JackMc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey! Thanks for the contribution. I'm unable to reproduce this behaviour. I set up a standard Rails 7 app with a scaffolded resource named posts and added the following SecureHeaders initializer:

SecureHeaders::Configuration.default do |config|
  config.csp = {
    default_src: %w(* 'unsafe-inline' blob:),
    script_src: %w('self' blob: 'unsafe-inline' data:),
    connect_src: %w('self' *.example.com wss://*.example.com),
  }
end

And the following inline script in the resources #index page:

<script>
  fetch('https://example.com');
</script>

<script>
  webSocket = new WebSocket('wss://example.com');
</script>

The CSP returned from Secure Headers appears to be:

Content-Security-Policy: default-src * 'unsafe-inline' blob:; block-all-mixed-content; connect-src 'self' *.example.com wss://*.example.com; script-src 'self' blob: 'unsafe-inline' data:

And the WebSocket/fetch appear to succeed (insofar as they can on example.com).

Copying the test from spec/lib/secure_headers/headers/content_security_policy_spec.rb to master passes. Could you provide a failing test case? Happy to hop on a call and discuss if that's easier!

@JackMc
Copy link
Contributor

JackMc commented Apr 1, 2022

On Slack, Keith and I determined that the bug occurs only with a combination of a non-wildcard wss:// URL and a wildcard schemaless URL. With some test changes plus the small nitpick about the source URL pattern I think we should be good to go!

@keithamus
Copy link
Contributor Author

Thanks for discussing over Slack @JackMc! I've addressed both the badly written test and the precomputation of schemes.

@srt32
Copy link
Member

srt32 commented Jun 1, 2022

@keithamus @JackMc are we cool to merge this change?

@JackMc
Copy link
Contributor

JackMc commented Jun 2, 2022

I don't see any problem with it!

@srt32
Copy link
Member

srt32 commented Jun 2, 2022

I don't have merge access. Could you click the button?

@JackMc JackMc merged commit 0dc7fee into github:main Jun 15, 2022
lgarron added a commit that referenced this pull request Oct 24, 2022
This PR removes `dedup_source_list` and replaces it with a simple
`.uniq` call. This resolves
#491, which is only the
latest in a series of ongoing issues with source expression
deduplication.

`secure_headers` has had this feature [since
2015](32bb3f5)
that [deduplicates redundant URL source
expressions](https://github.com/github/secure_headers/blob/494b75ff927464ed8d1c43e98e41fe4d15ce2bdf/lib/secure_headers/headers/content_security_policy.rb#L157-L170).
For example, if `*.github.jpy.wang` is listed as a source expression for a
given
[directive](https://w3c.github.io/webappsec-csp/#framework-directives),
then the addition of `example.github.com` would have no effect, and so
the latter can be safely removed by `secure_headers` to save bytes.

Unfortunately, this implementation has had various bugs due to the use
of "impedance mismatched" APIs like
[`URI`](https://docs.ruby-lang.org/en/2.1.0/URI.html)[^1] and
[`File.fnmatch`](https://apidock.com/ruby/v2_5_5/File/fnmatch/class)[^2].
For example, it made incorrect assumptions about source expression
schemes, leading to the following series of events:

[^1]: Which allows wildcards in domains but not for ports, as it is not
designed to parse URL source expressions.
[^2]: Which has general glob matching that is not designed for URL
source expressions either.

- 2017-03: A [bug was reported and
confirmed](#317)
- 2022-04: The bug was finally [fixed by `@keithamus` (a Hubber) in
2022](#478) due to our use
of web sockets.
- 2022-06: This fix in turn triggered a [new
bug](#491) with source
expressions like `data:`.
- 2022-06: An external contributor [submitted a fix for the bew
bug](#490), but this still
doesn't address some of the "fast and loose" semantic issues of the
underlying implementation.
- 2022-08: `@lgarron` [drafted a new
implementation](#498) that
semantically parses and compares source expressions based on the
specification for source expressions.
- This implementation already proved to have some value in early
testing, as its stricter validation caught an issue in `github.com`'s
CSP. However, it would take additional work to make this implementation
fully aware of CSP syntax (e.g. not allowing URL source expressions in a
source directive when only special keywords are allowed, and
vice-versa), and it relies on a new regex-based implementation of source
expression parsing that may very well lead to more subtle bugs.

In effect, this is a half feature whose maintenance cost has outweighed
its functionality:

- The relevant code has suffered from continued bugs, described as
above.
- Deduplication is purely a "nice-to-have" — it is not necessary for the
security or correct functionality of `secure_headers`.
- It was [introduced by `@oreoshake` (the then-maintainer) without
explanation in
2015](32bb3f5),
never "officially" documented. We have no concrete data on whether it
has any performance impact on any real apps — for all we know, uncached
deduplication calculations might even cost more than the saved header
bytes.
- Further, in response to the first relevant bug, `@oreoshake` himself
[said](#317 (comment)):

> I've never been a fan of the deduplication based on `*` anyways. Maybe
we should just rip that out.

> Like people trying to save a few bytes can optimize elsewhere.

So this PR completely removes the functionality. If we learn of a use
case where this was very important (and the app somehow can't preprocess
the list before passing it to `secure_headers`), we can always resume
consideration of one of:

- #490
- #498
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants