You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This PR removes `dedup_source_list` and replaces it with a simple
`.uniq` call. This resolves#491, which is only the
latest in a series of ongoing issues with source expression
deduplication.
`secure_headers` has had this feature [since
2015](32bb3f5)
that [deduplicates redundant URL source
expressions](https://github.com/github/secure_headers/blob/494b75ff927464ed8d1c43e98e41fe4d15ce2bdf/lib/secure_headers/headers/content_security_policy.rb#L157-L170).
For example, if `*.github.jpy.wang` is listed as a source expression for a
given
[directive](https://w3c.github.io/webappsec-csp/#framework-directives),
then the addition of `example.github.com` would have no effect, and so
the latter can be safely removed by `secure_headers` to save bytes.
Unfortunately, this implementation has had various bugs due to the use
of "impedance mismatched" APIs like
[`URI`](https://docs.ruby-lang.org/en/2.1.0/URI.html)[^1] and
[`File.fnmatch`](https://apidock.com/ruby/v2_5_5/File/fnmatch/class)[^2].
For example, it made incorrect assumptions about source expression
schemes, leading to the following series of events:
[^1]: Which allows wildcards in domains but not for ports, as it is not
designed to parse URL source expressions.
[^2]: Which has general glob matching that is not designed for URL
source expressions either.
- 2017-03: A [bug was reported and
confirmed](#317)
- 2022-04: The bug was finally [fixed by `@keithamus` (a Hubber) in
2022](#478) due to our use
of web sockets.
- 2022-06: This fix in turn triggered a [new
bug](#491) with source
expressions like `data:`.
- 2022-06: An external contributor [submitted a fix for the bew
bug](#490), but this still
doesn't address some of the "fast and loose" semantic issues of the
underlying implementation.
- 2022-08: `@lgarron` [drafted a new
implementation](#498) that
semantically parses and compares source expressions based on the
specification for source expressions.
- This implementation already proved to have some value in early
testing, as its stricter validation caught an issue in `github.com`'s
CSP. However, it would take additional work to make this implementation
fully aware of CSP syntax (e.g. not allowing URL source expressions in a
source directive when only special keywords are allowed, and
vice-versa), and it relies on a new regex-based implementation of source
expression parsing that may very well lead to more subtle bugs.
In effect, this is a half feature whose maintenance cost has outweighed
its functionality:
- The relevant code has suffered from continued bugs, described as
above.
- Deduplication is purely a "nice-to-have" — it is not necessary for the
security or correct functionality of `secure_headers`.
- It was [introduced by `@oreoshake` (the then-maintainer) without
explanation in
2015](32bb3f5),
never "officially" documented. We have no concrete data on whether it
has any performance impact on any real apps — for all we know, uncached
deduplication calculations might even cost more than the saved header
bytes.
- Further, in response to the first relevant bug, `@oreoshake` himself
[said](#317 (comment)):
> I've never been a fan of the deduplication based on `*` anyways. Maybe
we should just rip that out.
> Like people trying to save a few bytes can optimize elsewhere.
So this PR completely removes the functionality. If we learn of a use
case where this was very important (and the app somehow can't preprocess
the list before passing it to `secure_headers`), we can always resume
consideration of one of:
- #490
- #498
0 commit comments