Skip to content

layer: numerous overlayfs whiteout fixes #572

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
May 6, 2025

Conversation

cyphar
Copy link
Member

@cyphar cyphar commented Apr 25, 2025

It turns out that a lot of our overlayfs whiteout conversion code is actually quite broken, and quite a few things are not properly supported. Here are the set of issues we needed to fix:

  • GenerateLayer would produce whiteouts with the wrong pathname (it would use the full path rather than the layer path), leaking information about the host and producing incorrect images as output.
  • GenerateLayer would skip any non-whiteouts files when operating in TranslateOverlayfsWhiteout mode, silently losing data.
  • Our testing of TranslateOverlayfsWhiteout was quite crude.
  • The GenerateInsertLayer code would not use fsEval in most places, which would cause rootless mode to not work properly.
  • We need to support converting trusted.overlay.opaque=y since users might use that to specify opaque whiteouts (there is no other way to specify this when umoci generates archives).
  • Extracting overlayfs whiteouts doesn't work if there are missing parent directories or the filetype of the opaque dir is wrong. In practice this is not an issue with most images but for completeness we should handle this case properly.
  • Currently extracting regular whiteouts in an opaque whiteout directory will lead to confusing effects with overlayfs (readdir will report the existence of files that are whited out).
    • We will probably need to switch (*tarExtractor).upperPaths to be a trie that will allow us to remove already-extracted regular whiteouts from a directory that then gets tagged as an opaque whiteout, as well as skip extracting regular whiteouts from a directory that is already and opaque whiteout. None of this is necessary for regular extractions but overlayfs kind of forces us to do this.
  • 🔀 user.overlay.* support feature moved to layer: overlay userxattr support #576.
  • We need to support the trusted.overlay. escaping so that:
    • Repacking a regular non-overlayfs layer with trusted.overlay.* will keep the same xattr name.
    • Repacking an overlayfs layer with trusted.overlay.* will strip the xattrs but trusted.overlay.overlay.* will have one layer of escaping stripped when generating the tar archive.
    • Ditto when unpacking -- for non-overlayfs we extract as normal but for overlayfs we need to add escaping.

@cyphar cyphar added this to the 0.5.0 milestone Apr 25, 2025
@codecov-commenter
Copy link

codecov-commenter commented Apr 25, 2025

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 59.64523% with 182 lines in your changes missing coverage. Please review.

Project coverage is 72.62%. Comparing base (a79788c) to head (9a1cefa).
Report is 77 commits behind head on main.

Files with missing lines Patch % Lines
oci/layer/tar_extract.go 18.89% 100 Missing and 3 partials ⚠️
oci/layer/xattr.go 46.66% 31 Missing and 1 partial ⚠️
oci/layer/generate.go 71.21% 15 Missing and 4 partials ⚠️
oci/layer/utils_unix.go 64.44% 13 Missing and 3 partials ⚠️
oci/layer/tar_generate.go 68.57% 11 Missing ⚠️
pkg/system/xattr_unix.go 66.66% 1 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #572      +/-   ##
==========================================
- Coverage   72.72%   72.62%   -0.11%     
==========================================
  Files          65       67       +2     
  Lines        5089     5391     +302     
==========================================
+ Hits         3701     3915     +214     
- Misses       1010     1100      +90     
+ Partials      378      376       -2     
Files with missing lines Coverage Δ
oci/layer/utils.go 71.42% <ø> (+1.15%) ⬆️
pkg/fseval/fseval_default.go 90.38% <100.00%> (ø)
pkg/fseval/fseval_rootless.go 95.74% <100.00%> (ø)
pkg/pathtrie/trie.go 100.00% <100.00%> (ø)
pkg/unpriv/unpriv.go 69.81% <100.00%> (+2.05%) ⬆️
pkg/system/xattr_unix.go 69.69% <66.66%> (+10.60%) ⬆️
oci/layer/tar_generate.go 65.83% <68.57%> (-1.27%) ⬇️
oci/layer/utils_unix.go 64.44% <64.44%> (ø)
oci/layer/generate.go 72.54% <71.21%> (+6.78%) ⬆️
oci/layer/xattr.go 46.66% <46.66%> (ø)
... and 1 more

... and 4 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@cyphar cyphar force-pushed the overlayfs-whiteout-fixes branch from f7c61f8 to 325c907 Compare April 25, 2025 11:42
@cyphar cyphar force-pushed the overlayfs-whiteout-fixes branch 2 times, most recently from 7053dda to 396d080 Compare April 25, 2025 23:34
@cyphar cyphar force-pushed the overlayfs-whiteout-fixes branch 5 times, most recently from e5da54b to 29ce3ca Compare April 28, 2025 16:21
There is no need to open-code the clearing logic because both
Llistxattrs and Lremovexattr are called on the same path, so doing a
single Wrap call is fine.

Signed-off-by: Aleksa Sarai <[email protected]>
@cyphar cyphar force-pushed the overlayfs-whiteout-fixes branch 2 times, most recently from c190487 to ad47308 Compare April 28, 2025 17:25
cyphar added 11 commits April 30, 2025 01:57
For more complicated filtering it would be preferable to just have a
callback function to tell lclearxattrs whether a given xattr should be
skipped from clearing.

Signed-off-by: Aleksa Sarai <[email protected]>
This mirrors TarExtractor, and lets us provide information about
overlayfs whiteouts to the generator.

Note that this is a bit of a temporary fix -- ideally we would unify the
on-disk format configuration between TarExtractor and tarGenerator.
Unfortunately this is a little complicated because WhiteoutMode was
added to umoci.json back in commit 9ab5eff ("meta: include the
WhiteoutMode") so it will require some compatibility considerations to
fully implement in a later patchset.

Signed-off-by: Aleksa Sarai <[email protected]>
GenerateInsertLayer using different naming to GenerateLayer makes it
annoying when writing patches that affect both.

Signed-off-by: Aleksa Sarai <[email protected]>
Using the full path for the in-archive path results in uselss whiteout
entries that arguably also disclose some information about the host
system.

Fixes: a425f92 ("layer: add support for translating overlay whiteouts to OCI ones")
Signed-off-by: Aleksa Sarai <[email protected]>
A misplaced continue statement meant that TranslateOverlayWhiteouts
would actually end up producing a layer with only overlayfs whiteout
entries and no other files.

Fixes: a425f92 ("layer: add support for translating overlay whiteouts to OCI ones")
Signed-off-by: Aleksa Sarai <[email protected]>
We should actually explicitly verify that all of the entries in the
generated archive actually match what we want in one go with
assert.ElementsMatch. In the case of TranslateOverlayWhiteouts our tests
were very bare-bones and missed several bugs in our implementation.

Signed-off-by: Aleksa Sarai <[email protected]>
Fix up the helpers a bit, and for the opaque tests we really should've
split out the parts that need root (trusted.overlay.* xattrs) from the
other tests.

Signed-off-by: Aleksa Sarai <[email protected]>
There's no need to have two simplified tar.Header representations for
our tests, we can just have tarDentry be used for both checking the
contents of a tar archive and for helping generate it.

Signed-off-by: Aleksa Sarai <[email protected]>
If we're asked to generate a regular whiteout, adding an opaque entry as
well is a no-op and could in theory confuse some extractors.

Signed-off-by: Aleksa Sarai <[email protected]>
There's no need to modify the rootfs if we are privileged.

Signed-off-by: Aleksa Sarai <[email protected]>
@cyphar cyphar force-pushed the overlayfs-whiteout-fixes branch from ad47308 to 4610a0c Compare April 29, 2025 15:58
@cyphar cyphar marked this pull request as ready for review April 29, 2025 15:58
@cyphar cyphar force-pushed the overlayfs-whiteout-fixes branch from 4610a0c to d296fde Compare April 29, 2025 16:11
@cyphar cyphar requested a review from tych0 April 29, 2025 16:35
tych0
tych0 previously approved these changes Apr 29, 2025
// opaque whiteouts
val, err := fsEval.Lgetxattr(path, "trusted.overlay.opaque")
if err != nil {
// If we are missing privileges to read the xattr (ENODATA) or the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is ignoring ENODATA safe? If someone has marked an /etc/foo.d directory as a whiteout, lower layers might be able to inject config that higher layers tried to delete. Or am I misunderstanding?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding of the usecase is that you wouldn't give untrusted user access to the raw overlayfs-extracted lowerdir, you would give them access to a merged overlayfs right? You can't set those xattrs through a merged overlayfs AFAIK.

(Also for the moment it's trusted.overlay.opaque so an unprivileged user couldn't write to it anyway. But this is the concern I had with enabling user.overlay.* support by default without making it a config option.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding of the usecase is that you wouldn't give untrusted user access to the raw overlayfs-extracted lowerdir, you would give them access to a merged overlayfs right?

Maybe, but in stacker's case it will extract them with the unprivileged uid/gid, so it's not impossible. Though I agree, it requires a container escape, and at that point maybe it doesn't matter.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I would prefer it if we could avoid this by just checking llistxattr (like with whiteout) but sadly that's not how that xattr works...

But I also think that ENODATA here is ultimately fine because it just means that either umoci isn't privileged enough (we need userxattr support to fix this -- #576) or the rootfs has an empty xattr (which could happen for a variety of reasons, and an attacker that can do that can do plenty of other bad things imho).

cyphar added 5 commits May 1, 2025 19:22
…attrs

As these are officially support ways of creating whiteouts with
overlayfs, we should support converting them. While the
trusted.overlay.whiteout xattr is probably very rare to see in the wild,
trusted.overlay.opaque=y is probably something we are going to encounter
and so we should support it.

It's not clear what we should do about trusted.overlay.opaque=x -- the
documentation (and code) imply that it has similar behaviour to
trusted.overlay.opaque=y but in practice directories marked with that
xattr don't actuall act like an opaque directory. We'll just ignore it
for now.

Signed-off-by: Aleksa Sarai <[email protected]>
The main thing to consider here is that when extracting stuff for
overlayfs we are dealing with one layer in the layer stack, which means
that we may be creating whiteouts for paths that don't exist or have the
wrong type.

In practice most archives will contain parent directories, but it's not
a given and we should be able to handle these kinds of edge cases (like
we do in the regular extraction mode).

Signed-off-by: Aleksa Sarai <[email protected]>
As per the overlayfs documentation[1], extracting plain overlayfs
whiteouts inside an opaque whiteout directory is not recommended:

> A directory is made opaque by setting the xattr
> "trusted.overlay.opaque" to "y". Where the upper filesystem contains
> an opaque directory, any directory in the lower filesystem with the
> same name is ignored.
>
> An opaque directory should not contain any whiteouts, because they do
> not serve any purpose.

In practice, it results in unfortunate readdir behaviour where the plain
whiteouts are listed by readdir despite not being accessible. Aside from
potention information leaks (proving that lower layers contain a file
that has been masked) it also can lead to user confusion.

Unfortunately the solution for this is kind of complicated, as in order
to support the OCI requirements for opaque directories we need to keep
any existing paths we have unpacked but we need to exclude whiteout
inodes from being created inside opaque whiteout directories.

The simplest solution is to have a trie (similar to upperPaths) which
tracks the whiteouts we have extracted, and we can then deleting
existing whiteouts or avoid extracting whiteouts based on the state of
whiteouts in the upper layer. While this could've been merged with
upperPaths, upperPaths contains far more entries and is stored more
efficiently than upperWhiteouts -- for whiteout handling in overlayfs
it's more efficient for us to only walk the subset of whiteout entries
(which should be very small for most images).

[1]: https://www.kernel.org/doc/html/v6.14/filesystems/overlayfs.html#whiteouts-and-opaque-directories

Signed-off-by: Aleksa Sarai <[email protected]>
The previous logic would only prune unused branches of the trie in very
specific (and trivial) circumstances, but it is quite easy for us to
optimise the pruning by simply tracking how many children there are
all in all descendants of the trie.

This lets us quickly decide whether a branch is dead and should be
pruned.

Signed-off-by: Aleksa Sarai <[email protected]>
Since Linux 6.7[1], overlayfs supports representations of an overlayfs
lower directory in a merged overlayfs through xattr escaping. Supporting
this is preferable to silently ignoring any trusted.overlay.* xattrs we
see in an archive.

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=dad02fad84cbce30f317b69a4f2391f90045f79d

Signed-off-by: Aleksa Sarai <[email protected]>
@cyphar cyphar force-pushed the overlayfs-whiteout-fixes branch from d296fde to 9a1cefa Compare May 1, 2025 10:00
Copy link
Contributor

@hallyn hallyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(some offline in-depth discussion later...)

lgtm, thanks.

I'm hoping to test stacker against this later this week. Will open issues if I run into anything, but if there are issues i don't believe they'd be regressions, rather things this didn't fix (which the stacker fork did).

@cyphar cyphar merged commit 5563388 into opencontainers:main May 6, 2025
18 checks passed
@cyphar cyphar deleted the overlayfs-whiteout-fixes branch May 6, 2025 03:36
cyphar added a commit to cyphar/stacker that referenced this pull request May 31, 2025
This allows us to switch away from our umoci fork now that upstream
supports OverlayfsRootfs and the various features we need. The key
changes that allow us to switch away from our fork are:

 * opencontainers/umoci#572 which implemented a large number of fixes
   to overlayfs handling, such as opaque whiteouts and several features
   not implemented in our fork (xattr escaping, handling of missing
   parent directories, improved rootless support, handling of nested
   whiteouts inside an opaque whiteout).

 * opencontainers/umoci#581 which switched to a Docker-friendly gzip
   block size by default, removing the need to configure it (as
   suggested in opencontainers/umoci#509).

 * opencontainers/umoci#587 which implemented full configurable
   userxattr (user.overlay.*) support.

Signed-off-by: Aleksa Sarai <[email protected]>
rchincha pushed a commit to project-stacker/stacker that referenced this pull request May 31, 2025
* feat: update to skopeo v1.13.0

We need to update skopeo to match the pgzip version between skopeo and
umoci.

Signed-off-by: Aleksa Sarai <[email protected]>

* feat: update to github.com/opencontainers/[email protected]

This allows us to switch away from our umoci fork now that upstream
supports OverlayfsRootfs and the various features we need. The key
changes that allow us to switch away from our fork are:

 * opencontainers/umoci#572 which implemented a large number of fixes
   to overlayfs handling, such as opaque whiteouts and several features
   not implemented in our fork (xattr escaping, handling of missing
   parent directories, improved rootless support, handling of nested
   whiteouts inside an opaque whiteout).

 * opencontainers/umoci#581 which switched to a Docker-friendly gzip
   block size by default, removing the need to configure it (as
   suggested in opencontainers/umoci#509).

 * opencontainers/umoci#587 which implemented full configurable
   userxattr (user.overlay.*) support.

Signed-off-by: Aleksa Sarai <[email protected]>

---------

Signed-off-by: Aleksa Sarai <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants