-
Notifications
You must be signed in to change notification settings - Fork 101
layer: numerous overlayfs whiteout fixes #572
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov ReportAttention: Patch coverage is
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## main #572 +/- ##
==========================================
- Coverage 72.72% 72.62% -0.11%
==========================================
Files 65 67 +2
Lines 5089 5391 +302
==========================================
+ Hits 3701 3915 +214
- Misses 1010 1100 +90
+ Partials 378 376 -2
🚀 New features to boost your workflow:
|
f7c61f8
to
325c907
Compare
7053dda
to
396d080
Compare
e5da54b
to
29ce3ca
Compare
There is no need to open-code the clearing logic because both Llistxattrs and Lremovexattr are called on the same path, so doing a single Wrap call is fine. Signed-off-by: Aleksa Sarai <[email protected]>
c190487
to
ad47308
Compare
For more complicated filtering it would be preferable to just have a callback function to tell lclearxattrs whether a given xattr should be skipped from clearing. Signed-off-by: Aleksa Sarai <[email protected]>
This mirrors TarExtractor, and lets us provide information about overlayfs whiteouts to the generator. Note that this is a bit of a temporary fix -- ideally we would unify the on-disk format configuration between TarExtractor and tarGenerator. Unfortunately this is a little complicated because WhiteoutMode was added to umoci.json back in commit 9ab5eff ("meta: include the WhiteoutMode") so it will require some compatibility considerations to fully implement in a later patchset. Signed-off-by: Aleksa Sarai <[email protected]>
GenerateInsertLayer using different naming to GenerateLayer makes it annoying when writing patches that affect both. Signed-off-by: Aleksa Sarai <[email protected]>
Signed-off-by: Aleksa Sarai <[email protected]>
Using the full path for the in-archive path results in uselss whiteout entries that arguably also disclose some information about the host system. Fixes: a425f92 ("layer: add support for translating overlay whiteouts to OCI ones") Signed-off-by: Aleksa Sarai <[email protected]>
A misplaced continue statement meant that TranslateOverlayWhiteouts would actually end up producing a layer with only overlayfs whiteout entries and no other files. Fixes: a425f92 ("layer: add support for translating overlay whiteouts to OCI ones") Signed-off-by: Aleksa Sarai <[email protected]>
We should actually explicitly verify that all of the entries in the generated archive actually match what we want in one go with assert.ElementsMatch. In the case of TranslateOverlayWhiteouts our tests were very bare-bones and missed several bugs in our implementation. Signed-off-by: Aleksa Sarai <[email protected]>
Fix up the helpers a bit, and for the opaque tests we really should've split out the parts that need root (trusted.overlay.* xattrs) from the other tests. Signed-off-by: Aleksa Sarai <[email protected]>
There's no need to have two simplified tar.Header representations for our tests, we can just have tarDentry be used for both checking the contents of a tar archive and for helping generate it. Signed-off-by: Aleksa Sarai <[email protected]>
If we're asked to generate a regular whiteout, adding an opaque entry as well is a no-op and could in theory confuse some extractors. Signed-off-by: Aleksa Sarai <[email protected]>
There's no need to modify the rootfs if we are privileged. Signed-off-by: Aleksa Sarai <[email protected]>
ad47308
to
4610a0c
Compare
4610a0c
to
d296fde
Compare
// opaque whiteouts | ||
val, err := fsEval.Lgetxattr(path, "trusted.overlay.opaque") | ||
if err != nil { | ||
// If we are missing privileges to read the xattr (ENODATA) or the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is ignoring ENODATA safe? If someone has marked an /etc/foo.d
directory as a whiteout, lower layers might be able to inject config that higher layers tried to delete. Or am I misunderstanding?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding of the usecase is that you wouldn't give untrusted user access to the raw overlayfs-extracted lowerdir, you would give them access to a merged overlayfs right? You can't set those xattrs through a merged overlayfs AFAIK.
(Also for the moment it's trusted.overlay.opaque
so an unprivileged user couldn't write to it anyway. But this is the concern I had with enabling user.overlay.*
support by default without making it a config option.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding of the usecase is that you wouldn't give untrusted user access to the raw overlayfs-extracted lowerdir, you would give them access to a merged overlayfs right?
Maybe, but in stacker's case it will extract them with the unprivileged uid/gid, so it's not impossible. Though I agree, it requires a container escape, and at that point maybe it doesn't matter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I would prefer it if we could avoid this by just checking llistxattr
(like with whiteout
) but sadly that's not how that xattr works...
But I also think that ENODATA
here is ultimately fine because it just means that either umoci isn't privileged enough (we need userxattr
support to fix this -- #576) or the rootfs has an empty xattr (which could happen for a variety of reasons, and an attacker that can do that can do plenty of other bad things imho).
…attrs As these are officially support ways of creating whiteouts with overlayfs, we should support converting them. While the trusted.overlay.whiteout xattr is probably very rare to see in the wild, trusted.overlay.opaque=y is probably something we are going to encounter and so we should support it. It's not clear what we should do about trusted.overlay.opaque=x -- the documentation (and code) imply that it has similar behaviour to trusted.overlay.opaque=y but in practice directories marked with that xattr don't actuall act like an opaque directory. We'll just ignore it for now. Signed-off-by: Aleksa Sarai <[email protected]>
The main thing to consider here is that when extracting stuff for overlayfs we are dealing with one layer in the layer stack, which means that we may be creating whiteouts for paths that don't exist or have the wrong type. In practice most archives will contain parent directories, but it's not a given and we should be able to handle these kinds of edge cases (like we do in the regular extraction mode). Signed-off-by: Aleksa Sarai <[email protected]>
As per the overlayfs documentation[1], extracting plain overlayfs whiteouts inside an opaque whiteout directory is not recommended: > A directory is made opaque by setting the xattr > "trusted.overlay.opaque" to "y". Where the upper filesystem contains > an opaque directory, any directory in the lower filesystem with the > same name is ignored. > > An opaque directory should not contain any whiteouts, because they do > not serve any purpose. In practice, it results in unfortunate readdir behaviour where the plain whiteouts are listed by readdir despite not being accessible. Aside from potention information leaks (proving that lower layers contain a file that has been masked) it also can lead to user confusion. Unfortunately the solution for this is kind of complicated, as in order to support the OCI requirements for opaque directories we need to keep any existing paths we have unpacked but we need to exclude whiteout inodes from being created inside opaque whiteout directories. The simplest solution is to have a trie (similar to upperPaths) which tracks the whiteouts we have extracted, and we can then deleting existing whiteouts or avoid extracting whiteouts based on the state of whiteouts in the upper layer. While this could've been merged with upperPaths, upperPaths contains far more entries and is stored more efficiently than upperWhiteouts -- for whiteout handling in overlayfs it's more efficient for us to only walk the subset of whiteout entries (which should be very small for most images). [1]: https://www.kernel.org/doc/html/v6.14/filesystems/overlayfs.html#whiteouts-and-opaque-directories Signed-off-by: Aleksa Sarai <[email protected]>
The previous logic would only prune unused branches of the trie in very specific (and trivial) circumstances, but it is quite easy for us to optimise the pruning by simply tracking how many children there are all in all descendants of the trie. This lets us quickly decide whether a branch is dead and should be pruned. Signed-off-by: Aleksa Sarai <[email protected]>
Since Linux 6.7[1], overlayfs supports representations of an overlayfs lower directory in a merged overlayfs through xattr escaping. Supporting this is preferable to silently ignoring any trusted.overlay.* xattrs we see in an archive. [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=dad02fad84cbce30f317b69a4f2391f90045f79d Signed-off-by: Aleksa Sarai <[email protected]>
d296fde
to
9a1cefa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(some offline in-depth discussion later...)
lgtm, thanks.
I'm hoping to test stacker against this later this week. Will open issues if I run into anything, but if there are issues i don't believe they'd be regressions, rather things this didn't fix (which the stacker fork did).
This allows us to switch away from our umoci fork now that upstream supports OverlayfsRootfs and the various features we need. The key changes that allow us to switch away from our fork are: * opencontainers/umoci#572 which implemented a large number of fixes to overlayfs handling, such as opaque whiteouts and several features not implemented in our fork (xattr escaping, handling of missing parent directories, improved rootless support, handling of nested whiteouts inside an opaque whiteout). * opencontainers/umoci#581 which switched to a Docker-friendly gzip block size by default, removing the need to configure it (as suggested in opencontainers/umoci#509). * opencontainers/umoci#587 which implemented full configurable userxattr (user.overlay.*) support. Signed-off-by: Aleksa Sarai <[email protected]>
* feat: update to skopeo v1.13.0 We need to update skopeo to match the pgzip version between skopeo and umoci. Signed-off-by: Aleksa Sarai <[email protected]> * feat: update to github.com/opencontainers/[email protected] This allows us to switch away from our umoci fork now that upstream supports OverlayfsRootfs and the various features we need. The key changes that allow us to switch away from our fork are: * opencontainers/umoci#572 which implemented a large number of fixes to overlayfs handling, such as opaque whiteouts and several features not implemented in our fork (xattr escaping, handling of missing parent directories, improved rootless support, handling of nested whiteouts inside an opaque whiteout). * opencontainers/umoci#581 which switched to a Docker-friendly gzip block size by default, removing the need to configure it (as suggested in opencontainers/umoci#509). * opencontainers/umoci#587 which implemented full configurable userxattr (user.overlay.*) support. Signed-off-by: Aleksa Sarai <[email protected]> --------- Signed-off-by: Aleksa Sarai <[email protected]>
It turns out that a lot of our overlayfs whiteout conversion code is actually quite broken, and quite a few things are not properly supported. Here are the set of issues we needed to fix:
GenerateLayer
would produce whiteouts with the wrong pathname (it would use the full path rather than the layer path), leaking information about the host and producing incorrect images as output.GenerateLayer
would skip any non-whiteouts files when operating inTranslateOverlayfsWhiteout
mode, silently losing data.TranslateOverlayfsWhiteout
was quite crude.GenerateInsertLayer
code would not usefsEval
in most places, which would cause rootless mode to not work properly.trusted.overlay.opaque=y
since users might use that to specify opaque whiteouts (there is no other way to specify this when umoci generates archives).overlayfs
(readdir
will report the existence of files that are whited out).(*tarExtractor).upperPaths
to be a trie that will allow us to remove already-extracted regular whiteouts from a directory that then gets tagged as an opaque whiteout, as well as skip extracting regular whiteouts from a directory that is already and opaque whiteout. None of this is necessary for regular extractions but overlayfs kind of forces us to do this.user.overlay.*
support feature moved to layer: overlay userxattr support #576.trusted.overlay.
escaping so that:trusted.overlay.*
will keep the same xattr name.trusted.overlay.*
will strip the xattrs buttrusted.overlay.overlay.*
will have one layer of escaping stripped when generating the tar archive.