Skip to content

more generic and full zstd compression support #562

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Apr 4, 2025

Conversation

cyphar
Copy link
Member

@cyphar cyphar commented Apr 3, 2025

Make our compression handling more generic, and support zstd decompression.

Also add a --compress flag that lets users specify compression for new layers. The default configuration is --compress=auto which (in contrast to the previous hardcode-gzip behaviour) will try to use the same compression algorithm as in the last layer. See umoci-repack(1) for more information.

Signed-off-by: Aleksa Sarai [email protected]

@codecov-commenter
Copy link

codecov-commenter commented Apr 3, 2025

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 71.30045% with 64 lines in your changes missing coverage. Please review.

Project coverage is 72.72%. Comparing base (e9fff47) to head (a79788c).
Report is 11 commits behind head on main.

Files with missing lines Patch % Lines
oci/casext/blobcompress/zstd.go 36.11% 18 Missing and 5 partials ⚠️
oci/casext/blobcompress/gzip.go 57.57% 11 Missing and 3 partials ⚠️
oci/layer/unpack.go 53.84% 9 Missing and 3 partials ⚠️
mutate/mutate.go 83.33% 5 Missing and 2 partials ⚠️
cmd/umoci/utils_ux.go 86.20% 3 Missing and 1 partial ⚠️
oci/casext/blobcompress/algo.go 89.47% 1 Missing and 1 partial ⚠️
cmd/umoci/raw-add-layer.go 80.00% 0 Missing and 1 partial ⚠️
cmd/umoci/repack.go 85.71% 0 Missing and 1 partial ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #562      +/-   ##
==========================================
+ Coverage   72.43%   72.72%   +0.28%     
==========================================
  Files          60       65       +5     
  Lines        4985     5089     +104     
==========================================
+ Hits         3611     3701      +90     
- Misses       1000     1010      +10     
- Partials      374      378       +4     
Files with missing lines Coverage Δ
cmd/umoci/insert.go 80.00% <100.00%> (+0.56%) ⬆️
oci/casext/blobcompress/noop.go 100.00% <100.00%> (ø)
oci/casext/mediatype/compress.go 100.00% <100.00%> (ø)
pkg/iohelpers/count_reader.go 100.00% <100.00%> (ø)
cmd/umoci/raw-add-layer.go 56.60% <80.00%> (+1.26%) ⬆️
cmd/umoci/repack.go 84.67% <85.71%> (+3.05%) ⬆️
oci/casext/blobcompress/algo.go 89.47% <89.47%> (ø)
cmd/umoci/utils_ux.go 92.13% <86.20%> (-1.16%) ⬇️
mutate/mutate.go 72.68% <83.33%> (+0.36%) ⬆️
oci/layer/unpack.go 55.65% <53.84%> (-1.56%) ⬇️
... and 2 more
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@cyphar cyphar force-pushed the zstd-compression branch from 22e012a to e8664b2 Compare April 3, 2025 16:38
cyphar added 3 commits April 4, 2025 03:45
The non-distributable stuff has all been deprecated by image-spec, so
there's no need to support it yet.

Signed-off-by: Aleksa Sarai <[email protected]>
add() was only ever used by Add() and the logic was already kind of
poorly split between them, so just merge the implementations.

Signed-off-by: Aleksa Sarai <[email protected]>
@cyphar cyphar force-pushed the zstd-compression branch from e8664b2 to 1495576 Compare April 3, 2025 16:47
tych0
tych0 previously approved these changes Apr 3, 2025
Copy link
Member

@tych0 tych0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks pretty reasonable to me.

cyphar added 6 commits April 4, 2025 04:04
The previous implementation of the BytesRead() would store the value of
bytes read in the compressor implementation singleton, meaning that if
you were to compress multiple data streams you would end up with a
meaningless value from BytesRead(). Not good.

It also makes little sense to have the byte-counting logic embedded in
the compression code -- there is only one user that cares about this and
it's trivial to add a dummy io.Reader wrapper that counts the bytes
read.

Fixes: 8f65e8f ("mutate: add uncompressed blob size annotation")
Signed-off-by: Aleksa Sarai <[email protected]>
Until now we have had separate the compression and decompression logic
and algorithm definitions. This has made it easy for folks to forget to
add support for both (such as with zstd), and makes it more annoying to
try to make the feature generic.

Signed-off-by: Aleksa Sarai <[email protected]>
Rather than hard-coding a set of media-types, parse the "+foo" MIME type
suffix to figure out what compression algorithm was used for the layer.

This also enables zstd decompression support out of the box (since that
is a registered algorithm now) and also improves our error messages --
we now provide a more helpful error to explain whether the issue is that
the underlying MIME type of the layer is unsupported (think squashfs) or
if it's just an issue with the compression algorithm.

Signed-off-by: Aleksa Sarai <[email protected]>
If a user doesn't specify what compression type they would like to
(*Mutator).Add(), auto-select the compression algorithm by first trying
to use whatever the last layer's compression algorithm was (if we
support it) followed by a fallback to a default (gzip for now).

It is preferable to try to re-use the previous compression algorithm so
that users which operate on zstd-compressed images will produce
zstd-compressed layers rather than going with gzip.

For now, make this the default for the CLI as well. A future patch will
add support for a --compress=... flag that will let you configure this
behaviour and explicitly set a compression method.

Signed-off-by: Aleksa Sarai <[email protected]>
The default option is --compress=auto, which will try to match the
existing layer compression.

Signed-off-by: Aleksa Sarai <[email protected]>
It would be nice to have a test using an upstream zstd-compressed image,
rather than only using stuff we generate, but the tests we already have
for --compress=... should be more than enough to verify that we are
dealing with properly mixed compression types.

Signed-off-by: Aleksa Sarai <[email protected]>
@cyphar cyphar force-pushed the zstd-compression branch from 1495576 to a79788c Compare April 3, 2025 17:04
@cyphar cyphar merged commit 0e2e670 into opencontainers:main Apr 4, 2025
18 checks passed
@cyphar cyphar deleted the zstd-compression branch April 4, 2025 03:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants