Improve seekable_format/examples/parallel_compression #4382

Merged 8 commits into facebook:dev on May 8, 2025

Conversation

@vasi (Contributor) commented May 7, 2025

Change the design of parallel_compression to allow it to actually work in parallel without blowing up memory usage. Fixes #3980.

In the new approach, we don't wait for all jobs to be accepted by the queue before writing any results. Instead, as each job finishes, it adds its result to an ordered linked list, and then writes (and frees) as many results as are available. We use a mutex for synchronization, so this should no longer be racy. This approach also conveniently allows streaming input, since we no longer need to know the number of frames ahead of time.

An added test enforces that parallel_compression doesn't use excessive memory.

This doesn't change anything about the main zstd program, so we should leave #3980 open, or open a new issue about zstd CLI seekability.

A variety of other fixes were necessary to get the tests passing, or were obvious cleanups once this work was complete. Each is a separate commit.

vasi added 8 commits May 6, 2025 21:55
This passes make flags, such as `-jN` for building in parallel, to
the underlying make.

Some of these examples are intended to be parallel, so it doesn't make
sense to link them against the single-threaded libzstd.

The filenames of the mt and nomt libzstd builds are identical, so it's
still possible to link against the single-threaded one, just harder.

Previously, parallel_compression would only handle each job's results
after ALL jobs were successfully queued. This caused all src/dst
buffers to remain in memory until then!

It also polled to check whether a job completed, which is racy without
any memory barrier.

Now, we flush results as a side effect of completing a job. Completed
frames are placed in an ordered linked-list, and any eligible frames
are flushed. This may be zero or multiple frames, depending on the
order in which jobs finish.

This design also makes it simple to support streaming input, so that
is now available. Just pass `-` as the filename, and stdin/stdout will
be used for I/O.

There was no memory barrier between writing and reading `done`, which
would allow reordering to cause races. With so little data to handle
after each job completes, we might as well just join.
Use ulimit to fail the test if we use O(filesize) memory, rather than
O(threads).
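The memory-cap technique works roughly like this (a sketch of the pattern only: the `run_capped` helper and the cap value are illustrative, not the actual test's). `ulimit -v` caps the subshell's virtual address space in KiB, so a program whose memory scaled with file size would fail to allocate and exit non-zero.

```shell
# Run a command under a virtual-memory cap, in a subshell so the
# limit doesn't leak into the rest of the script.
run_capped() {
  local kib=$1; shift
  ( ulimit -v "$kib"; "$@" )
}

# A small command fits comfortably under a generous cap:
run_capped $((256 * 1024)) echo "fits under the cap"
```

In the real test, the cap is chosen so that O(threads) buffers fit but O(filesize) buffers do not.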
Building lz4 as root was causing `make clean` to fail with permission
errors.

We used to have to install lz4 from source back in Ubuntu 14.04, but
nowadays the installed lz4 is fine. Get rid of ancient helpers and
cruft!
@Cyan4973 (Contributor) left a comment
Thanks @vasi!
Looks good to me!
It's also pretty clean, very readable, style is consistent,
and the modification comes with a test run in CI.
I like it.

@Cyan4973 Cyan4973 self-assigned this May 8, 2025
@Cyan4973 Cyan4973 merged commit f9938c2 into facebook:dev May 8, 2025
101 checks passed

Successfully merging this pull request may close these issues.

seekable_format/examples/parallel_compression.c is not parallel
3 participants