flate: improve huffman flate hcode spatial locality #46007
…slices instead of a slice of structs.
Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please visit https://cla.developers.google.com/ to sign. Once you've signed (or fixed any issues), please reply here with `@googlebot I signed it!`.
@googlebot I signed it!
This PR (HEAD: 05bf781) has been imported to Gerrit for code review. Please visit https://go-review.googlesource.com/c/go/+/317789 to see it.
Message from Go Bot: Patch Set 1: Congratulations on opening your first change. Thank you for your contribution! Next steps: Most changes in the Go project go through a few rounds of revision. This can be … During May-July and Nov-Jan the Go project is in a code freeze, during which … Please don’t reply on this GitHub thread. Visit golang.org/cl/317789.
Message from Joe Tsai: Patch Set 3: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/317789.
Message from Teiva Harsanyi: Patch Set 3: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/317789.
Experiment by adding golang/go#46007 from @teivah

Before/after:
```
file                     out   level  insize      outsize     millis  mb/s
github-ranks-backup.bin  gzkp  1      1862623243  458201422   6979    254.51
github-ranks-backup.bin  gzkp  1      1862623243  458201422   7273    244.22
enwik9                   gzkp  1      1000000000  382781160   5805    164.26
enwik9                   gzkp  1      1000000000  382781160   5976    159.57
github-ranks-backup.bin  gzkp  -2     1862623243  1298789681  5592    317.65
github-ranks-backup.bin  gzkp  -2     1862623243  1298789681  5420    327.70
```
Slower for general compression, but faster for Huffman-only compression.
Improve the spatial locality of the Huffman `hcode` values and optimize accesses when iterating over `len`. For example: go/src/compress/flate/huffman_bit_writer.go, lines 211 to 213 in 4469f54.
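To make the locality concern concrete, the sketch below contrasts the two layouts this PR is about. It is not the code from `huffman_bit_writer.go` and not the PR's diff: the type names `hcodeAoS` and `hcodeSoA` and the helpers `copyLensAoS` and `copyLensSoA` are hypothetical, and the field types are an assumption.

```go
package main

import "fmt"

// hcodeAoS models the "slice of structs" layout: code and len are
// interleaved in memory. (Hypothetical name and field types.)
type hcodeAoS struct {
	code, len uint16
}

// hcodeSoA models the "struct of slices" layout: all codes are
// contiguous, and all lengths are contiguous. (Hypothetical name.)
type hcodeSoA struct {
	code []uint16
	len  []uint16
}

// copyLensAoS reads only the len field, but every element it touches
// also brings the neighbouring code field into the cache.
func copyLensAoS(dst []uint8, codes []hcodeAoS) {
	for i := range codes {
		dst[i] = uint8(codes[i].len)
	}
}

// copyLensSoA walks a dense slice of lengths, so each cache line
// fetched contains only values the loop actually uses.
func copyLensSoA(dst []uint8, codes hcodeSoA) {
	for i, l := range codes.len {
		dst[i] = uint8(l)
	}
}

func main() {
	aos := []hcodeAoS{{code: 0b10, len: 2}, {code: 0b110, len: 3}, {code: 0b111, len: 3}}
	soa := hcodeSoA{
		code: []uint16{0b10, 0b110, 0b111},
		len:  []uint16{2, 3, 3},
	}

	a := make([]uint8, len(aos))
	b := make([]uint8, len(soa.len))
	copyLensAoS(a, aos)
	copyLensSoA(b, soa)
	fmt.Println(a, b) // both layouts yield the same lengths
}
```

The trade-off is that code which needs both `code` and `len` for the same symbol now reads from two separate slices, so whether this layout wins depends on which access pattern dominates.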
I propose that, instead of `huffmanEncoder` holding a slice of `hcode`, it hold an `hcode` struct containing a slice of `code` and a slice of `len`. This should make better use of CPU cache lines: a loop that reads only `len` no longer pulls the interleaved `code` values into the cache.

Here are the results I'm getting while comparing both benchmarks locally (x86) with `benstat`: