Commit 9be1f2f

Use LZ4 compression
It mostly helps a little on compression ratio (0.2813 / 0.2655 ≈ 1.06
overall), but it sometimes helps a lot. For example, on
https://qoiformat.org/benchmark/images/icon_512/apps-preferences-desktop-locale.png

The original PNG file is 26182 bytes.
In QOI form, 191293 bytes.
In QOIR form, before this commit, 184359 bytes.
In QOIR form, after this commit, 41620 bytes.

After
0.2655 CmpRatio    163.19 EncMPixels/s    219.50 DecMPixels/s
Before
0.2813 CmpRatio    196.59 EncMPixels/s    225.13 DecMPixels/s
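The arithmetic behind "helps a little" versus "helps a lot" can be redone with a short C sketch. The helper name `size_ratio` is hypothetical and not part of QOIR; the inputs are the figures quoted in the commit message.

```c
#include <stdio.h>

// Hypothetical helper: how many times larger `before` is than `after`.
double size_ratio(double before, double after) {
  return before / after;
}

// Prints the improvement factors implied by the commit message's numbers.
void print_improvements(void) {
  // Overall CmpRatio across the benchmark suite: 0.2813 -> 0.2655.
  printf("overall: %.2fx\n", size_ratio(0.2813, 0.2655));
  // The icon_512 example file, in QOIR form: 184359 -> 41620 bytes.
  printf("example: %.2fx\n", size_ratio(184359.0, 41620.0));
}
```

The example file shrinks by a factor of roughly 4.4 in QOIR form, although at 41620 bytes it is still about 1.6 times the size of the 26182 byte PNG.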
1 parent cc015fc commit 9be1f2f

5 files changed, +123 -53 lines changed

doc/benchmarks.txt

Lines changed: 17 additions & 17 deletions
@@ -18,23 +18,23 @@ My desktop machine:
 
 ---
 
-CmpRatio     = Compression Ratio
-EncMPixels/s = Encode MegaPixels per second
-DecMPixels/s = Decode MegaPixels per second
+CmpRatio     = CompressedSize / DecompressedSize. Lower is better.
+EncMPixels/s = Encode MegaPixels per second. Higher is better.
+DecMPixels/s = Decode MegaPixels per second. Higher is better.
 
 ---
 
-QOIR    0.2813 CmpRatio    196.59 EncMPixels/s    225.13 DecMPixels/s    images/
-QOIR    0.0810 CmpRatio    479.80 EncMPixels/s    484.53 DecMPixels/s    images/icon_512/
-QOIR    0.2886 CmpRatio    307.94 EncMPixels/s    314.12 DecMPixels/s    images/icon_64/
-QOIR    0.5736 CmpRatio    122.88 EncMPixels/s    143.63 DecMPixels/s    images/photo_kodak/
-QOIR    0.5990 CmpRatio    116.92 EncMPixels/s    144.26 DecMPixels/s    images/photo_tecnick/
-QOIR    0.6615 CmpRatio    117.69 EncMPixels/s    143.00 DecMPixels/s    images/photo_wikipedia/
-QOIR    0.2024 CmpRatio    252.76 EncMPixels/s    282.26 DecMPixels/s    images/pngimg/
-QOIR    0.2435 CmpRatio    202.50 EncMPixels/s    228.36 DecMPixels/s    images/screenshot_game/
-QOIR    0.0824 CmpRatio    410.65 EncMPixels/s    446.59 DecMPixels/s    images/screenshot_web/
-QOIR    0.6424 CmpRatio    139.34 EncMPixels/s    149.47 DecMPixels/s    images/textures_photo/
-QOIR    0.5675 CmpRatio    117.88 EncMPixels/s    150.63 DecMPixels/s    images/textures_pk/
-QOIR    0.3540 CmpRatio    177.40 EncMPixels/s    197.71 DecMPixels/s    images/textures_pk01/
-QOIR    0.4038 CmpRatio    159.58 EncMPixels/s    166.52 DecMPixels/s    images/textures_pk02/
-QOIR    0.2224 CmpRatio    266.63 EncMPixels/s    299.83 DecMPixels/s    images/textures_plants/
+QOIR    0.2655 CmpRatio    163.19 EncMPixels/s    219.50 DecMPixels/s    images/
+QOIR    0.0563 CmpRatio    354.91 EncMPixels/s    449.48 DecMPixels/s    images/icon_512/
+QOIR    0.2568 CmpRatio    180.26 EncMPixels/s    282.80 DecMPixels/s    images/icon_64/
+QOIR    0.5676 CmpRatio    102.31 EncMPixels/s    141.52 DecMPixels/s    images/photo_kodak/
+QOIR    0.5974 CmpRatio    107.93 EncMPixels/s    146.89 DecMPixels/s    images/photo_tecnick/
+QOIR    0.6597 CmpRatio    106.10 EncMPixels/s    145.07 DecMPixels/s    images/photo_wikipedia/
+QOIR    0.1883 CmpRatio    209.27 EncMPixels/s    273.31 DecMPixels/s    images/pngimg/
+QOIR    0.2199 CmpRatio    165.27 EncMPixels/s    221.52 DecMPixels/s    images/screenshot_game/
+QOIR    0.0701 CmpRatio    336.18 EncMPixels/s    423.63 DecMPixels/s    images/screenshot_web/
+QOIR    0.6297 CmpRatio    103.66 EncMPixels/s    142.62 DecMPixels/s    images/textures_photo/
+QOIR    0.5243 CmpRatio     85.75 EncMPixels/s    140.95 DecMPixels/s    images/textures_pk/
+QOIR    0.3367 CmpRatio    144.59 EncMPixels/s    190.61 DecMPixels/s    images/textures_pk01/
+QOIR    0.3909 CmpRatio    124.89 EncMPixels/s    158.69 DecMPixels/s    images/textures_pk02/
+QOIR    0.2183 CmpRatio    217.01 EncMPixels/s    290.77 DecMPixels/s    images/textures_plants/
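The three metrics defined in doc/benchmarks.txt can be written down as a minimal C sketch. The function names are hypothetical (they are not taken from test/benchmarks.c), and how the benchmark program actually measures elapsed time is not shown in this commit.

```c
#include <stdint.h>

// CmpRatio = CompressedSize / DecompressedSize. Lower is better.
double cmp_ratio(uint64_t compressed_size, uint64_t decompressed_size) {
  return (double)compressed_size / (double)decompressed_size;
}

// MegaPixels per second, for either encoding or decoding. Higher is better.
double mpixels_per_second(uint64_t num_pixels, double elapsed_seconds) {
  return ((double)num_pixels / 1e6) / elapsed_seconds;
}
```

Read against the images/ totals above, the better compression ratio is paid for mainly at encode time: encode throughput falls to about 83% of its previous value (163.19 / 196.59), while decode throughput stays at about 97.5% (219.50 / 225.13).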

doc/qoir_file_format.md

Lines changed: 19 additions & 5 deletions
@@ -41,17 +41,31 @@ in the natural order (the same as pixels: left-to-right and top-to-bottom).
 
 A tile's encoding consists of a 4 byte prefix:
 
-- 3 byte EncodedTileLength.
+- 3 byte EncodedTileLength. Values above 0x10000 = 65536 are invalid unless the
+  high bit of the EncodedTileFormat is set. None of the currently supported
+  EncodedTileFormat values have that high bit set, but future versions might
+  use this.
 - 1 byte EncodedTileFormat
 
 After the prefix are EncodedTileLength bytes whose interpretation depends on
 the EncodedTileFormat:
 
 - 0x00 "Literals tile format" means that the encoded tile bytes are literally
-  uncompressed RGBA values.
+  RGBA values (with no compression).
 - 0x01 "Opcodes tile format" means that the encoded tile bytes are pixel
-  opcodes (similar to QOI opcodes). This is the 'meat' of the format.
-- Other values are valid (for forward compatibility) but the decoder should
-  reject them as unsupported.
+  opcodes (see below).
+- 0x02 "LZ4-Literals tile format" means that the encoded tile bytes are LZ4
+  compressed. The decompressed bytes are like the "Literals tile format".
+- 0x03 "LZ4-Opcodes tile format" means that the encoded tile bytes are LZ4
+  compressed. The decompressed bytes are like the "Opcodes tile format".
+- Other values are valid (for forward compatibility) but decoders should reject
+  them as unsupported.
+
+LZ4 specifically means [LZ4 block
+compression](https://github.com/lz4/lz4/blob/dev/doc/lz4_Block_format.md). When
+used in QOIR encoded tiles, a decompressed size above 65536 is invalid.
+
+
+## Pixel Opcodes
 
 TODO: add more details.
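As an illustration of the tile prefix described in doc/qoir_file_format.md, here is a minimal C sketch of parsing and validating it. The struct and function names are hypothetical. The little-endian layout (length in the low 24 bits, format in the high 8 bits) is an assumption inferred from the `prefix & 0xFFFFFF`, `prefix >> 24` and `qoir_private_poke_u32le` uses in src/qoir.h rather than from the prose. The length check follows the documented rule; the corresponding check in the src/qoir.h hunk of this commit is written as `((prefix >> 31) != 0)`, which appears to test the opposite sense of the high bit.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical struct for this sketch; it is not part of qoir.h.
typedef struct {
  uint32_t length;  // EncodedTileLength (low 24 bits of the prefix).
  uint8_t format;   // EncodedTileFormat (high 8 bits of the prefix).
} tile_prefix;

// Reads the 4 byte tile prefix, assuming little-endian byte order.
tile_prefix parse_tile_prefix(const uint8_t* p) {
  uint32_t u = ((uint32_t)p[0]) | ((uint32_t)p[1] << 8) |
               ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
  tile_prefix ret;
  ret.length = u & 0xFFFFFF;
  ret.format = (uint8_t)(u >> 24);
  return ret;
}

// Applies the documented length rule: lengths above 0x10000 = 65536 are
// invalid unless the high bit of the format byte is set.
bool tile_prefix_length_is_valid(tile_prefix t) {
  return (t.length <= 0x10000) || ((t.format & 0x80) != 0);
}

// Formats 0x00 to 0x03 are the ones listed as supported after this commit.
// Other values are valid but should be rejected as unsupported.
bool tile_prefix_format_is_supported(tile_prefix t) {
  return t.format <= 0x03;
}
```

Keeping "valid" and "supported" as separate predicates mirrors the wording of the format document, which allows unknown format values for forward compatibility while asking decoders to reject them.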

run_benchmarks.sh

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ CFLAGS=${CFLAGS:--Wall -O3}
 
 mkdir -p gen
 echo Compiling...
-$CC $CFLAGS test/benchmarks.c -o gen/benchmarks
+$CC $CFLAGS test/benchmarks.c -llz4 -o gen/benchmarks
 echo Running...
 # The "| awk etc" sorts by the final column.
 gen/benchmarks ${@:-test/data} | awk '{print $NF,$0}' | sort | cut -f2- -d' '

run_unit_tests.sh

Lines changed: 1 addition & 1 deletion
@@ -5,6 +5,6 @@ CFLAGS=${CFLAGS:--Wall}
 
 mkdir -p gen
 echo Compiling...
-$CC $CFLAGS test/unit_tests.c -o gen/unit_tests
+$CC $CFLAGS test/unit_tests.c -llz4 -o gen/unit_tests
 echo Running...
 gen/unit_tests

src/qoir.h

Lines changed: 85 additions & 29 deletions
@@ -26,6 +26,9 @@
 #include <stdlib.h>
 #include <string.h>
 
+// TODO: remove the dependency.
+#include <lz4.h>
+
 #ifdef __cplusplus
 extern "C" {
 #endif
@@ -127,6 +130,9 @@ qoir_pixel_format__bytes_per_pixel(qoir_pixel_format pixfmt) {
 #define QOIR_TILE_SIZE 0x80
 #define QOIR_TILE_SHIFT 7
 
+// QOIR_TS2 is the maximum (inclusive) number of pixels in a tile.
+#define QOIR_TS2 (QOIR_TILE_SIZE * QOIR_TILE_SIZE)
+
 // -------- QOIR Decode
 
 typedef struct qoir_decode_pixel_configuration_result_struct {
@@ -141,7 +147,10 @@ qoir_decode_pixel_configuration( //
 
 typedef struct qoir_decode_buffer_struct {
   struct {
-    uint8_t rgba[4 * QOIR_TILE_SIZE * QOIR_TILE_SIZE];
+    // opcodes has to be before literals, so that (worst case) we can read (and
+    // ignore) 8 bytes past the end of the opcodes array. See §
+    uint8_t opcodes[4 * QOIR_TS2];
+    uint8_t literals[4 * QOIR_TS2];
   } private_impl;
 } qoir_decode_buffer;
 
@@ -173,7 +182,11 @@ qoir_decode( //
 
 typedef struct qoir_encode_buffer_struct {
   struct {
-    uint8_t rgba[4 * QOIR_TILE_SIZE * QOIR_TILE_SIZE];
+    // opcodes' size is (5 * QOIR_TS2), not (4 * QOIR_TS2), because in the
+    // worst case (during encoding, before discarding the too-long opcodes in
+    // favor of literals), each pixel uses QOI_OP_RGBA, 5 bytes each.
+    uint8_t opcodes[5 * QOIR_TS2];
+    uint8_t literals[4 * QOIR_TS2];
   } private_impl;
 } qoir_encode_buffer;
 
@@ -558,27 +571,57 @@ qoir_private_decode_qpix_payload( //
       src_ptr += 4;
       src_len -= 4;
       size_t tile_len = prefix & 0xFFFFFF;
-      if (src_len < (tile_len + 8)) {
+      if ((src_len < (tile_len + 8)) ||  //
+          (((4 * QOIR_TS2) < tile_len) && ((prefix >> 31) != 0))) {
         return qoir_status_message__error_invalid_data;
       }
 
-      const uint8_t* rgba = NULL;
+      const uint8_t* literals = NULL;
       switch (prefix >> 24) {
         case 0: {  // Literals tile format.
           if (tile_len != (4 * tw * th)) {
             return qoir_status_message__error_invalid_data;
           }
-          rgba = src_ptr;
+          literals = src_ptr;
           break;
         }
         case 1: {  // Opcodes tile format.
           const char* status_message = qoir_private_decode_tile_opcodes(
-              decbuf->private_impl.rgba, (uint32_t)tw, (uint32_t)th,  //
+              decbuf->private_impl.literals, (uint32_t)tw, (uint32_t)th,  //
               src_ptr, tile_len + 8);  // See § for +8.
           if (status_message) {
             return status_message;
           }
-          rgba = decbuf->private_impl.rgba;
+          literals = decbuf->private_impl.literals;
+          break;
+        }
+        case 2: {  // LZ4-Literals tile format.
+          int n = LZ4_decompress_safe((const char*)src_ptr,  //
+                                      (char*)decbuf->private_impl.literals,  //
+                                      tile_len,  //
+                                      sizeof(decbuf->private_impl.literals));
+          if (n < 0) {
+            return qoir_status_message__error_invalid_data;
+          }
+          literals = decbuf->private_impl.literals;
+          break;
+        }
+        case 3: {  // LZ4-Opcodes tile format.
+          int n = LZ4_decompress_safe((const char*)src_ptr,  //
+                                      (char*)decbuf->private_impl.opcodes,  //
+                                      tile_len,  //
+                                      sizeof(decbuf->private_impl.opcodes));
+          if (n < 0) {
+            return qoir_status_message__error_invalid_data;
+          }
+          const char* status_message = qoir_private_decode_tile_opcodes(
+              decbuf->private_impl.literals,  //
+              (uint32_t)tw, (uint32_t)th,  //
+              decbuf->private_impl.opcodes, n + 8);  // See § for +8.
+          if (status_message) {
+            return status_message;
+          }
+          literals = decbuf->private_impl.literals;
           break;
         }
         default:
@@ -590,7 +633,7 @@ qoir_private_decode_qpix_payload( //
 
       uint8_t* dp =
           dst_data + (dst_stride_in_bytes * ty) + (num_dst_channels * tx);
-      (*swizzle)(dp, dst_stride_in_bytes, rgba, 4 * tw, tw, th);
+      (*swizzle)(dp, dst_stride_in_bytes, literals, 4 * tw, tw, th);
     }
   }
 
@@ -879,30 +922,44 @@ qoir_private_encode_qpix_payload( //
       const uint8_t* sp = src_pixbuf->data +
                           (src_pixbuf->stride_in_bytes * ty) +
                           (num_src_channels * tx);
-      (*swizzle)(encbuf->private_impl.rgba, 4 * tw,  //
-                 sp, src_pixbuf->stride_in_bytes,  //
+      (*swizzle)(encbuf->private_impl.literals, 4 * tw,  //
+                 sp, src_pixbuf->stride_in_bytes,  //
                  tw, th);
 
-      // qoir_private_encode_tile_opcodes can (temporarily) write up to (5 *
-      // QOIR_TILE_SIZE * QOIR_TILE_SIZE) bytes, since QOI_OP_RGBA is 5 bytes,
-      // but we use the Literals tile format if its shorter, worst case is (4 *
-      // QTS * QTS). The difference, ((5 - 4) * QTS * QTS), is pre-allocated by
-      // the caller as extra 'scratch space'. Reference: †
       qoir_private_size_t_result r = qoir_private_encode_tile_opcodes(
-          dp + 4, encbuf->private_impl.rgba, tw, th);
+          encbuf->private_impl.opcodes, encbuf->private_impl.literals, tw, th);
+      size_t literals_len = 4 * tw * th;
       if (r.status_message) {
         result.status_message = r.status_message;
         return r;
-      } else if (r.value >= (4 * tw * th)) {
-        // Use the Literals tile format.
-        size_t n = 4 * tw * th;
-        memcpy(dp + 4, encbuf->private_impl.rgba, n);
-        qoir_private_poke_u32le(dp, (uint32_t)n);
-        dp += 4 + n;
+
+      } else if (r.value >= literals_len) {
+        // Use the Literals or LZ4-Literals tile format.
+        int n = LZ4_compress_default(
+            ((const char*)(encbuf->private_impl.literals)), ((char*)(dp + 4)),
+            (int)literals_len, 4 * QOIR_TS2);
+        if ((0 < n) && (n < literals_len)) {
+          qoir_private_poke_u32le(dp, 0x02000000 | (uint32_t)n);
+          dp += 4 + n;
+        } else {
+          memcpy(dp + 4, encbuf->private_impl.literals, literals_len);
+          qoir_private_poke_u32le(dp, 0x00000000 | (uint32_t)literals_len);
+          dp += 4 + literals_len;
+        }
+
       } else {
-        // Use the Opcodes tile format.
-        qoir_private_poke_u32le(dp, ((uint32_t)(r.value | 0x01000000)));
-        dp += 4 + r.value;
+        // Use the Opcodes or LZ4-Opcodes tile format.
+        int n =
+            LZ4_compress_default(((const char*)(encbuf->private_impl.opcodes)),
+                                 ((char*)(dp + 4)), (int)r.value, 4 * QOIR_TS2);
+        if ((0 < n) && (n < r.value)) {
+          qoir_private_poke_u32le(dp, 0x03000000 | (uint32_t)n);
+          dp += 4 + n;
+        } else {
+          memcpy(dp + 4, encbuf->private_impl.opcodes, r.value);
+          qoir_private_poke_u32le(dp, 0x01000000 | (uint32_t)r.value);
+          dp += 4 + r.value;
+        }
       }
     }
   }
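The encoder's choice among the four tile formats in the hunk above can be isolated as a pure function. This is only a sketch: `choose_tile_prefix` is a hypothetical name, and the two LZ4 lengths are passed in as plain integers (standing in for the return value of `LZ4_compress_default`, where a value of 0 or less means compression failed) so that the example does not need to link against liblz4.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper mirroring the encoder's decision. It returns the 4 byte
// tile prefix: the EncodedTileFormat in the high 8 bits and the
// EncodedTileLength in the low 24 bits.
//
// Only one of the two lz4 arguments is consulted, depending on which branch
// is taken, just as the encoder only runs LZ4 on one candidate per tile.
uint32_t choose_tile_prefix(size_t literals_len, size_t opcodes_len,
                            int lz4_literals_len, int lz4_opcodes_len) {
  if (opcodes_len >= literals_len) {
    // The opcodes did not beat the raw pixels: Literals or LZ4-Literals.
    if ((0 < lz4_literals_len) && ((size_t)lz4_literals_len < literals_len)) {
      return 0x02000000u | (uint32_t)lz4_literals_len;
    }
    return 0x00000000u | (uint32_t)literals_len;
  }
  // The opcodes are shorter: Opcodes or LZ4-Opcodes.
  if ((0 < lz4_opcodes_len) && ((size_t)lz4_opcodes_len < opcodes_len)) {
    return 0x03000000u | (uint32_t)lz4_opcodes_len;
  }
  return 0x01000000u | (uint32_t)opcodes_len;
}
```

One consequence of this structure is that the LZ4-Literals and LZ4-Opcodes candidates are never compared against each other: LZ4-Literals is only considered when the opcodes are at least as long as the literals.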
@@ -950,12 +1007,11 @@ qoir_encode( //
   uint64_t height_in_tiles =
       (src_pixbuf->pixcfg.height_in_pixels + QOIR_TILE_MASK) >> QOIR_TILE_SHIFT;
   uint64_t tile_len_worst_case =
-      4 + (4 * QOIR_TILE_SIZE * QOIR_TILE_SIZE);  // Prefix + literal format.
+      4 + (4 * QOIR_TS2);  // Prefix + literal format.
   uint64_t dst_len_worst_case =
       (width_in_tiles * height_in_tiles * tile_len_worst_case) +
-      44 +  // QOIR, QPIX and QEND chunk headers are 12 bytes each.
-            // QOIR also has an 8 byte payload.
-      (QOIR_TILE_SIZE * QOIR_TILE_SIZE);  // See †.
+      44;  // QOIR, QPIX and QEND chunk headers are 12 bytes each.
+           // QOIR also has an 8 byte payload.
   if (dst_len_worst_case > SIZE_MAX) {
     result.status_message =
         qoir_status_message__error_unsupported_pixbuf_dimensions;
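The worst-case output size computed in the hunk above can be reproduced on its own. In this sketch the function name and the un-prefixed macro names are hypothetical, `TILE_MASK` is assumed to be `TILE_SIZE - 1` (the value of `QOIR_TILE_MASK` is not shown in this diff), and each dimension is assumed to fit in 24 bits so that the 64-bit product cannot overflow.

```c
#include <stdint.h>

#define TILE_SIZE 0x80
#define TILE_SHIFT 7
#define TILE_MASK 0x7F  // Assumption: QOIR_TILE_MASK is (QOIR_TILE_SIZE - 1).
#define TS2 (TILE_SIZE * TILE_SIZE)

// Worst-case encoded size in bytes, after this commit. Assumes that both
// dimensions are at most 0xFFFFFF, so the multiplication fits in 64 bits.
uint64_t dst_len_worst_case(uint32_t width_in_pixels,
                            uint32_t height_in_pixels) {
  uint64_t width_in_tiles =
      ((uint64_t)width_in_pixels + TILE_MASK) >> TILE_SHIFT;
  uint64_t height_in_tiles =
      ((uint64_t)height_in_pixels + TILE_MASK) >> TILE_SHIFT;
  // Each tile is at most a 4 byte prefix plus 4 bytes per pixel of literals.
  uint64_t tile_len_worst_case = 4 + (4 * TS2);
  // 44 = three 12 byte chunk headers (QOIR, QPIX, QEND) + 8 byte QOIR payload.
  return (width_in_tiles * height_in_tiles * tile_len_worst_case) + 44;
}
```

For a 512 x 512 image this gives 16 tiles and 16 * 65540 + 44 = 1048684 bytes. The extra `QOIR_TILE_SIZE * QOIR_TILE_SIZE` scratch term from before this commit is no longer needed because the encoder now writes opcodes into its own `5 * QOIR_TS2` byte buffer rather than directly into the destination.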
