-
Notifications
You must be signed in to change notification settings - Fork 14.1k
[ELF] Add --compress-section to compress matched non-SHF_ALLOC sections #84855
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ELF] Add --compress-section to compress matched non-SHF_ALLOC sections #84855
Conversation
Created using spr 1.3.5-bogner
@llvm/pr-subscribers-lld-elf @llvm/pr-subscribers-lld Author: Fangrui Song (MaskRay) Changes--compress-nonalloc-sections <section-glib>=[none|zlib|zstd] is similar An older revision is available at https://reviews.llvm.org/D154641 . Since different output sections may use different compression GNU ld feature request: https://sourceware.org/bugzilla/show_bug.cgi?id=27452 Link: https://discourse.llvm.org/t/rfc-compress-arbitrary-sections-with-ld-lld-compress-sections/71674 Full diff: https://github.com/llvm/llvm-project/pull/84855.diff 10 Files Affected:
diff --git a/lld/ELF/Config.h b/lld/ELF/Config.h
index 691ebfc074320f..40319e23263b52 100644
--- a/lld/ELF/Config.h
+++ b/lld/ELF/Config.h
@@ -223,6 +223,8 @@ struct Config {
bool checkSections;
bool checkDynamicRelocs;
llvm::DebugCompressionType compressDebugSections;
+ llvm::SmallVector<std::pair<llvm::GlobPattern, llvm::DebugCompressionType>, 0>
+ compressNonAllocSections;
bool cref;
llvm::SmallVector<std::pair<llvm::GlobPattern, uint64_t>, 0>
deadRelocInNonAlloc;
diff --git a/lld/ELF/Driver.cpp b/lld/ELF/Driver.cpp
index de4b2e345ac917..816eb77bb178d8 100644
--- a/lld/ELF/Driver.cpp
+++ b/lld/ELF/Driver.cpp
@@ -1516,6 +1516,23 @@ static void readConfigs(opt::InputArgList &args) {
}
}
+ for (opt::Arg *arg : args.filtered(OPT_compress_nonalloc_sections)) {
+ SmallVector<StringRef, 0> fields;
+ StringRef(arg->getValue()).split(fields, '=');
+ if (fields.size() != 2 || fields[1].empty()) {
+ error(arg->getSpelling() +
+ ": parse error, not 'section-glob=[none|zlib|zstd]'");
+ continue;
+ }
+ auto type = getCompressionType(fields[1], arg->getSpelling());
+ if (Expected<GlobPattern> pat = GlobPattern::create(fields[0])) {
+ config->compressNonAllocSections.emplace_back(std::move(*pat), type);
+ } else {
+ error(arg->getSpelling() + ": " + toString(pat.takeError()));
+ continue;
+ }
+ }
+
for (opt::Arg *arg : args.filtered(OPT_z)) {
std::pair<StringRef, StringRef> option =
StringRef(arg->getValue()).split('=');
diff --git a/lld/ELF/Options.td b/lld/ELF/Options.td
index c10a73e2d9c36f..568e845b735b83 100644
--- a/lld/ELF/Options.td
+++ b/lld/ELF/Options.td
@@ -67,6 +67,10 @@ defm compress_debug_sections:
Eq<"compress-debug-sections", "Compress DWARF debug sections">,
MetaVarName<"[none,zlib,zstd]">;
+defm compress_nonalloc_sections: EEq<"compress-nonalloc-sections",
+ "Compress non-SHF_ALLOC output sections matching <section-glob>">,
+ MetaVarName<"<section-glob>=[none|zlib|zstd]">;
+
defm defsym: Eq<"defsym", "Define a symbol alias">, MetaVarName<"<symbol>=<value>">;
defm optimize_bb_jumps: BB<"optimize-bb-jumps",
diff --git a/lld/ELF/OutputSections.cpp b/lld/ELF/OutputSections.cpp
index ee937418678707..0928f5e711b442 100644
--- a/lld/ELF/OutputSections.cpp
+++ b/lld/ELF/OutputSections.cpp
@@ -326,17 +326,29 @@ static SmallVector<uint8_t, 0> deflateShard(ArrayRef<uint8_t> in, int level,
}
#endif
-// Compress section contents if this section contains debug info.
+// Compress certain non-SHF_ALLOC sections:
+//
+// * (--compress-debug-sections is specified) non-empty .debug_* sections
+// * (--compress-nonalloc-sections is specified) matched sections
template <class ELFT> void OutputSection::maybeCompress() {
using Elf_Chdr = typename ELFT::Chdr;
(void)sizeof(Elf_Chdr);
+ if (flags & SHF_ALLOC)
+ return;
- // Compress only DWARF debug sections.
- if (config->compressDebugSections == DebugCompressionType::None ||
- (flags & SHF_ALLOC) || !name.starts_with(".debug_") || size == 0)
+ DebugCompressionType type = DebugCompressionType::None;
+ if (config->compressDebugSections != DebugCompressionType::None &&
+ name.starts_with(".debug_") && size) {
+ type = config->compressDebugSections;
+ } else {
+ for (auto &[glob, t] : config->compressNonAllocSections)
+ if (glob.match(name))
+ type = t;
+ }
+ if (type == DebugCompressionType::None)
return;
- llvm::TimeTraceScope timeScope("Compress debug sections");
+ llvm::TimeTraceScope timeScope("Compress sections");
compressed.uncompressedSize = size;
auto buf = std::make_unique<uint8_t[]>(size);
// Write uncompressed data to a temporary zero-initialized buffer.
@@ -344,14 +356,21 @@ template <class ELFT> void OutputSection::maybeCompress() {
parallel::TaskGroup tg;
writeTo<ELFT>(buf.get(), tg);
}
+ // The generic ABI specifies "The sh_size and sh_addralign fields of the
+ // section header for a compressed section reflect the requirements of the
+ // compressed section." However, 1-byte alignment has been wildly accepted
+ // and utilized for a long time. Removing alignment padding is particularly
+ // useful when there are many compressed output sections.
+ addralign = 1;
#if LLVM_ENABLE_ZSTD
// Use ZSTD's streaming compression API which permits parallel workers working
// on the stream. See http://facebook.github.io/zstd/zstd_manual.html
// "Streaming compression - HowTo".
- if (config->compressDebugSections == DebugCompressionType::Zstd) {
+ if (type == DebugCompressionType::Zstd) {
// Allocate a buffer of half of the input size, and grow it by 1.5x if
// insufficient.
+ compressed.type = ELFCOMPRESS_ZSTD;
compressed.shards = std::make_unique<SmallVector<uint8_t, 0>[]>(1);
SmallVector<uint8_t, 0> &out = compressed.shards[0];
out.resize_for_overwrite(std::max<size_t>(size / 2, 32));
@@ -424,6 +443,7 @@ template <class ELFT> void OutputSection::maybeCompress() {
}
size += 4; // checksum
+ compressed.type = ELFCOMPRESS_ZLIB;
compressed.shards = std::move(shardsOut);
compressed.numShards = numShards;
compressed.checksum = checksum;
@@ -450,20 +470,18 @@ void OutputSection::writeTo(uint8_t *buf, parallel::TaskGroup &tg) {
if (type == SHT_NOBITS)
return;
- // If --compress-debug-section is specified and if this is a debug section,
- // we've already compressed section contents. If that's the case,
- // just write it down.
+ // If the section is compressed due to
+ // --compress-debug-section/--compress-sections, the content is already known.
if (compressed.shards) {
auto *chdr = reinterpret_cast<typename ELFT::Chdr *>(buf);
+ chdr->ch_type = compressed.type;
chdr->ch_size = compressed.uncompressedSize;
chdr->ch_addralign = addralign;
buf += sizeof(*chdr);
- if (config->compressDebugSections == DebugCompressionType::Zstd) {
- chdr->ch_type = ELFCOMPRESS_ZSTD;
+ if (compressed.type == ELFCOMPRESS_ZSTD) {
memcpy(buf, compressed.shards[0].data(), compressed.shards[0].size());
return;
}
- chdr->ch_type = ELFCOMPRESS_ZLIB;
// Compute shard offsets.
auto offsets = std::make_unique<size_t[]>(compressed.numShards);
diff --git a/lld/ELF/OutputSections.h b/lld/ELF/OutputSections.h
index c7931471a6ed30..016da2237e44d6 100644
--- a/lld/ELF/OutputSections.h
+++ b/lld/ELF/OutputSections.h
@@ -23,6 +23,7 @@ struct PhdrEntry;
struct CompressedData {
std::unique_ptr<SmallVector<uint8_t, 0>[]> shards;
+ uint32_t type = 0;
uint32_t numShards = 0;
uint32_t checksum = 0;
uint64_t uncompressedSize;
@@ -116,12 +117,13 @@ class OutputSection final : public SectionBase {
void sortInitFini();
void sortCtorsDtors();
+ // Used for implementation of --compress-debug-sections and
+ // --compress-nonalloc-sections.
+ CompressedData compressed;
+
private:
SmallVector<InputSection *, 0> storage;
- // Used for implementation of --compress-debug-sections option.
- CompressedData compressed;
-
std::array<uint8_t, 4> getFiller();
};
diff --git a/lld/docs/ld.lld.1 b/lld/docs/ld.lld.1
index e4d39e47f5c5ab..44507180673222 100644
--- a/lld/docs/ld.lld.1
+++ b/lld/docs/ld.lld.1
@@ -164,6 +164,10 @@ to set the compression level to 6.
The compression level is 5.
.El
.Pp
+.It Fl -compress-nonalloc-sections Ns = Ns Ar section-glob=[none|zlib|zstd]
+Compress output sections that match the glob and do not have the SHF_ALLOC flag.
+This is like a generalized
+.Cm --compress-debug-sections.
.It Fl -cref
Output cross reference table. If
.Fl Map
diff --git a/lld/test/ELF/compress-sections-err.s b/lld/test/ELF/compress-nonalloc-sections-err.s
similarity index 62%
rename from lld/test/ELF/compress-sections-err.s
rename to lld/test/ELF/compress-nonalloc-sections-err.s
index 09780380708319..4924bba29575fe 100644
--- a/lld/test/ELF/compress-sections-err.s
+++ b/lld/test/ELF/compress-nonalloc-sections-err.s
@@ -5,8 +5,11 @@
# RUN: ld.lld %t.o --compress-debug-sections=zlib --compress-debug-sections=none -o /dev/null 2>&1 | count 0
# RUN: not ld.lld %t.o --compress-debug-sections=zlib -o /dev/null 2>&1 | \
# RUN: FileCheck %s --implicit-check-not=error:
+# RUN: not ld.lld %t.o --compress-nonalloc-sections=foo=zlib -o /dev/null 2>&1 | \
+# RUN: FileCheck %s --check-prefix=CHECK2 --implicit-check-not=error:
# CHECK: error: --compress-debug-sections: LLVM was not built with LLVM_ENABLE_ZLIB or did not find zlib at build time
+# CHECK2: error: --compress-nonalloc-sections: LLVM was not built with LLVM_ENABLE_ZLIB or did not find zlib at build time
.globl _start
_start:
diff --git a/lld/test/ELF/compress-nonalloc-sections-special.s b/lld/test/ELF/compress-nonalloc-sections-special.s
new file mode 100644
index 00000000000000..b78b542987ddd9
--- /dev/null
+++ b/lld/test/ELF/compress-nonalloc-sections-special.s
@@ -0,0 +1,27 @@
+# REQUIRES: x86, zlib
+
+# RUN: llvm-mc -filetype=obj -triple=x86_64 %s -o %t.o
+# RUN: ld.lld -pie %t.o --compress-nonalloc-sections .strtab=zlib --compress-nonalloc-sections .symtab=zlib -o %t
+# RUN: llvm-readelf -Ss -x .strtab %t 2>&1 | FileCheck %s
+
+# CHECK: nonalloc0 PROGBITS 0000000000000000 [[#%x,]] [[#%x,]] 00 0 0 1
+# CHECK: .symtab SYMTAB 0000000000000000 [[#%x,]] [[#%x,]] 18 C 12 3 1
+# CHECK-NEXT: .shstrtab STRTAB 0000000000000000 [[#%x,]] [[#%x,]] 00 0 0 1
+# CHECK-NEXT: .strtab STRTAB 0000000000000000 [[#%x,]] [[#%x,]] 00 C 0 0 1
+
+## TODO Add compressed SHT_STRTAB/SHT_SYMTAB support to llvm-readelf
+# CHECK: warning: {{.*}}: unable to get the string table for the SHT_SYMTAB section: SHT_STRTAB string table section
+
+# CHECK: Hex dump of section '.strtab':
+# CHECK-NEXT: 01000000 00000000 1a000000 00000000
+# CHECK-NEXT: 01000000 00000000 {{.*}}
+
+.globl _start, g0, g1
+_start:
+l0:
+g0:
+g1:
+
+.section nonalloc0,""
+.quad .text+1
+.quad .text+2
diff --git a/lld/test/ELF/compress-nonalloc-sections.s b/lld/test/ELF/compress-nonalloc-sections.s
new file mode 100644
index 00000000000000..2f161c0fe50c08
--- /dev/null
+++ b/lld/test/ELF/compress-nonalloc-sections.s
@@ -0,0 +1,82 @@
+# REQUIRES: x86, zlib, zstd
+
+# RUN: llvm-mc -filetype=obj -triple=x86_64 %s -o %t.o
+# RUN: ld.lld -pie %t.o -o %t --compress-nonalloc-sections '*0=zlib' --compress-nonalloc-sections '*0=none'
+# RUN: llvm-readelf -SrsX %t | FileCheck %s --check-prefix=CHECK1
+
+# CHECK1: Name Type Address Off Size ES Flg Lk Inf Al
+# CHECK1: foo0 PROGBITS [[#%x,FOO0:]] [[#%x,]] [[#%x,]] 00 A 0 0 8
+# CHECK1-NEXT: foo1 PROGBITS [[#%x,FOO1:]] [[#%x,]] [[#%x,]] 00 A 0 0 8
+# CHECK1-NEXT: .text PROGBITS [[#%x,TEXT:]] [[#%x,]] [[#%x,]] 00 AX 0 0 4
+# CHECK1: nonalloc0 PROGBITS 0000000000000000 [[#%x,]] [[#%x,]] 00 0 0 8
+# CHECK1-NEXT: nonalloc1 PROGBITS 0000000000000000 [[#%x,]] [[#%x,]] 00 0 0 8
+# CHECK1-NEXT: .debug_str PROGBITS 0000000000000000 [[#%x,]] [[#%x,]] 01 MS 0 0 1
+
+# CHECK1: 0000000000000010 0 NOTYPE LOCAL DEFAULT [[#]] (nonalloc0) sym0
+# CHECK1: 0000000000000008 0 NOTYPE LOCAL DEFAULT [[#]] (nonalloc1) sym1
+
+# RUN: ld.lld -pie %t.o -o %t2 --compress-nonalloc-sections '*0=zlib' --compress-nonalloc-sections .debug_str=zstd
+# RUN: llvm-readelf -SrsX -x nonalloc0 -x .debug_str %t2 | FileCheck %s --check-prefix=CHECK2
+
+# CHECK2: Name Type Address Off Size ES Flg Lk Inf Al
+# CHECK2: foo0 PROGBITS [[#%x,FOO0:]] [[#%x,]] [[#%x,]] 00 A 0 0 8
+# CHECK2-NEXT: foo1 PROGBITS [[#%x,FOO1:]] [[#%x,]] [[#%x,]] 00 A 0 0 8
+# CHECK2-NEXT: .text PROGBITS [[#%x,TEXT:]] [[#%x,]] [[#%x,]] 00 AX 0 0 4
+# CHECK2: nonalloc0 PROGBITS 0000000000000000 [[#%x,]] [[#%x,]] 00 C 0 0 1
+# CHECK2-NEXT: nonalloc1 PROGBITS 0000000000000000 [[#%x,]] [[#%x,]] 00 0 0 8
+# CHECK2-NEXT: .debug_str PROGBITS 0000000000000000 [[#%x,]] [[#%x,]] 01 MSC 0 0 1
+
+# CHECK2: 0000000000000010 0 NOTYPE LOCAL DEFAULT [[#]] (nonalloc0) sym0
+# CHECK2: 0000000000000008 0 NOTYPE LOCAL DEFAULT [[#]] (nonalloc1) sym1
+
+# CHECK2: Hex dump of section 'nonalloc0':
+## zlib with ch_size=0x10
+# CHECK2-NEXT: 01000000 00000000 10000000 00000000
+# CHECK2-NEXT: 01000000 00000000 {{.*}}
+# CHECK2: Hex dump of section '.debug_str':
+## zstd with ch_size=0x38
+# CHECK2-NEXT: 02000000 00000000 38000000 00000000
+# CHECK2-NEXT: 01000000 00000000 {{.*}}
+
+# RUN: not ld.lld --compress-nonalloc-sections=foo %t.o -o /dev/null 2>&1 | \
+# RUN: FileCheck %s --check-prefix=ERR1 --implicit-check-not=error:
+# ERR1: error: --compress-nonalloc-sections: parse error, not 'section-glob=[none|zlib|zstd]'
+
+# RUN: not ld.lld --compress-nonalloc-sections 'a[=zlib' %t.o -o /dev/null 2>&1 | \
+# RUN: FileCheck %s --check-prefix=ERR2 --implicit-check-not=error:
+# ERR2: error: --compress-nonalloc-sections: invalid glob pattern, unmatched '['
+
+# RUN: not ld.lld %t.o -o /dev/null --compress-nonalloc-sections='.debug*=zlib-gabi' --compress-nonalloc-sections='.debug*=' 2>&1 | \
+# RUN: FileCheck -check-prefix=ERR3 %s
+# ERR3: unknown --compress-nonalloc-sections value: zlib-gabi
+# ERR3-NEXT: --compress-nonalloc-sections: parse error, not 'section-glob=[none|zlib|zstd]'
+
+.globl _start
+_start:
+ leaq __start_foo0(%rip), %rax
+ leaq __stop_foo0(%rip), %rax
+ ret
+
+.section foo0,"a"
+.balign 8
+.quad .text-.
+.quad .text-.
+.section foo1,"a"
+.balign 8
+.quad .text-.
+.quad .text-.
+.section nonalloc0,""
+.balign 8
+.quad .text
+.quad .text
+sym0:
+.section nonalloc1,""
+.balign 8
+.quad 42
+sym1:
+
+.section .debug_str,"MS",@progbits,1
+.Linfo_string0:
+ .asciz "AAAAAAAAAAAAAAAAAAAAAAAAAAA"
+.Linfo_string1:
+ .asciz "BBBBBBBBBBBBBBBBBBBBBBBBBBB"
diff --git a/lld/test/ELF/linkerscript/compress-nonalloc-sections.s b/lld/test/ELF/linkerscript/compress-nonalloc-sections.s
new file mode 100644
index 00000000000000..74b93eef442980
--- /dev/null
+++ b/lld/test/ELF/linkerscript/compress-nonalloc-sections.s
@@ -0,0 +1,62 @@
+# REQUIRES: x86, zlib
+
+# RUN: rm -rf %t && split-file %s %t && cd %t
+# RUN: llvm-mc -filetype=obj -triple=x86_64 a.s -o a.o
+# RUN: ld.lld -T a.lds a.o --compress-nonalloc-sections nonalloc=zlib --compress-nonalloc-sections str=zlib -o out
+# RUN: llvm-readelf -SsXz -p str out | FileCheck %s
+
+# CHECK: Name Type Address Off Size ES Flg Lk Inf Al
+# CHECK: nonalloc PROGBITS 0000000000000000 [[#%x,]] [[#%x,]] 00 C 0 0 1
+# CHECK-NEXT: str PROGBITS 0000000000000000 [[#%x,]] [[#%x,]] 01 MSC 0 0 1
+
+# CHECK: 0000000000000000 0 NOTYPE GLOBAL DEFAULT [[#]] (nonalloc) nonalloc_start
+# CHECK: 0000000000000023 0 NOTYPE GLOBAL DEFAULT [[#]] (nonalloc) nonalloc_end
+# CHECK: String dump of section 'str':
+# CHECK-NEXT: [ 0] AAA
+# CHECK-NEXT: [ 4] BBB
+
+## TODO The uncompressed size of 'nonalloc' is dependent on linker script
+## commands, which is not handled. We should report an error.
+# RUN: ld.lld -T b.lds a.o --compress-nonalloc-sections nonalloc=zlib
+
+#--- a.s
+.globl _start
+_start:
+ ret
+
+.section nonalloc0,""
+.balign 8
+.quad .text
+.quad .text
+.section nonalloc1,""
+.balign 8
+.quad 42
+
+.section str,"MS",@progbits,1
+ .asciz "AAA"
+ .asciz "BBB"
+
+#--- a.lds
+SECTIONS {
+ .text : { *(.text) }
+ c = SIZEOF(.text);
+ b = c+1;
+ a = b+1;
+ nonalloc : {
+ nonalloc_start = .;
+## In general, using data commands is error-prone. This case is correct, though.
+ *(nonalloc*) QUAD(SIZEOF(.text))
+ . += a;
+ nonalloc_end = .;
+ }
+ str : { *(str) }
+}
+
+#--- b.lds
+SECTIONS {
+ nonalloc : { *(nonalloc*) . += a; }
+ .text : { *(.text) }
+ a = b+1;
+ b = c+1;
+ c = SIZEOF(.text);
+}
|
I'd prefer |
Thanks for your feedback. I'm eager to hear others' as well. If we go with --compress-sections, we need a warning when a SHF_ALLOC section is matched. If we use |
It seems a bit inconsistent to me that, say, |
|
And still, if the name template in the |
…tests. --compress-debug-sections=none wins Created using spr 1.3.5-bogner
Sorry, I was confused. I think you are talking about the option precedence. I agree that --compress-debug-sections should win, even if it is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks
This patch has been reviewed last July and we have decided to have at least the non-SHF_ALLOC form. Notes:
If we adopt the following code from https://reviews.llvm.org/D154641 // See the comment in finalizeAddressDependentContent.
if (sec->compressed.shards) {
if (sec->size != sec->compressed.uncompressedSize)
fatal("uncompressed size of SHF_COMPRESSED section '" + sec->name +
"' is dependent on linker script commands");
sec->size = savedSize;
dot = savedDot2 + savedSize;
} The new |
…tion Created using spr 1.3.5-bogner
I think the only concern about the "warn for SHF_ALLOC match" behavior is if someone wanted to use a catch-all (or otherwise very wide) pattern with the expectation of it omitting SHF_ALLOC sections whose names match. But I think that's a new outlying use that we can address in later iterations. The warning behavior seems like a conservative and safe way to add the feature initially. |
Created using spr 1.3.5-bogner
zstd excels at scaling from low-ratio-very-fast to high-ratio-pretty-slow. Some users prioritize speed and prefer disk read speed, while others focus on achieving the highest compression ratio possible, similar to traditional high-ratio codecs like LZMA. Add an optional `level` to `--compress-sections` (#84855) to cater to these diverse needs. While we initially aimed for a one-size-fits-all approach, this no longer seems to work. (https://richg42.blogspot.com/2015/11/the-lossless-decompression-pareto.html) When --compress-debug-sections is used together, make --compress-sections take precedence since --compress-sections is usually more specific. Remove the level distinction between -O/-O1 and -O2 for --compress-debug-sections=zlib for a more consistent user experience. Pull Request: #90567
--compress-sections =[none|zlib|zstd] is similar to
--compress-debug-sections but applies to broader sections without the
SHF_ALLOC flag. lld will report an error if a SHF_ALLOC section is
matched. An interesting use case is to compress
.strtab
/.symtab
,which consume a significant portion of the file size (15.1% for a
release build of Clang).
An older revision is available at https://reviews.llvm.org/D154641 .
This patch focuses on non-allocated sections for safety. Moving
maybeCompress
as D154641 does not handle STT_SECTION symbols for-r --compress-debug-sections=zlib
(seerelocatable-section-symbol.s
from #66804).
Since different output sections may use different compression
algorithms, we need CompressedData::type to generalize
config->compressDebugSections.
GNU ld feature request: https://sourceware.org/bugzilla/show_bug.cgi?id=27452
Link: https://discourse.llvm.org/t/rfc-compress-arbitrary-sections-with-ld-lld-compress-sections/71674