Commit ebc65d1

abrown and alexcrichton authored
asm: introduce a new x64 assembler (#10110)
* asm: add initial infrastructure for an external assembler

This change adds some initial logic implementing an external assembler for Cranelift's x64 backend, as proposed in RFC [#41]. This adds two crates:
- the `cranelift/assembler/meta` crate defines the instructions; to print out the defined instructions use `cargo run -p cranelift-assembler-meta`
- the `cranelift/assembler` crate exposes the generated Rust code for those instructions; to see the path to the generated code use `cargo run -p cranelift-assembler`

The assembler itself is straightforward enough (modulo the code generation, of course); its integration into `cranelift-codegen` is what is most tricky about this change. Instructions that we will emit in the new assembler are contained in the `Inst::External` variant. This unfortunately increases the memory size of `Inst`, but only temporarily if we end up removing the extra `enum` indirection by adopting the new assembler wholesale. Another integration point is ISLE: we generate ISLE definitions and a Rust helper macro to make the external assembler instructions accessible to ISLE lowering.

This change introduces some duplication: the encoding logic (e.g. for REX instructions) currently lives both in `cranelift-codegen` and the new assembler crate. The `Formatter` logic for the assembler `meta` crate is quite similar to the other `meta` crate. This minimal duplication felt worth the additional safety provided by the new assembler.

The `cranelift-assembler` crate is fuzzable (see the `README.md`). It will generate instructions with randomized operands and compare their encoding and pretty-printed string to a known-good disassembler, currently `capstone`. This gives us confidence we previously didn't have regarding emission. In the future, we may want to think through how to fuzz (or otherwise check) the integration between `cranelift-codegen` and this new assembler level.

[#41]: bytecodealliance/rfcs#41

* asm: bless Cranelift file tests

Using the new assembler's pretty-printing results in slightly different disassembly of compiled CLIF. This is because the assembler matches a certain configuration of `capstone`, causing the following obvious differences:
- instructions with only two operands only print two operands; the original `MInst` instructions separate out the read-write operand into two separate operands (SSA-like)
- the original instructions have some space padding after the instruction mnemonic, those from the new assembler do not

This change uses the slightly new style as-is, but this is open for debate; we can change the configuration of `capstone` that we fuzz against. My only preferences would be to (1) retain some way to visually distinguish the new assembler instructions in the disassembly (temporarily, for debugging) and (2) eventually transition to pretty-printing instructions in Intel-style (`rw, r`) instead of the current (`r, rw`).

* ci: skip formatting when `rustfmt` not present

Though it is likely that `rustfmt` is present in a Rust environment, some CI tasks do not have this tool installed. To handle this case (plus the chance that other Wasmtime builds are similar), this change skips formatting with a `stderr` warning when `rustfmt` fails.

* vet: audit `arbtest` for use as a dev-dependency

* ci: make assembler crates publishable

In order to satisfy `ci/publish.rs`, it would appear that we need to use a version that matches the rest of the Cranelift crates.

* review: use Cargo workspace values
* review: document `Inst`, move `Inst::name`
* review: clarify 'earlier' doc comment
* review: document multi-byte opcodes
* review: document `Rex` builder methods
* review: document encoding rules
* review: clarify 'bits' -> 'width'
* review: clarify confusing legacy prefixes
* review: tweak IA-32e language
* review: expand documentation for format
* review: move feature list closer to enum
* review: add a TODO to remove AT&T operand ordering
* review: move prefix emission to separate lines
* review: add testing note
* review: fix incomplete sentence
* review: rename `MinusRsp` to `NonRspGpr`
* review: add TODO for commented out instructions
* review: add conservative down-conversion to `is_imm*`

* Fuzzing updates for cranelift-assembler-x64 (#10)
  * Fuzzing updates for cranelift-assembler-x64
  * Ensure fuzzers build on CI
  * Move fuzz crate into the main workspace
  * Move `fuzz.rs` support code directly into fuzzer
  * Move `capstone` dependency into the fuzzer
  * Make `arbitrary` an optional dependency

    Shuffle around a few things in a few locations for this.
  * vet: skip audit for `cranelift-assembler-x64-fuzz`

  Co-authored-by: Alex Crichton <[email protected]>

* review: use 32-bit form for 8-bit and 16-bit reg-reg

Cranelift's existing lowering for 8-bit and 16-bit reg-reg `AND` used the wider version of the instruction: the 32-bit reg-reg `AND`. As pointed out by @cfallin [here], this was likely done to avoid partial register stalls. This change keeps that lowering by distinguishing more precisely between `GprMemImm` that are in register or memory.

[here]: #10110 (comment)

---------

Co-authored-by: Alex Crichton <[email protected]>
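To make the REX encoding logic mentioned above a little more concrete, here is a minimal sketch of how the 32-bit register-to-register `AND` (opcode `0x21 /r`) could be encoded by hand. The function name `encode_andl_rr` and its signature are hypothetical and are not taken from `cranelift-assembler-x64`; the sketch only follows the standard x64 REX and ModRM layout.

```rust
/// Hypothetical helper (not part of `cranelift-assembler-x64`): encode the
/// 32-bit register-to-register `AND r/m32, r32`, opcode `0x21 /r`.
/// `dst` and `src` are hardware register numbers 0..=15 (0 = eax, 8 = r8d).
pub fn encode_andl_rr(dst: u8, src: u8) -> Vec<u8> {
    assert!(dst < 16 && src < 16, "register number out of range");
    let mut bytes = Vec::with_capacity(3);

    // REX has the bit layout 0b0100WRXB. W stays 0 for a 32-bit operand size;
    // R extends the ModRM.reg field (src) and B extends ModRM.rm (dst).
    let rex = 0x40 | ((src >> 3) << 2) | (dst >> 3);
    if rex != 0x40 {
        // Only emit the prefix when an extended register (r8d..r15d) is used.
        bytes.push(rex);
    }

    // Opcode: AND r/m32, r32.
    bytes.push(0x21);

    // ModRM: mod = 0b11 (register-direct), reg = src, rm = dst.
    bytes.push(0b1100_0000 | ((src & 7) << 3) | (dst & 7));
    bytes
}
```

Under these assumptions, `encode_andl_rr(0, 1)` yields the two bytes `21 c8`, and using `r8d`/`r9d` adds a one-byte REX prefix in front.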
1 parent 505b3c6 commit ebc65d1

60 files changed: +4215 -156 lines changed

.github/workflows/main.yml

Lines changed: 1 addition & 0 deletions
@@ -634,6 +634,7 @@ jobs:
 - run: cargo fuzz build --dev
 - run: cargo fuzz build --dev --fuzz-dir ./cranelift/isle/fuzz
 - run: cargo fuzz build --dev --fuzz-dir ./crates/environ/fuzz --features component-model
+- run: cargo fuzz build --dev --fuzz-dir ./cranelift/assembler-x64/fuzz
 
 # common logic to cancel the entire run if this job fails
 - uses: ./.github/actions/cancel-on-failure

Cargo.lock

Lines changed: 35 additions & 0 deletions
Some generated files are not rendered by default.

Cargo.toml

Lines changed: 1 addition & 0 deletions
@@ -139,6 +139,7 @@ opt-level = 0
 resolver = '2'
 members = [
 "cranelift",
+"cranelift/assembler-x64/fuzz",
 "cranelift/isle/fuzz",
 "cranelift/isle/islec",
 "cranelift/isle/veri/veri_engine",

cranelift/assembler-x64/Cargo.toml

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+[package]
+name = "cranelift-assembler-x64"
+description = "A Cranelift-specific x64 assembler"
+version = "0.117.0"
+license = "Apache-2.0 WITH LLVM-exception"
+edition.workspace = true
+rust-version.workspace = true
+
+[dependencies]
+arbitrary = { workspace = true, features = ["derive"], optional = true }
+
+[dev-dependencies]
+arbtest = "0.3.1"
+
+[build-dependencies]
+cranelift-assembler-x64-meta = { path = "meta" }
+
+[lints.clippy]
+all = "deny"
+pedantic = "warn"
+module_name_repetitions = { level = "allow", priority = 1 }
+similar_names = { level = "allow", priority = 1 }
+wildcard_imports = { level = "allow", priority = 1 }
+
+[features]
+arbitrary = ['dep:arbitrary']

cranelift/assembler-x64/README.md

Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
+# `cranelift-assembler-x64`
+
+A Cranelift-specific x64 assembler. Unlike the existing `cranelift-codegen`
+assembler, this assembler uses instructions, not instruction classes, as the
+core abstraction.
+
+### Use
+
+Like `cranelift-codegen`, using this assembler starts with `enum Inst`. For
+convenience, a `main.rs` script prints the path to this generated code:
+
+```console
+$ cat $(cargo run)
+#[derive(arbitrary::Arbitrary, Debug)]
+pub enum Inst {
+    andb_i(andb_i),
+    andw_i(andw_i),
+    andl_i(andl_i),
+    ...
+```
+
+### Test
+
+In order to check that this assembler emits correct machine code, we fuzz it
+against a known-good disassembler. We can run a quick, one-second check:
+
+```console
+$ cargo test -- --nocapture
+```
+
+Or we can run the fuzzer indefinitely:
+
+```console
+$ cargo +nightly fuzz run -s none roundtrip -j16
+```
+
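The round-trip idea described in the README's Test section (generate random operands, encode them, and compare against an independent decoder) can be sketched without any external crates. This is only an illustration of the pattern: the real fuzz target relies on `arbitrary` and `capstone`, whereas everything below (`XorShift`, `encode_modrm`, `decode_modrm`, `roundtrip`) is a hypothetical stand-in that checks a single register-direct ModRM byte.

```rust
/// Tiny xorshift PRNG, a stand-in for `arbitrary`-driven input generation.
pub struct XorShift(pub u64);

impl XorShift {
    pub fn next_u64(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }
}

/// "Assembler" side: pack a register-direct ModRM byte (mod = 0b11).
pub fn encode_modrm(reg: u8, rm: u8) -> u8 {
    0b1100_0000 | ((reg & 7) << 3) | (rm & 7)
}

/// "Reference" side: an independently written decoder, playing the role that
/// a known-good disassembler plays in the real fuzzer.
pub fn decode_modrm(byte: u8) -> Option<(u8, u8)> {
    if byte >> 6 != 0b11 {
        return None; // not register-direct; out of scope for this sketch
    }
    Some(((byte >> 3) & 7, byte & 7))
}

/// Encode random operands, decode them again, and report the first mismatch.
pub fn roundtrip(seed: u64, iterations: u32) -> Result<(), String> {
    let mut rng = XorShift(seed);
    for _ in 0..iterations {
        let bits = rng.next_u64();
        let reg = (bits & 7) as u8;
        let rm = ((bits >> 3) & 7) as u8;
        let byte = encode_modrm(reg, rm);
        if decode_modrm(byte) != Some((reg, rm)) {
            return Err(format!("mismatch: reg={reg} rm={rm} byte={byte:#04x}"));
        }
    }
    Ok(())
}
```

Keeping the encoder and the decoder as separately written functions is the point of the exercise: a bug in one side is unlikely to be mirrored exactly in the other.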

cranelift/assembler-x64/build.rs

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+use cranelift_assembler_x64_meta as meta;
+use std::env;
+use std::path::Path;
+
+fn main() {
+    println!("cargo:rerun-if-changed=build.rs");
+
+    let out_dir = env::var("OUT_DIR").expect("The OUT_DIR environment variable must be set");
+    let out_dir = Path::new(&out_dir);
+    let built_files = [
+        meta::generate_rust_assembler(out_dir.join("assembler.rs")),
+        meta::generate_isle_macro(out_dir.join("assembler-isle-macro.rs")),
+        meta::generate_isle_definitions(out_dir.join("assembler-definitions.isle")),
+    ];
+
+    println!(
+        "cargo:rustc-env=ASSEMBLER_BUILT_FILES={}",
+        built_files
+            .iter()
+            .map(|p| p.to_string_lossy().to_string())
+            .collect::<Vec<_>>()
+            .join(":")
+    );
+}
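The build script above publishes the generated file paths through the `ASSEMBLER_BUILT_FILES` variable as a single `:`-joined string. As a rough sketch of the consuming side, the value could be split back into paths as shown below. The helper name `split_built_files` is hypothetical, and the sketch assumes that no path itself contains a `:` character.

```rust
use std::path::PathBuf;

/// Hypothetical helper: split a ':'-joined list of paths, in the format the
/// build script writes to `ASSEMBLER_BUILT_FILES`.
/// Assumption: no individual path contains a ':' character.
pub fn split_built_files(joined: &str) -> Vec<PathBuf> {
    joined
        .split(':')
        .filter(|part| !part.is_empty()) // tolerate an empty value
        .map(PathBuf::from)
        .collect()
}
```

Inside the crate that owns the build script, the input would presumably come from `env!("ASSEMBLER_BUILT_FILES")`, since `cargo:rustc-env` makes the variable available at compile time.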
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+target
+corpus
+artifacts
+coverage

cranelift/assembler-x64/fuzz/Cargo.lock

Lines changed: 148 additions & 0 deletions
Some generated files are not rendered by default.
Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+[package]
+name = "cranelift-assembler-x64-fuzz"
+version = "0.0.0"
+publish = false
+edition.workspace = true
+rust-version.workspace = true
+
+[package.metadata]
+cargo-fuzz = true
+
+[dependencies]
+libfuzzer-sys = { workspace = true }
+cranelift-assembler-x64 = { path = "..", features = ['arbitrary'] }
+capstone = { workspace = true }
+arbitrary = { workspace = true, features = ['derive'] }
+
+[[bin]]
+name = "roundtrip"
+path = "fuzz_targets/roundtrip.rs"
+test = false
+doc = false
+bench = false
