Encoder mishandles 16-bit address truncation behavior #471

mappzor · 2024-01-09T16:05:35Z

Let's consider two instructions in 32-bit mode:

00 3d 00 ff ff ff       add byte ptr ds:[0xFFFFFF00], bh
67 00 3e 00 ff          add byte ptr ds:[0x0000FF00], bh

Attempting to re-encode both of them results in 67 00 3e 00 ff add byte ptr ds:[0x0000FF00], bh. This presents a unique problem as it's not just a matter of buggy logic. In both cases disassembled instructions have matching operands (fuzzer couldn't catch this) with full 64-bit displacement 0xFFFFFFFFFFFFFF00. The only difference is effective address size (attribute of the whole instruction not just a single operand), so it's all about interpreting the operand correctly.

I see two ways of fixing this:

require 16-bit absolute addresses to be truncated to 16-bits by the user - I don't like this, it's very clunky and goes against the rule that displacements are uniformly signed, it's also just a terrible API
use address size hints - ZydisEncoderDecodedInstructionToEncoderRequest always sets those hints anyway based on effective address size, currently they are ignored unless instruction scales with address size (hidden scaling). This hint can be used in additional context to resolve disambiguity between 16 and 32-bit absolute addresses. I think it's intuitive that address size hint covers all issues regarding address size. It's also a very natural place to document this properly. Lack of hint would indicate that native size is preferred (16 in 16-bit modes, 32 bits otherwise)

Currently I'm exploring 2nd option as it seems the most viable. I'm also considering to get rid of another clunky 16-bit behavior. Consider this in 16-bit mode:

67 3A 05 00 1E 57 1A    cmp al, byte ptr ds:[0x1E00]

Displacement is 0x1A571E00 after disassembling but it gets truncated to 16 bits inside ZydisCalcAbsoluteAddress during formatting. Currently encoder ignores everything above 16 bits to re-encode this properly. I think encoder shouldn't accept this and it would be better to make ZydisEncoderDecodedInstructionToEncoderRequest responsible for sanitization.

The text was updated successfully, but these errors were encountered:

tremalrik · 2024-01-11T21:30:31Z

These examples indicate that it's not just the encoder that's mishandling 16-bit address truncation, but the formatter as well.

While displacements can always be treated as signed in x86, address truncation is unsigned. Also - and this is the part that Zydis doesn't appear to handle correctly - the 67h prefix overrides not just address calculation width but address truncation width as well.

As a result:

in 32-bit mode, decoding 67 00 3e 00 ff as add byte ptr ds:[0x0000FF00],bh is correct - in this case, the 67h prefix overrides truncation width from 32-bit down to 16-bit, and the resulting memory access goes to ds:[0x0000FF00], not ds:[0xFFFFFF00]. As a result, re-encoding 00 3d 00 ff ff ff into 67 00 3e 00 ff is not correct, but it would have been correct to re-encode 00 3d 00 ff 00 00 into 67 00 3e 00 ff.
in 16-bit mode, decoding 67 3a 05 00 1e 57 1a as cmp al, byte ptr ds:[0x1E00] is NOT correct! The 67h prefix will, in this case, override address truncation width from 16-bit to 32-bit, so the memory access will go to address ds:[0x1a571e00], not ds:[0x1e00]. As such, the correct decoding is cmp al, byte ptr ds:[0x1a571e00] - which is what e.g. XED outputs. (Attempting to access address ds:[0x1a571e00] in 16-bit mode will normally cause a #GP, due to the data segment size normally being 64K in 16-bit mode, however it is perfectly possible to set up a big 32-bit data segment for unreal-mode or 16-bit protected-mode, in which case such an instruction will execute just fine in 16-bit mode.)

mappzor · 2024-01-11T21:35:00Z

@tremalrik I've addressed this already in #472 :)

Separate PR will soon address rest of the issues.

mappzor added a commit to mappzor/zydis that referenced this issue Jan 12, 2024

Reworked absolute address handling (Fixes zyantific#471)

eead169

mappzor mentioned this issue Jan 12, 2024

Reworked absolute address handling (Fixes #471) #473

Merged

mappzor added a commit to mappzor/zydis that referenced this issue Jan 16, 2024

Reworked absolute address handling (Fixes zyantific#471)

65fadbd

mappzor added a commit to mappzor/zydis that referenced this issue Jan 16, 2024

Reworked absolute address handling (Fixes zyantific#471)

a4efe51

athre0z closed this as completed in #473 Feb 4, 2024

athre0z pushed a commit that referenced this issue Feb 4, 2024

Reworked absolute address handling (Fixes #471)

88f88c6

athre0z pushed a commit that referenced this issue Feb 4, 2024

Reworked absolute address handling (Fixes #471)

d3d573c

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Encoder mishandles 16-bit address truncation behavior #471

Encoder mishandles 16-bit address truncation behavior #471

mappzor commented Jan 9, 2024

tremalrik commented Jan 11, 2024

mappzor commented Jan 11, 2024

Encoder mishandles 16-bit address truncation behavior #471

Encoder mishandles 16-bit address truncation behavior #471

Comments

mappzor commented Jan 9, 2024

tremalrik commented Jan 11, 2024

mappzor commented Jan 11, 2024