Skip to content

RuntimeAllocator: Align returned pointers #8891

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

bkchr
Copy link
Member

@bkchr bkchr commented Jun 17, 2025

Rust recently switched the default alignment of u128 to 16bytes: https://blog.rust-lang.org/2024/03/30/i128-layout-update/ This broke the assumption of our host allocator that the biggest alignment is 8 bytes.

To fix the alignment issue, the runtime allocator now takes care of aligned the returned pointer. We are abusing the fact that we know how the host allocator is working and storing some extra data in its header. This is not a perfect solution as we don't align the host side pointers, but the host side is mainly allocating u8 arrays that should be fine with the 8byte alignment. Any node side change would be a consensus breaking change.

Closes: #8818

Rust recently switched the default alignment of u128 to 16bytes: https://blog.rust-lang.org/2024/03/30/i128-layout-update/
This broke the assumption of our host allocator that the biggest alignment is 8 bytes.

To fix the alignment issue, the runtime allocator now takes care of aligned the returned pointer. Because not only the runtime allocator is allocating pointers, but also the host side can do so, we need to tag the pointers allocated by the runtime allocator.
This is not a perfect solution as we don't align the host side pointers, but the host side is mainly allocating `u8` arrays that should be fine with the `8byte` alignment. Any node side change would be a consensus breaking change.
@bkchr bkchr added the T0-node This PR/Issue is related to the topic “node”. label Jun 17, 2025
@bkchr bkchr requested review from koute and ggwpez June 18, 2025 12:46
@bkchr bkchr added T17-primitives Changes to primitives that are not covered by any other label. and removed T0-node This PR/Issue is related to the topic “node”. labels Jun 18, 2025
//
// As the host side already aligns the pointer by `8`, we only need to account for any
// excess.
let ptr = crate::allocator::malloc((size + align.saturating_sub(8)) as u32);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably check ptr for null here to handle underlying allocator error.

@paritytech-workflow-stopper
Copy link

All GitHub workflows were cancelled due to failure one of the required jobs.
Failed workflow url: https://github.com/paritytech/polkadot-sdk/actions/runs/15742807550
Failed job name: test-linux-stable


unsafe {
let offset = (ptr_offset as u16).to_ne_bytes();
ptr::copy(offset.as_ptr(), ptr.sub(OFFSET_LENGTH), OFFSET_LENGTH);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. write_unaligned might be nicer than copy, e.g. ptr.cast::<u16>().sub(1).write_unaligned(ptr_offset)
  2. It'd be nicer to do something like type Offset = u16 and then do const OFFSET_LENGTH = size_of::<Offset>() so that there's a single source of truth for the type/size of whatever we're writing (if we're using OFFSET_LENGTH).

Comment on lines +33 to +35
/// little-endian `u64`. `0x00000001_00000000` is the representation of an occupied header (aka when
/// it is used, which is the case after calling `alloc`). So, we are able to reclaim two bytes of
/// this header for our use case.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is a little misleading/incomplete I think, since 0x00000001_00000000 is just the mask for the bit whether the slot is occupied. (:

Maybe something like this would be a nicer explanation? (Feel free to modify.)

The allocation header consists of 8 bytes. The first four bytes (as written in memory) are used to store the order of the allocation (or the link to the next slot, if unallocated). Then the least significant bit of the next byte determines whether a given slot is occupied or free, and the last three bytes are unused.

So technically we could use up to 31-bits without clobbering anything needed by the allocator as-is, but since the "is allocated?" flag will always be 1 and since the order could be recalculated (since dealloc gets passed the same layout as alloc does) we could use the full 64-bits if we wanted to (but that'd be an overkill, since we don't really need all of this space).

Comment on lines +76 to +81
let mut offset: [u8; OFFSET_LENGTH] = [0; OFFSET_LENGTH];
unsafe {
ptr::copy(ptr.sub(OFFSET_LENGTH), offset.as_mut_ptr(), OFFSET_LENGTH);
}

let offset = u16::from_ne_bytes(offset);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could also be simplified to something like let offset = unsafe { ptr.cast::<u16>().sub(1).read_unaligned() };

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T17-primitives Changes to primitives that are not covered by any other label.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Benchmark panics in slice::from_raw_parts
3 participants