Skip to content

Fix in erroneous implementation of _mm256_bsrli_epi128 #1823

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Jun 17, 2025
4 changes: 2 additions & 2 deletions crates/core_arch/src/x86/avx2.rs
Original file line number Diff line number Diff line change
Expand Up @@ -191,7 +191,7 @@ pub fn _mm256_alignr_epi8<const IMM8: i32>(a: __m256i, b: __m256i) -> __m256i {
return transmute(a);
}

let r: i8x32 = match IMM8 % 16 {
let r: i8x32 = match IMM8 {
Copy link
Member

@bjorn3 bjorn3 Jun 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have to also add a value to the default arm. It is currently unreachable_unchecked(), which makes IMM8 values > 15 UB.
Edit: Maybe you should change the IMM8 == 16 if above to be IMM8 >= 16? Also there is handling for IMM8 > 16 above already: (_mm256_setzero_si256(), a) which would become unreachable in that case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right, I was mistaken. I think specifically values between 17-31 (inclusive) are UB without the %, which makes it not redundant. My mistake!

0 => simd_shuffle!(
b,
a,
Expand Down Expand Up @@ -2964,7 +2964,7 @@ pub fn _mm256_bsrli_epi128<const IMM8: i32>(a: __m256i) -> __m256i {
unsafe {
let a = a.as_i8x32();
let zero = i8x32::ZERO;
let r: i8x32 = match IMM8 % 16 {
let r: i8x32 = match IMM8 {
0 => simd_shuffle!(
a,
zero,
Expand Down
Loading