Commit Graph

8 Commits

David Wood
92eb4450fa tests: use minicore more
minicore makes it much easier to add new language items to all of the
existing `no_core` tests.
2025-02-24 09:26:54 +00:00
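
As a rough sketch of the pattern this enables (the `add-core-stubs`
directive and minicore's item set are assumptions based on compiletest
conventions, not quoted from this commit), a `no_core` test can import
the shared stubs instead of redeclaring every lang item:

    //@ add-core-stubs
    //@ compile-flags: --target x86_64-unknown-linux-gnu
    //@ needs-llvm-components: x86
    #![feature(no_core, lang_items)]
    #![no_core]
    #![crate_type = "lib"]

    // minicore supplies Copy, Sized, and the other common lang items,
    // so the test body no longer declares each of them by hand.
    extern crate minicore;
    use minicore::*;

    #[no_mangle]
    pub fn answer() -> u32 {
        42
    }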
Jubilee Young
ee111b24e3 tests/assembly: use -Copt-level=3 instead of -O 2025-02-08 19:02:32 -08:00
Jörn Horstmann
3779b8e32e Consistently use the most significant bit of vector masks
This improves the codegen for vector `select`, `gather`, `scatter` and
boolean reduction intrinsics and fixes rust-lang/portable-simd#316.

The current behavior of most mask operations during llvm codegen is to
truncate the mask vector to <N x i1>, telling llvm to use the least
significant bit. The exception is the `simd_bitmask` intrinsic, which
already used the most significant bit.

Since sse/avx instructions are defined to use the most significant bit,
truncating means that llvm has to insert a left shift to move the bit
into the most significant position, before the mask can actually be
used.

Similarly, on aarch64, mask operations like blend work bit by bit;
repeating the least significant bit across the whole lane involves
shifting it into the sign position and then comparing against zero.

By shifting before truncating to <N x i1>, we tell llvm that we only
consider the most significant bit, removing the need for additional
shift instructions in the assembly.
2025-01-26 16:44:23 +01:00
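
For context, a minimal sketch of the kind of operation this affects,
written against std::simd (module paths as on recent nightlies; the
code is illustrative and not part of the commit):

    #![feature(portable_simd)]
    use std::simd::{f32x4, mask32x4};
    use std::simd::cmp::SimdPartialOrd;

    // `select` lowers to an LLVM `select` on a <4 x i1> mask; whether
    // that i1 is taken from the least or the most significant bit of
    // each mask lane decides whether the backend needs an extra shift.
    pub fn zero_negatives(v: f32x4) -> f32x4 {
        let negative: mask32x4 = v.simd_lt(f32x4::splat(0.0));
        negative.select(f32x4::splat(0.0), v)
    }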
Josh Stone
6fd8a50680 Update the minimum external LLVM to 18 2024-09-18 13:53:31 -07:00
Gary Guo
5812b1fd12 Remove c_unwind from tests and fix tests 2024-06-19 13:54:55 +01:00
Josh Stone
1b79bb937f Add inline comments why we're forcing the target cpu 2024-05-01 16:54:20 -07:00
Josh Stone
706f06c39a Use an explicit x86-64 cpu in tests that are sensitive to it
There are a few tests that depend on some target features **not** being
enabled by default, and usually they are correct with the default x86-64
target CPU. However, in downstream builds we have modified the default
to fit our distros -- `x86-64-v2` in RHEL 9 and `x86-64-v3` in RHEL 10
-- and the latter especially trips tests that expect not to have AVX.

These cases are few enough that we can just set them back explicitly.
2024-05-01 15:25:26 -07:00
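
A minimal sketch of what that pinning looks like in an assembly test
(the compiletest directives shown are assumptions about the general
shape, not the exact headers of the affected tests):

    //@ assembly-output: emit-asm
    //@ only-x86_64
    // Pin the CPU explicitly so a distro default such as x86-64-v3
    // (which enables AVX) cannot change the expected instructions.
    //@ compile-flags: -Copt-level=3 -C target-cpu=x86-64
    #![crate_type = "lib"]

    // CHECK-LABEL: plain_copy:
    #[no_mangle]
    pub extern "C" fn plain_copy(x: u64) -> u64 {
        x
    }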
Jörn Horstmann
e91f937779 Add tests for the generated assembly of mask-related simd instructions.
The tests show that the code generation currently uses the least
significant bits of <iX x N> vector masks when converting to <N x i1>.
This leads to an additional left shift operation in the assembly for
x86, since mask operations on x86 operate based on the most significant
bit. On aarch64 the left shift is followed by a comparison against zero,
which repeats the sign bit across the whole lane.

The exception, which does not introduce an unneeded shift, is
simd_bitmask, because the code generation already shifts before
truncating.

By using the "C" calling convention, the tests should be stable with
respect to changes in register allocation, but it is possible that
future llvm updates will require updating some of the checks.

This additional instruction would be removed by the fix in #104693,
which uses the most significant bit for all mask operations.
2024-03-12 08:52:54 +01:00
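
A hedged sketch of the test shape described above, using portable-simd
types in place of the tests' own `repr(simd)` declarations (the CHECK
comment is illustrative only):

    #![feature(portable_simd)]
    use std::simd::{f32x4, mask32x4};

    // Under the SysV x86-64 "C" ABI, 128-bit vector arguments are
    // typically passed in xmm registers, so FileCheck patterns can name
    // registers without depending on rustc's unstable internal ABI.
    // CHECK-LABEL: mask_select:
    #[no_mangle]
    pub extern "C" fn mask_select(m: mask32x4, a: f32x4, b: f32x4) -> f32x4 {
        m.select(a, b)
    }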