tests: use minicore more
minicore makes it much easier to add new language items to all of the existing `no_core` tests.
Most of the remaining tests that *could* use minicore either fail because..
1. LLVM IR output changes and doesn't pass the test as written. I didn't look into these further.
2. The test has revisions w/ different compilation flags, expecting some to fail, and when using minicore, minicore is compiled with those flags and fails in the expected way because of the flags rather than the test, and that's considered a failure.
But these tests can be changed and make adding new language items a lot easier.
r? ```@jieyouxu```
import `simd_` intrinsics
In most cases, we can import the simd intrinsics rather than redeclare them. Apparently, most of these tests were written before `std::intrinsics::simd` existed.
There are a couple of exceptions where we can't yet import:
- the intrinsics are not declared as `const fn` in the standard library, causing issues in the `const-eval` tests
- the `simd_shuffle_generic` function is not exposed from `std::intrinsics`
- the `simd_fpow` and `simd_fpowi` functions are not exposed from `std::intrinsics` (removed in https://github.com/rust-lang/rust/pull/137595)
- some tests use `no_core`, and therefore cannot use `std::intrinsics`
r? ```@RalfJung```
cc ```@workingjubilee``` do you have context on why some intrinsics are not exposed?
Update some comparison codegen tests now that they pass in LLVM20
Fixes#106107
Needed one tweak to the default `PartialOrd::le` to get the test to pass. Everything but the derived 2-field `le` test passes even without the change to the defaults in the trait.
Same motivation as #136287, but for a newly introduced test. Rather than
over-constraining here, we just match the sret and accept pretty much
all other attributes.
@rustbot label llvm-main
r? @nikic
remove `simd_fpow` and `simd_fpowi`
Discussed in https://github.com/rust-lang/rust/issues/137555
These functions are not exposed from `std::intrinsics::simd`, and not used anywhere outside of the compiler. They also don't lower to particularly good code at least on the major ISAs (I checked x86_64, aarch64, s390x, powerpc), where the vector is just spilled to the stack and scalar functions are used for the actual logic.
r? `@RalfJung`
Tighten `str-to-string-128690.rs``CHECK{,-NOT}`s to make it less likely to incorrectly fail with symbol name mangling
The `invoke` to match on to `CHECK` or `CHECK-NOT` (latest master) looks like
```llvm
%_0.i.i.i.i.i.i.i.i.i.i.i.i.i1.i = invoke noundef zeroext i1 ``@"_ZN42_$LT$str$u20$as$u20$core..fmt..Display$GT$3fmt17ha18033e7fb4f14fcE"(ptr`` noalias noundef nonnull readonly align 1 %_3.val.i.i.i.i.i.i.i.i.i.i.i.i.i, i64 noundef %_3.val1.i.i.i.i.i.i.i.i.i.i.i.i.i, ptr noalias noundef nonnull align 8 dereferenceable(64) %formatter.i)
to label %bb1.i unwind label %cleanup.i, !noalias !80
```
in the local `.ll` output.
This test incorrectly failed in https://github.com/rust-lang/rust/pull/137483#issuecomment-2676925819 due to
```
// CHECK-NOT: {{(call|invoke).*}}fmt
```
matching against the unrelated call
```llvm
tail call void ``@_RNvNtCseLfmtnDCoTB_5alloc7raw_vec12handle_error``
```
It's not pretty by any means, but...
r? ``@saethlin``
Emit getelementptr inbounds nuw for pointer::add()
Lower pointer::add (via intrinsic::offset with unsigned offset) to getelementptr inbounds nuw on LLVM versions that support it. This lets LLVM make use of the pre-condition that the offset addition does not wrap in an unsigned sense. Together with inbounds, this also implies that the offset is non-negative.
Fixes https://github.com/rust-lang/rust/issues/137217.
intrinsics: unify rint, roundeven, nearbyint in a single round_ties_even intrinsic
LLVM has three intrinsics here that all do the same thing (when used in the default FP environment). There's no reason Rust needs to copy that historically-grown mess -- let's just have one intrinsic and leave it up to the LLVM backend to decide how to lower that.
Suggested by `@hanna-kruppe` in https://github.com/rust-lang/rust/issues/136459; Cc `@tgross35`
try-job: test-various
Reduce `Box::default` stack copies in debug mode
The `Box::new(T::default())` implementation of `Box::default` only
had two stack copies in debug mode, compared to the current version,
which has four. By avoiding creating any `MaybeUninit<T>`'s and just writing
`T` directly to the `Box` pointer, the stack usage in debug mode remains
the same as the old version.
Another option would be to mark `Box::write` as `#[inline(always)]`,
and change it's implementation to to avoid calling `MaybeUninit::write`
(which creates a `MaybeUninit<T>` on the stack) and to use `ptr::write` instead.
Fixes: #136043
Do not ignore uninhabited types for function-call ABI purposes. (Remove BackendRepr::Uninhabited)
Accepted MCP: https://github.com/rust-lang/compiler-team/issues/832Fixes#135802
Do not consider the inhabitedness of a type for function call ABI purposes.
* Remove the [`rustc_abi::BackendRepr::Uninhabited`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_abi/enum.BackendRepr.html) variant
* Instead calculate the `BackendRepr` of uninhabited types "normally" (as though they were not uninhabited "at the top level", but still considering inhabitedness of variants to determine enum layout, etc)
* Add an `uninhabited: bool` field to [`rustc_abi::LayoutData`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_abi/struct.LayoutData.html) so inhabitedness of a `LayoutData` can still be queried when necessary (e.g. when determining if an enum variant needs a tag value allocated to it).
This should not affect type layouts (size/align/field offset); this should only affect function call ABI, and only of uninhabited types.
cc ``@RalfJung``
Create a generic AVR target: avr-none
This commit removes the `avr-unknown-gnu-atmega328` target and replaces it with a more generic `avr-none` variant that must be specialized using `-C target-cpu` (e.g. `-C target-cpu=atmega328p`).
Seizing the day, I'm adding myself as the maintainer of this target - I've been already fixing the bugs anyway, might as well make it official 🙂
Related discussions:
- https://github.com/rust-lang/rust/pull/131171
- https://github.com/rust-lang/compiler-team/issues/800
try-job: x86_64-gnu-debug
Fix codegen of uninhabited PassMode::Indirect return types.
Add codegen test for uninhabited PassMode::Indirect return types.
Enable optimizations for uninhabited return type codegen test
Emit `trunc nuw` for unchecked shifts and `to_immediate_scalar`
- For shifts this shrinks the IR by no longer needing an `assume` while still providing the UB information
- Having this on the `i8`→`i1` truncations will hopefully help with some places that have to load `i8`s or pass those in LLVM structs without range information
- For shifts this shrinks the IR by no longer needing an `assume` while still providing the UB information
- Having this on the `i8`→`i1` truncations will hopefully help with some places that have to load `i8`s or pass those in LLVM structs without range information
This commit removes the `avr-unknown-gnu-atmega328` target and replaces
it with a more generic `avr-none` variant that must be specialized with
the `-C target-cpu` flag (e.g. `-C target-cpu=atmega328p`).
x86: use SSE2 to pass float and SIMD types
This builds on the new X86Sse2 ABI landed in https://github.com/rust-lang/rust/pull/137037 to actually make it a separate ABI from the default x86 ABI, and use SSE2 registers. Specifically, we use it in two ways: to return `f64` values in a register rather than by-ptr, and to pass vectors of size up to 128bit in a register (or, well, whatever LLVM does when passing `<4 x float>` by-val, I don't actually know if this ends up in a register).
Cc `@workingjubilee`
Fixes#133611
try-job: aarch64-apple
try-job: aarch64-gnu
try-job: aarch64-gnu-debug
try-job: test-various
try-job: x86_64-gnu-nopt
try-job: dist-i586-gnu-i586-i686-musl
try-job: x86_64-msvc-1
improve cold_path()
#120370 added a new instrinsic `cold_path()` and used it to fix `likely` and `unlikely`
However, in order to limit scope, the information about cold code paths is only used in 2-target switch instructions. This is sufficient for `likely` and `unlikely`, but limits usefulness of `cold_path` for idiomatic rust. For example, code like this:
```
if let Some(x) = y { ... }
```
may generate 3-target switch:
```
switch y.discriminator:
0 => true branch
1 = > false branch
_ => unreachable
```
and therefore marking a branch as cold will have no effect.
This PR improves `cold_path()` to work with arbitrary switch instructions.
Note that for 2-target switches, we can use `llvm.expect`, but for multiple targets we need to manually emit branch weights. I checked Clang and it also emits weights in this situation. The Clang's weight calculation is more complex that this PR, which I believe is mainly because `switch` in `C/C++` can have multiple cases going to the same target.
llvm: Tolerate captures in tests
llvm/llvm-project@7e3735d1a1 introduces `captures` annotations. Adjust regexes to be tolerant of these.
`@rustbot` label:+llvm-main
Set both `nuw` and `nsw` in slice size calculation
There's an old note in the code to do this, and now that [LLVM-C has an API for it](f0b8ff1251/llvm/include/llvm-c/Core.h (L4403-L4408)), we might as well. And it's been there since what looks like LLVM 17 de9b6aa341 so doesn't even need to be conditional.
(There's other places, like `RawVecInner` or `Layout`, that might want to do things like this too, but I'll leave those for a future PR.)
`transmute` should also assume non-null pointers
Previously it only did integer-ABI things, but this way it does data pointers too. That gives more information in general to the backend, and allows slightly simplifying one of the helpers in slice iterators.
debuginfo: Set bitwidth appropriately in enum variant tags
Previously, we unconditionally set the bitwidth to 128-bits, the largest an enum would possibly be. Then, LLVM would cut down the constant by chopping off leading zeroes before emitting the DWARF. LLVM only supported 64-bit enumerators, so this would also have occasionally resulted in truncated data.
LLVM added support for 128-bit enumerators in llvm/llvm-project#125578
That patchset trusts the constant to describe how wide the variant tag is, so the high 64-bits of zeros are considered potentially load-bearing.
As a result, we went from emitting tags that looked like:
DW_AT_discr_value (0xfe)
(because `dwarf::BestForm` selected `data1`)
to emitting tags that looked like:
DW_AT_discr_value (<0x10> fe ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 )
This makes the `DW_AT_discr_value` encode at the bitwidth of the tag, which:
1. Is probably closer to our intentions in terms of describing the data.
2. Doesn't invoke the 128-bit support which may not be supported by all debuggers / downstream tools.
3. Will result in smaller debug information.
Previously it only did integer-ABI things, but this way it does data pointers too. That gives more information in general to the backend, and allows slightly simplifying one of the helpers in slice iterators.
Mark condition/carry bit as clobbered in C-SKY inline assembly
C-SKY's compare and some arithmetic/logical instructions modify condition/carry bit (C) in PSR, but there is currently no way to mark it as clobbered in `asm!`.
This PR marks it as clobbered except when [`options(preserves_flags)`](https://doc.rust-lang.org/reference/inline-assembly.html#r-asm.options.supported-options.preserves_flags) is used.
Refs:
- Section 1.3 "Programming model" and Section 1.3.5 "Condition/carry bit" in CSKY Architecture user_guide:
9f7121f7d4/CSKY%20Architecture%20user_guide.pdf
> Under user mode, condition/carry bit (C) is located in the lowest bit of PSR, and it can be
accessed and changed by common user instructions. It is the only data bit that can be visited
under user mode in PSR.
> Condition or carry bit represents the result after one operation. Condition/carry bit can be
clearly set according to the results of compare instructions or unclearly set as some
high-precision arithmetic or logical instructions. In addition, special instructions such as
DEC[GT,LT,NE] and XTRB[0-3] will influence the value of condition/carry bit.
- Register definition in LLVM:
https://github.com/llvm/llvm-project/blob/llvmorg-19.1.0/llvm/lib/Target/CSKY/CSKYRegisterInfo.td#L88
cc ```@Dirreke``` ([target maintainer](aa6f5ab18e/src/doc/rustc/src/platform-support/csky-unknown-linux-gnuabiv2.md (target-maintainers)))
r? ```@Amanieu```
```@rustbot``` label +O-csky +A-inline-assembly
Cast allocas to default address space
Pointers for variables all need to be in the same address space for correct compilation. Therefore ensure that even if an `alloca` is created in a different address space, it is casted to the default address space before its value is used.
This is necessary for the amdgpu target and others where the default address space for `alloca`s is not 0.
For example the following code compiles incorrectly when not casting the address space to the default one:
```rust
fn f(p: *const i8 /* addrspace(0) */) -> *const i8 /* addrspace(0) */ {
let local = 0i8; /* addrspace(5) */
let res = if cond { p } else { &raw const local };
res
}
```
results in
```llvm
%local = alloca addrspace(5) i8
%res = alloca addrspace(5) ptr
if:
; Store 64-bit flat pointer
store ptr %p, ptr addrspace(5) %res
else:
; Store 32-bit scratch pointer
store ptr addrspace(5) %local, ptr addrspace(5) %res
ret:
; Load and return 64-bit flat pointer
%res.load = load ptr, ptr addrspace(5) %res
ret ptr %res.load
```
For amdgpu, `addrspace(0)` are 64-bit pointers, `addrspace(5)` are 32-bit pointers.
The above code may store a 32-bit pointer and read it back as a 64-bit pointer, which is obviously wrong and cannot work. Instead, we need to `addrspacecast %local to ptr addrspace(0)`, then we store and load the correct type.
Tracking issue: #135024