1164 Commits

Author SHA1 Message Date
Scott McMurray
4207c786e7 PR feedback 2025-04-09 21:44:59 -07:00
Scott McMurray
63dcac8423 skip tests/codegen/swap-small-types when debug assertions are on
In `swap_nonoverlapping_short` there's a new `debug_assert!`, and if that's enabled then the `alloca`s don't optimize out.
2025-04-09 10:44:49 -07:00
Scott McMurray
50d0ce1b42 Ensure swap_nonoverlapping is really always untyped 2025-04-09 09:09:37 -07:00
lincot
09d5bcf1ad Speed up String::push and String::insert
Improve performance of `String` methods by avoiding unnecessary memcpy
for the character bytes, with added codegen check to ensure compliance.
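
A minimal sketch of the idea, not the actual `String` implementation: rather than encoding the character into a temporary buffer and then copying it, make room in the destination and encode straight into it. The helper name `push_char` and the `Vec<u8>` stand-in are assumptions for illustration; this safe version zero-fills the new bytes first, which the real change need not do.

```rust
/// Hypothetical helper: append `c` to `buf` as UTF-8 without a temporary array.
fn push_char(buf: &mut Vec<u8>, c: char) {
    let old_len = buf.len();
    // Grow by exactly the encoded width (1 to 4 bytes), zero-filled for now.
    buf.resize(old_len + c.len_utf8(), 0);
    // Encode directly into the freshly added tail of the destination.
    c.encode_utf8(&mut buf[old_len..]);
}

fn main() {
    let mut buf = Vec::new();
    for c in "héllo, wörld".chars() {
        push_char(&mut buf, c);
    }
    assert_eq!(String::from_utf8(buf).unwrap(), "héllo, wörld");
}
```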
2025-04-09 13:06:10 +03:00
Stuart Cook
7ffa56c3a3 Rollup merge of #139098 - scottmcm:assert-impossible-tags, r=WaffleLapkin
Tell LLVM about impossible niche tags

I was trying to find a better way of emitting discriminant calculations, but sadly had no luck.

So here's a fairly small PR with the bits that did seem worth bothering:

1. As the [`TagEncoding::Niche` docs](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_abi/enum.TagEncoding.html#variant.Niche) describe, it's possible to end up with a dead value in the input that's not already communicated via the range parameter attribute nor the range load metadata attribute.  So this adds an `llvm.assume` in non-debug mode to tell LLVM about that.  (That way it can tell that the sides of the `select` have disjoint possible values.)

2. I'd written a bunch more tests, or at least made them parameterized, in the process of trying things out, so this checks in those tests to hopefully help future people not trip on the same weird edge cases, like when the tag type is `i8` but yet there's still a variant index and discriminant of `258` which doesn't fit in that tag type because the enum is really weird.
2025-04-08 20:55:03 +10:00
Scott McMurray
502f7f9c24 Address PR feedback 2025-04-07 18:12:06 -07:00
Ralf Jung
2678d04dd9 mitigate MSVC unsoundness by not emitting alignment attributes on win32-msvc targets
also mention the MSVC alignment issue in platform-support.md
2025-04-07 23:30:55 +02:00
Santiago Pastorino
20f93c9e8e Add codegen test to be sure we get rid of unneeded clones after monomorphization 2025-04-07 16:53:11 -03:00
Stuart Cook
5863b426b9 Rollup merge of #139465 - EnzymeAD:autodiff-sret, r=oli-obk
add sret handling for scalar autodiff

r? `@oli-obk`

Fixing one of the todo's which I left in my previous batching PR.
This one handles sret for scalar autodiff.  `sret` mostly shows up when we try to return a lot of scalar floats.
People often start testing autodiff with toy functions which just use a few scalars as inputs and outputs, and those were the most likely to be affected by this issue. So this fix should make learning/teaching hopefully a bit easier.

Tracking:

- https://github.com/rust-lang/rust/issues/124509
2025-04-07 22:29:21 +10:00
Manuel Drehwald
ca5bea3ebb move old tests, add sret test 2025-04-07 07:11:52 -04:00
Eduard Stefes
15e1a6676c Use -C target-cpu=z13 on s390x vector test
The default s390x cpu(z10) does not have vector support. Setting
target-cpu at least to z13 enables vectorisation for s390x architecture
and makes the tests pass.
2025-04-07 09:36:56 +02:00
Bennet Bleßmann
7dd57f085c update/bless tests 2025-04-06 21:41:47 +02:00
Stuart Cook
f4aa209e20 Rollup merge of #139438 - Zalathar:fix-test-122600, r=scottmcm
Prevent a test from seeing forbidden numbers in the rustc version

The final CHECK-NOT directive in this test was able to see past the end of the enclosing function, and find the substring `753` or `754` in the git hash in the rustc version number, causing false failures in CI whenever the git hash happens to contain those digits in sequence.

Adding an explicit check for `ret` prevents the CHECK-NOT directive from seeing past the end of the function.
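
As a rough illustration of the pattern (the function and the forbidden text here are made up, not taken from the affected test), a positive `CHECK` placed after a `CHECK-NOT` bounds the region FileCheck scans for the forbidden string:

```rust
// Hypothetical codegen-test sketch; only the directive layout matters.

// CHECK-LABEL: @double_it
#[unsafe(no_mangle)]
pub fn double_it(x: u32) -> u32 {
    // CHECK-NOT: panic
    // CHECK: ret
    // Without the `CHECK: ret` line, the CHECK-NOT would cover everything
    // up to the end of the output, including unrelated trailing text.
    x.wrapping_mul(2)
}

fn main() {
    assert_eq!(double_it(21), 42);
}
```

The directives are ordinary comments to rustc, so the file still compiles and runs as a normal program.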

---

Manually tested by adding `// CHECK-NOT: rustc` after the existing CHECK-NOT directives, and demonstrating that the new check prevents it from seeing the rustc version string.
2025-04-06 16:21:03 +10:00
Scott McMurray
51e67e21cf LLVM18 compatibility fixes in the tests 2025-04-05 19:54:50 -07:00
Scott McMurray
1f06a6a252 Tell LLVM about impossible niche tags 2025-04-05 19:54:47 -07:00
Zalathar
f6afb35c61 Prevent a test from seeing forbidden numbers in the rustc version
The final CHECK-NOT directive in this test was able to see past the end of the
enclosing function, and find the substring 753 or 754 in the git hash in the
rustc version number, causing false failures in CI.

Adding an explicit check for `ret` prevents the CHECK-NOT directive from seeing
past the end of the function.
2025-04-06 12:38:20 +10:00
Scott McMurray
e30cb329d8 Polymorphize array::IntoIter's iterator impl 2025-04-05 17:55:24 -07:00
Josh Stone
12167d7064 Update the minimum external LLVM to 19 2025-04-05 11:44:38 -07:00
Matthias Krüger
543160dd62 Rollup merge of #138368 - rcvalle:rust-kcfi-arity, r=davidtwco
KCFI: Add KCFI arity indicator support

Adds KCFI arity indicator support to the Rust compiler (see https://github.com/rust-lang/rust/issues/138311, https://github.com/llvm/llvm-project/pull/121070, and https://lore.kernel.org/lkml/CANiq72=3ghFxy8E=AU9p+0imFxKr5iU3sd0hVUXed5BA+KjdNQ@mail.gmail.com/).
2025-04-05 10:18:03 +02:00
Ramon de C Valle
a98546b961 KCFI: Add KCFI arity indicator support
Adds KCFI arity indicator support to the Rust compiler (see rust-lang/rust#138311,
https://github.com/llvm/llvm-project/pull/121070, and
https://lore.kernel.org/lkml/CANiq72=3ghFxy8E=AU9p+0imFxKr5iU3sd0hVUXed5BA+KjdNQ@mail.gmail.com/).
2025-04-05 04:05:04 +00:00
Stuart Cook
a038028eca Rollup merge of #138024 - reitermarkus:unicode-panic-optimization, r=ibraheemdev
Allow optimizing out `panic_bounds_check` in Unicode checks.

Allow optimizing out `panic_bounds_check` in Unicode checks.

For context, see https://github.com/japaric/ufmt/issues/52#issuecomment-2699207241.
2025-04-05 13:18:14 +11:00
Stuart Cook
c6bf3a01ef Rollup merge of #137880 - EnzymeAD:autodiff-batching, r=oli-obk
Autodiff batching

Enzyme supports batching, which is especially known from the ML side when training neural networks.
There we would normally have a training loop, where in each iteration we would pass in some data (e.g. an image), and a target vector. Based on how close we are with our prediction we compute our loss, and then use backpropagation to compute the gradients and update our weights.
That's quite inefficient, so what you normally do is pass in a batch of 8/16/.. images and targets, and compute the gradients for those all at once, allowing better optimizations.

Enzyme supports batching in two ways, the first one (which I implemented here) just accepts a Batch size,
and then each Dual/Duplicated argument has not one, but N shadow arguments.  So instead of
```rs
for i in 0..100 {
   df(x[i], y[i], 1234);
}
```
You can now do
```rs
for i in (0..100).step_by(4) {
   df(x[i+0],x[i+1],x[i+2],x[i+3], y[i+0], y[i+1], y[i+2], y[i+3], 1234);
}
```
which will give the same results, but allows better compiler optimizations. See the testcase for details.

There is a second variant, where we can mark certain arguments and instead of having to pass in N shadow arguments, Enzyme assumes that the argument is N times longer. I.e. instead of accepting 4 slices with 12 floats each, we would accept one slice with 48 floats. I'll implement this over the next days.

I will also add more tests for both modes.

For anyone preferring some more interactive explanation, here's a video of Tim's llvm dev talk, where he presents his work. https://www.youtube.com/watch?v=edvaLAL5RqU
I'll also add some other docs to the dev guide and user docs in another PR.

r? ghost

Tracking:

- https://github.com/rust-lang/rust/issues/124509
- https://github.com/rust-lang/rust/issues/135283
2025-04-05 13:18:13 +11:00
Stuart Cook
2e4e196a5b Rollup merge of #136457 - calder:master, r=tgross35
Expose algebraic floating point intrinsics

# Problem

A stable Rust implementation of a simple dot product is 8x slower than C++ on modern x86-64 CPUs. The root cause is an inability to let the compiler reorder floating point operations for better vectorization.

See https://github.com/calder/dot-bench for benchmarks. Measurements below were performed on a i7-10875H.

### C++: 10us 

With Clang 18.1.3 and `-O2 -march=haswell`:
<table>
<tr>
    <th>C++</th>
    <th>Assembly</th>
</tr>
<tr>
<td>
<pre lang="cc">
float dot(float *a, float *b, size_t len) {
    #pragma clang fp reassociate(on)
    float sum = 0.0;
    for (size_t i = 0; i < len; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
</pre>
</td>
<td>
<img src="https://github.com/user-attachments/assets/739573c0-380a-4d84-9fd9-141343ce7e68" />
</td>
</tr>
</table>

### Nightly Rust: 10us 

With rustc 1.86.0-nightly (8239a37f9) and `-C opt-level=3 -C target-feature=+avx2,+fma`:
<table>
<tr>
    <th>Rust</th>
    <th>Assembly</th>
</tr>
<tr>
<td>
<pre lang="rust">
fn dot(a: &[f32], b: &[f32]) -> f32 {
    let mut sum = 0.0;
    for i in 0..a.len() {
        sum = fadd_algebraic(sum, fmul_algebraic(a[i], b[i]));
    }
    sum
}
</pre>
</td>
<td>
<img src="https://github.com/user-attachments/assets/9dcf953a-2cd7-42f3-bc34-7117de4c5fb9" />
</td>
</tr>
</table>

### Stable Rust: 84us 

With rustc 1.84.1 (e71f9a9a9) and `-C opt-level=3 -C target-feature=+avx2,+fma`:
<table>
<tr>
    <th>Rust</th>
    <th>Assembly</th>
</tr>
<tr>
<td>
<pre lang="rust">
fn dot(a: &[f32], b: &[f32]) -> f32 {
    let mut sum = 0.0;
    for i in 0..a.len() {
        sum += a[i] * b[i];
    }
    sum
}
</pre>
</td>
<td>
<img src="https://github.com/user-attachments/assets/936a1f7e-33e4-4ff8-a732-c3cdfe068dca" />
</td>
</tr>
</table>
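
To make "reordering" concrete, here is a sketch in stable Rust of the kind of reassociation the compiler is not allowed to do on its own, written out by hand. The helper names and the choice of 8 accumulators are assumptions for illustration; this is not the code from the benchmark. In general the two functions can round differently, which is exactly why the optimizer needs explicit permission.

```rust
/// Strict left-to-right sum: every add depends on the previous one.
fn dot_sequential(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Hand-reassociated sum with 8 independent accumulators (hypothetical helper).
fn dot_reassociated(a: &[f32], b: &[f32]) -> f32 {
    const LANES: usize = 8;
    assert_eq!(a.len(), b.len());
    let mut acc = [0.0f32; LANES];
    let mut a_chunks = a.chunks_exact(LANES);
    let mut b_chunks = b.chunks_exact(LANES);
    for (ca, cb) in a_chunks.by_ref().zip(b_chunks.by_ref()) {
        // Each lane accumulates independently, so the adds can run in parallel.
        for i in 0..LANES {
            acc[i] += ca[i] * cb[i];
        }
    }
    // Elements that did not fill a whole chunk are summed separately.
    let tail: f32 = a_chunks
        .remainder()
        .iter()
        .zip(b_chunks.remainder())
        .map(|(x, y)| x * y)
        .sum();
    acc.iter().sum::<f32>() + tail
}

fn main() {
    let a: Vec<f32> = (1..=20).map(|i| i as f32).collect();
    let b = vec![2.0f32; 20];
    // Small integer-valued inputs are exact in f32, so both orders agree here.
    assert_eq!(dot_sequential(&a, &b), dot_reassociated(&a, &b));
}
```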

# Proposed Change

Add `core::intrinsics::f*_algebraic` wrappers to `f16`, `f32`, `f64`, and `f128` gated on a new `float_algebraic` feature.

# Alternatives Considered

https://github.com/rust-lang/rust/issues/21690 has a lot of good discussion of various options for supporting fast math in Rust, but is still open a decade later because any choice that opts in more than individual operations is ultimately contrary to Rust's design principles.

In the meantime, processors have evolved and we're leaving major performance on the table by not supporting vectorization. We shouldn't make users choose between an unstable compiler and an 8x performance hit.

# References

* https://github.com/rust-lang/rust/issues/21690
* https://github.com/rust-lang/libs-team/issues/532
* https://github.com/rust-lang/rust/issues/136469
* https://github.com/calder/dot-bench
* https://www.felixcloutier.com/x86/vfmadd132ps:vfmadd213ps:vfmadd231ps

try-job: x86_64-gnu-nopt
try-job: x86_64-gnu-aux
2025-04-05 13:18:12 +11:00
Calder Coalson
8ff70529f2 Expose algebraic floating point intrinsics 2025-04-04 16:13:57 -07:00
Manuel Drehwald
79e17bc71e add new tests for autodiff batching and update old ones 2025-04-04 14:24:46 -04:00
bors
00095b3da4 Auto merge of #132527 - DianQK:gvn-stmt-iter, r=oli-obk
gvn: Invalid dereferences for all non-local mutations

Fixes #132353.

This PR stops computing values by traversing SSA locals through `for_each_assignment_mut`.

Because the `for_each_assignment_mut` traversal skips statements which have side effects, such as dereference assignments, the computation may be unsound. Instead of `for_each_assignment_mut`, we compute values by traversing in reverse postorder.

Because we compute and use the symbolic representation of values on the fly, I invalidate all old values when encountering a dereference assignment. The current approach does not prevent the optimization of a clone to a copy.

In the future, we may add an alias model, or dominance information for dereference assignments, or SSA form to help GVN.

r? cjgillot

cc `@jieyouxu` #132356
cc `@RalfJung` #133474
2025-04-03 19:17:33 +00:00
dianqk
ac7dd7a1b3 Remove unsound-mir-opts for simplify_aggregate_to_copy 2025-04-03 21:59:43 +08:00
Matthias Krüger
e332aa89a7 Rollup merge of #139145 - okaneco:safe_splits, r=Amanieu
slice: Remove some uses of unsafe in first/last chunk methods

Remove unsafe `split_at_unchecked` and `split_at_mut_unchecked` in some slice `split_first_chunk`/`split_last_chunk` methods.
Replace those calls with the safe `split_at` and `split_at_checked` where applicable.

Add codegen tests to check for no panics when calculating the last chunk index using `checked_sub` and `split_at`.

Better viewed with whitespace disabled in diff view

---

The unchecked calls are mostly manual implementations of the safe methods, but with the safety condition negated from `mid <= len` to `len < mid`.
```rust
if self.len() < N {
    None
} else {
    // SAFETY: We manually verified the bounds of the split.
    let (first, tail) = unsafe { self.split_at_unchecked(N) };
    // Or for the last_chunk methods
    let (init, last) = unsafe { self.split_at_unchecked(self.len() - N) };
```

Unsafe is still needed for the pointer array casts. Their safety comments are unmodified.
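
For comparison, a sketch of the safe shape the description refers to. The function names are hypothetical stand-ins that return plain slices, so they skip the pointer-to-array cast that still needs `unsafe` in the real methods.

```rust
/// Hypothetical stand-in for the "first chunk" split, using only safe calls.
fn split_first_n<const N: usize>(s: &[u8]) -> Option<(&[u8], &[u8])> {
    // `split_at_checked` returns `None` instead of panicking when N > len.
    s.split_at_checked(N)
}

/// Hypothetical stand-in for the "last chunk" split.
fn split_last_n<const N: usize>(s: &[u8]) -> Option<(&[u8], &[u8])> {
    // `checked_sub` yields `None` on underflow, so `index <= len` always holds
    // and the `split_at` below cannot panic.
    let index = s.len().checked_sub(N)?;
    Some(s.split_at(index))
}

fn main() {
    let data = [1u8, 2, 3];
    assert_eq!(split_first_n::<2>(&data), Some((&[1u8, 2][..], &[3u8][..])));
    assert_eq!(split_last_n::<2>(&data), Some((&[1u8][..], &[2u8, 3][..])));
    assert_eq!(split_last_n::<4>(&data), None);
}
```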
2025-04-03 07:39:05 +02:00
Matthias Krüger
2a557ec9b8 Rollup merge of #139188 - durin42:llvm-21-LintPass, r=dianqk
PassWrapper: adapt for llvm/llvm-project@94122d58fc77079a291a3d008914006cb509d9db

We also have to remove the LLVM argument in cast-target-abi.rs for LLVM
21. I'm not really sure what the best approach here is since that test already uses revisions. We could also fork the test into a copy for LLVM 19-20 and another for LLVM 21, but what I did for now was drop the lint-abort-on-error flag to LLVM figuring that some coverage was better than none, but I'm happy to change this if that was a bad direction.

r? dianqk
@rustbot label llvm-main
2025-04-01 20:25:23 +02:00
Augie Fackler
b14a0ce7f6 PassWrapper: adapt for llvm/llvm-project@94122d58fc
We also have to remove the LLVM argument in cast-target-abi.rs for LLVM
21. I'm not really sure what the best approach here is since that test
already uses revisions. We could also fork the test into a copy for LLVM
19-20 and another for LLVM 21, but what I did for now was drop the
lint-abort-on-error flag to LLVM figuring that some coverage was better
than none, but I'm happy to change this if that was a bad direction.

The above also applies for ffi-out-of-bounds-loads.rs.

r? dianqk
@rustbot label llvm-main
2025-03-31 15:47:26 -04:00
reez12g
dea9472127 Add tests for LLVM 20 slice bounds check optimization 2025-03-31 22:38:53 +09:00
Scott McMurray
19648ce5cd codegen test for non-memcmp array comparison 2025-03-30 23:44:31 -07:00
okaneco
59ca7679c7 slice: Remove some uses of unsafe in first/last chunk methods
Remove unsafe `split_at_unchecked` and `split_at_mut_unchecked`
in some slice `split_first_chunk`/`split_last_chunk` methods.
Replace those calls with the safe `split_at` and `split_at_checked` where
applicable.

Add codegen tests to check for no panics when calculating the last
chunk index using `checked_sub` and `split_at`
2025-03-30 12:45:04 -04:00
bors
2a06022951 Auto merge of #138503 - bjorn3:string_merging, r=tmiasko
Avoid wrapping constant allocations in packed structs when not necessary

This way LLVM will set the string merging flag if the alloc is a nul terminated string, reducing binary sizes.

try-job: armhf-gnu
2025-03-28 10:18:32 +00:00
bjorn3
a5fa12b6b9 Avoid wrapping constant allocations in packed structs when not necessary
This way LLVM will set the string merging flag if the alloc is a nul
terminated string, reducing binary sizes.
2025-03-28 09:19:57 +00:00
Stuart Cook
ff325e0a00 Rollup merge of #138818 - khuey:138198, r=jieyouxu
Don't produce debug information for compiler-introduced-vars when desugaring assignments.

An assignment such as

(a, b) = (b, c);

desugars to the HIR

{ let (lhs, lhs) = (b, c); a = lhs; b = lhs; };

The repeated `lhs` leads to multiple Locals assigned to the same DILocalVariable. Rather than attempting to fix that, get rid of the debug info for these bindings that don't even exist in the program to begin with.
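
For readers unfamiliar with the surface syntax, a small runnable example of such a destructuring assignment (the function name `shift` is made up for illustration). The right-hand side is evaluated in full before either target is written:

```rust
/// Hypothetical helper: assigns to existing bindings with a tuple pattern.
#[allow(unused_assignments)]
fn shift(mut a: i32, mut b: i32, c: i32) -> (i32, i32) {
    // No `let` here: `a` and `b` already exist and are overwritten.
    (a, b) = (b, c);
    (a, b)
}

fn main() {
    assert_eq!(shift(1, 2, 3), (2, 3));
}
```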

Fixes #138198

r? `@jieyouxu`
2025-03-26 19:40:28 +11:00
bors
e61403aa4c Auto merge of #138634 - saethlin:repeated-uninit, r=scottmcm,oli-obk
Lower to a memset(undef) when Rvalue::Repeat repeats uninit

Fixes https://github.com/rust-lang/rust/issues/138625.

It is technically correct to just do nothing. But if we actually do nothing, we may miss that this is de-initializing something, so instead we just lower to a single memset that writes undef. This is still superior to the memcpy loop, in both quality of code we hand to the backend and LLVM's final output.
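
A sketch of the kind of source that can produce a repeat of an uninitialized value; whether a given build actually reaches the new lowering depends on optimization level and is an assumption here. The function name `fill` is hypothetical.

```rust
use std::mem::MaybeUninit;

/// Hypothetical helper: start from an uninit array, then initialize it.
fn fill(byte: u8) -> [u8; 64] {
    // Array-repeat expression whose element is uninitialized memory.
    let mut buf = [MaybeUninit::<u8>::uninit(); 64];
    for slot in buf.iter_mut() {
        slot.write(byte);
    }
    // SAFETY: every element was written in the loop above.
    buf.map(|slot| unsafe { slot.assume_init() })
}

fn main() {
    assert_eq!(fill(7), [7u8; 64]);
}
```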
2025-03-25 02:09:15 +00:00
bors
1df5affaca Auto merge of #133984 - DaniPopes:scmp-ucmp, r=scottmcm
Lower BinOp::Cmp to llvm.{s,u}cmp.* intrinsics

Lowers `mir::BinOp::Cmp` (`three_way_compare` intrinsic) to the corresponding LLVM `llvm.{s,u}cmp.i8.*` intrinsics.

These are the intrinsics mentioned in https://github.com/rust-lang/rust/pull/118310, which are now available in LLVM 19.
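
At the source level, the operation in question is the ordinary three-way comparison. The sketch below (helper names are made up) shows the signed and unsigned flavors and the `-1`/`0`/`1` result that `Ordering` carries; whether a particular call lowers to these intrinsics is up to the compiler.

```rust
use std::cmp::Ordering;

/// Hypothetical helper: signed three-way compare as -1, 0, or 1.
fn scmp(a: i32, b: i32) -> i8 {
    // `Ordering` is a fieldless enum with Less = -1, Equal = 0, Greater = 1.
    a.cmp(&b) as i8
}

/// Hypothetical helper: unsigned three-way compare as -1, 0, or 1.
fn ucmp(a: u32, b: u32) -> i8 {
    a.cmp(&b) as i8
}

fn main() {
    assert_eq!(scmp(-1, 1), Ordering::Less as i8);
    // The same bit pattern compares differently when treated as unsigned.
    assert_eq!(ucmp(-1i32 as u32, 1), Ordering::Greater as i8);
}
```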

I couldn't find any follow-up PRs/discussions about this, please let me know if I missed something.

r? `@scottmcm`
2025-03-24 22:53:12 +00:00
Kyle Huey
8cab8e07bc Don't produce debug information for compiler-introduced-vars when desugaring assignments.
An assignment such as

(a, b) = (b, c);

desugars to the HIR

{ let (lhs, lhs) = (b, c); a = lhs; b = lhs; };

The repeated `lhs` leads to multiple Locals assigned to the same DILocalVariable. Rather than
attempting to fix that, get rid of the debug info for these bindings that don't even exist
in the program to begin with.

Fixes #138198
2025-03-21 17:34:45 -07:00
Ben Kimock
8e7d8ddffe Lower to a memset(undef) when Rvalue::Repeat repeats uninit 2025-03-19 23:57:49 -04:00
Jesus Checa Hidalgo
20432c9eee Use explicit cpu in some asm and codegen tests.
Some tests expect to be compiled for a specific CPU or require certain
target features to be present (or absent). These tests work fine with
default CPUs but fail in downstream builds for RHEL and Fedora, where
we use non-default CPUs such as z13 on s390x, pwr9 on ppc64le, or
x86-64-v2/x86-64-v3 on x86_64.
2025-03-19 19:45:46 +01:00
bors
493c38ba37 Auto merge of #127173 - bjorn3:mangle_rustc_std_internal_symbol, r=wesleywiser,jieyouxu
Mangle rustc_std_internal_symbols functions

This reduces the risk of issues when using a staticlib or rust dylib compiled with a different rustc version in a rust program. Currently this will either (in the case of staticlib) cause a linker error due to duplicate symbol definitions, or (in the case of rust dylibs) cause rustc_std_internal_symbols functions to be silently overridden. As rust gets more commonly used inside the implementation of libraries consumed with a C interface (like SpiderMonkey, Ruby YJIT (currently has to do partial linking of all rust code to hide all symbols not part of the C api), the Rusticl OpenCL implementation in mesa) this is becoming much more of an issue. With this PR the only symbols remaining with an unmangled name are rust_eh_personality (LLVM doesn't allow renaming it) and `__rust_no_alloc_shim_is_unstable`.

Helps mitigate https://github.com/rust-lang/rust/issues/104707

try-job: aarch64-gnu-debug
try-job: aarch64-apple
try-job: x86_64-apple-1
try-job: x86_64-mingw-1
try-job: i686-mingw-1
try-job: x86_64-msvc-1
try-job: i686-msvc-1
try-job: test-various
try-job: armhf-gnu
2025-03-17 22:16:22 +00:00
Matthias Krüger
8f5c09b37c Rollup merge of #138349 - 1c3t3a:external-weak-cfi, r=rcvalle
Emit function declarations for functions with `#[linkage="extern_weak"]`

Currently, when declaring an extern weak function in Rust, we use the following syntax:
```rust
unsafe extern "C" {
   #[linkage = "extern_weak"]
   static FOO: Option<unsafe extern "C" fn() -> ()>;
}
```
This allows runtime-checking the extern weak symbol through the Option.

When emitting LLVM-IR, the Rust compiler currently emits this static as an i8, and a pointer that is initialized with the value of the global i8 and represents the nullability, e.g.
```
@FOO = extern_weak global i8
@_rust_extern_with_linkage_FOO = internal global ptr @FOO
```

This approach does not work well with CFI, where we need to attach CFI metadata to a concrete function declaration, which was pointed out in https://github.com/rust-lang/rust/issues/115199.

This change switches to emitting a proper function declaration instead of a global i8. This allows CFI to work for extern_weak functions. Example:
```
@_rust_extern_with_linkage_FOO = internal global ptr @FOO
...
declare !type !61 !type !62 !type !63 !type !64 extern_weak void @FOO(double) unnamed_addr #6
```

We keep initializing the Rust internal symbol with the function declaration, which preserves the correct behavior for runtime checking the Option.

r? `@rcvalle`

cc `@jakos-sec`

try-job: test-various
2025-03-17 16:34:50 +01:00
bjorn3
b754ef727c Remove implicit #[no_mangle] for #[rustc_std_internal_symbol] 2025-03-17 14:08:09 +00:00
Gary Guo
292c622507 Stabilize asm_goto 2025-03-17 11:12:10 +00:00
Bastian Kersting
b30cf11b96 Emit function declarations for functions with #[linkage="extern_weak"]
Currently, when declaring an extern weak function in Rust, we use the
following syntax:
```rust
unsafe extern "C" {
   #[linkage = "extern_weak"]
   static FOO: Option<unsafe extern "C" fn() -> ()>;
}
```
This allows runtime-checking the extern weak symbol through the Option.

When emitting LLVM-IR, the Rust compiler currently emits this static
as an i8, and a pointer that is initialized with the value of the global
i8 and represents the nullability, e.g.
```
@FOO = extern_weak global i8
@_rust_extern_with_linkage_FOO = internal global ptr @FOO
```

This approach does not work well with CFI, where we need to attach CFI
metadata to a concrete function declaration, which was pointed out in
https://github.com/rust-lang/rust/issues/115199.

This change switches to emitting a proper function declaration instead
of a global i8. This allows CFI to work for extern_weak functions.

We keep initializing the Rust internal symbol with the function
declaration, which preserves the correct behavior for runtime checking
the Option.

Co-authored-by: Jakob Koschel <jakobkoschel@google.com>
2025-03-17 08:27:53 +00:00
bors
5f3b84a421 Auto merge of #137278 - heiseish:101210-extra-codegen-tests, r=scottmcm
added some new test to check for result and options opt

Apologies for the delay. Finally have some time to get back into contributing.

## Context
- Added some tests to show optimization on result and options for 64 and 128 bits
- Relevant issue https://github.com/rust-lang/rust/issues/101210

## Some newb questions from me
- [x] My local llvm IR has `nuw` in `result_nop_match_128` etc whereas [godbolt version](https://rust.godbolt.org/z/Td9zoT5zn) doesn't have. So I put optional there, but not sure if it's desirable (maybe I'm not using the compiled llvm in the repo). I ran the test with
```bash
./x test tests/codegen/try_question_mark_nop.rs
```
- [x] Unless I'm reading it wrongly, `option_nop_match_128` and `option_nop_traits_128` look to be **not** optimized away?

Update:
Here's the test for future reference
```rust
// CHECK-LABEL: @option_nop_match_128
#[no_mangle]
pub fn option_nop_match_128(x: Option<i128>) -> Option<i128> {
    // CHECK: start:
    // CHECK-NEXT: %trunc = trunc nuw i128 %0 to i1
    // CHECK-NEXT: br i1 %trunc, label %bb3, label %bb4
    // CHECK: bb3:
    // CHECK-NEXT: %2 = getelementptr inbounds {{(nuw )?}}i8, ptr %_0, i64 16
    // CHECK-NEXT: store i128 %1, ptr %2, align 16
    // CHECK: bb4:
    // CHECK-NEXT: %storemerge = phi i128 [ 1, %bb3 ], [ 0, %start ]
    // CHECK-NEXT: store i128 %storemerge, ptr %_0, align 16
    // CHECK-NEXT: ret void
    match x {
        Some(x) => Some(x),
        None => None,
    }
}
```

r? `@scottmcm`
2025-03-16 05:17:07 +00:00
许杰友 Jieyou Xu (Joe)
c05a643ca8 Rollup merge of #138472 - KonaeAkira:master, r=Mark-Simulacrum
Add codegen test for #129795

Adds test for #129795.

Min LLVM version is 20 because the optimization only happens since LLVM 20.
2025-03-16 09:40:09 +08:00
bors
523c507d26 Auto merge of #138157 - scottmcm:inline-more-tiny-things, r=oli-obk
Allow more top-down inlining for single-BB callees

This means that things like `<usize as Step>::forward_unchecked` and `<f32 as PartialOrd>::le` will inline even if
we've already done a bunch of inlining to find the calls to them.

Fixes #138136

~~Draft as it's built atop #138135, which adds a mir-opt test that's a nice demonstration of this.  To see just this change, look at <48f63e3be5>~~ Rebased to be just the inlining change, as the other existing tests show it great.
2025-03-14 03:51:19 +00:00
KonaeAkira
ccde0a2203 Fix formatting (line too long) 2025-03-14 01:45:10 +01:00