gvn: Invalid dereferences for all non-local mutations
Fixes#132353.
This PR removes the computation value by traversing SSA locals through `for_each_assignment_mut`.
Because the `for_each_assignment_mut` traversal skips statements which have side effects, such as dereference assignments, the computation may be unsound. Instead of `for_each_assignment_mut`, we compute values by traversing in reverse postorder.
Because we compute and use the symbolic representation of values on the fly, I invalidate all old values when encountering a dereference assignment. The current approach does not prevent the optimization of a clone to a copy.
In the future, we may add an alias model, or dominance information for dereference assignments, or SSA form to help GVN.
r? cjgillot
cc `@jieyouxu` #132356
cc `@RalfJung` #133474
`for_each_assignment_mut` can skip assignment statements with side effects,
which can result in some assignment statements retrieving outdated value.
For example, it may skip a dereference assignment statement.
Tell rustfmt to use the 2024 edition in ./x.py fmt
Most crates in this repo have been moved to the 2024 edition already. This also allows removing a rustfmt exclusion for a cg_clif test.
This is a way to shrink call spans that doesn't involve mixing different spans,
and avoids overlap with argument spans.
This patch also removes some low-value comments that were causing rustfmt to
ignore the match arms.
Simplify `PartialOrd` on tuples containing primitives
We noticed in https://github.com/rust-lang/rust/pull/133984#issuecomment-2704011800 that currently the tuple comparison code, while it [does optimize down](https://github.com/rust-lang/rust/blob/master/tests/codegen/comparison-operators-2-tuple.rs) today, is kinda huge: <https://rust.godbolt.org/z/xqMoeYbhE>
This PR changes the tuple code to go through an overridable "chaining" version of the comparison functions, so that for simple things like `(i16, u16)` and `(f32, f32)` (as seen in the new MIR pre-codegen test) we just directly get the
```rust
if lhs.0 == rhs.0 { lhs.0 OP rhs.0 }
else { lhs.1 OP rhs.1 }
```
version in MIR, rather than emitting a mess for LLVM to have to clean up.
Test added in the first commit, so you can see the MIR diff in the second one.
Add MIR pre-codegen tests to track #138544
I don't know how best to fix the problem yet, but wanted to check in some tests to demonstrate it and make sure that they get updated to keep it fixed if anyone does fix it 🙂
No code changes; just the tests for #138544.
Reduce FormattingOptions to 64 bits
This is part of https://github.com/rust-lang/rust/issues/99012
This reduces FormattingOptions from 6-7 machine words (384 bits on 64-bit platforms, 224 bits on 32-bit platforms) to just 64 bits (a single register on 64-bit platforms).
Before:
```rust
pub struct FormattingOptions {
flags: u32, // only 6 bits used
fill: char,
align: Option<Alignment>,
width: Option<usize>,
precision: Option<usize>,
}
```
After:
```rust
pub struct FormattingOptions {
/// Bits:
/// - 0-20: fill character (21 bits, a full `char`)
/// - 21: `+` flag
/// - 22: `-` flag
/// - 23: `#` flag
/// - 24: `0` flag
/// - 25: `x?` flag
/// - 26: `X?` flag
/// - 27: Width flag (if set, the width field below is used)
/// - 28: Precision flag (if set, the precision field below is used)
/// - 29-30: Alignment (0: Left, 1: Right, 2: Center, 3: Unknown)
/// - 31: Always set to 1
flags: u32,
/// Width if width flag above is set. Otherwise, always 0.
width: u16,
/// Precision if precision flag above is set. Otherwise, always 0.
precision: u16,
}
```
mir_build: Avoid some useless work when visiting "primary" bindings
While looking over `visit_primary_bindings`, I noticed that it does a bunch of extra work to build up a collection of “user-type projections”, even though 2/3 of its call sites don't even use them. Those callers can get the same result via `thir::Pat::walk_always`.
(And it turns out that doing so also avoids creating some redundant user-type entries in MIR for some binding constructs.)
I also noticed that even when the user-type projections *are* used, the process of building them ends up eagerly cloning some nested vectors at every recursion step, even in cases where they won't be used because the current subpattern has no bindings. To avoid this, the visit method now assembles a linked list on the stack containing the information that *would* be needed to create projections, and only creates the concrete projections as needed when a primary binding is encountered.
Some relevant prior PRs:
- #55274
- 0bfe184b1a in #55937
---
There should be no user-visible change in compiler output.
The existing method does some non-obvious extra work to collect user types and
build user-type projections, which is specifically needed by `declare_bindings`
and not by the other two callers.
Remove fake borrows of refs that are converted into non-refs in `MakeByMoveBody`
Remove fake borrows of closure captures if that capture has been replaced with a by-move version of that capture.
For example, given an async closure that looks like:
```
let f: Foo;
let c = async move || {
match f { ... }
};
```
... in this pair of coroutine-closure + coroutine, we capture `Foo` in the parent and `&Foo` in the child. We will emit two fake borrows like:
```
_2 = &fake shallow (*(_1.0: &Foo));
_3 = &fake shallow (_1.0: &Foo);
```
However, since the by-move-body transform is responsible for replacing `_1.0: &Foo` with `_1.0: Foo` (since the `AsyncFnOnce` coroutine will own `Foo` by value), that makes the second fake borrow obsolete since we never have an upvar of type `&Foo`, and we should replace it with a `nop`.
As a side-note, we don't actually even care about fake borrows here at all since they're fully a MIR borrowck artifact, and we don't need to borrowck by-move MIR bodies. But it's best to preserve as much as we can between these two bodies :)
Fixes#138501
r? oli-obk
Allow more top-down inlining for single-BB callees
This means that things like `<usize as Step>::forward_unchecked` and `<PartialOrd for f32>::le` will inline even if
we've already done a bunch of inlining to find the calls to them.
Fixes#138136
~~Draft as it's built atop #138135, which adds a mir-opt test that's a nice demonstration of this. To see just this change, look at <48f63e3be5>~~ Rebased to be just the inlining change, as the other existing tests show it great.
This means that things like `<usize as Step>::forward_unchecked` and `<PartialOrd for f32>::le` will inline even if we've already done a bunch of inlining to find the calls to them.
Improve the generic MIR in the default `PartialOrd::le` and friends
It looks like I regressed this accidentally in #137197 due to #137901
So this PR does two things:
1. Tweaks the way we're calling `is_some_and` so that it optimizes in the generic MIR (rather than needing to optimize it in every monomorphization) -- the first commit adds a MIR test, so you can see the difference in the second commit.
2. Updates the implementations of `is_le` and friends to be slightly simpler, and parallel how clang does them.
Add FileCheck annotations to mir-opt/issues
This resolves a part of #116971 .
The directory `tests/mir-opt/issues` has only one test issue_75439.rs which should add FileCheck annotations.
Originally it was introduced in #75580 to confirm that there were duplicated basic blocks against or-patterns, but in #123067 the duplication was resolved. So FileCheck should ensure that there is no such duplication any more.
Instead of `expand_statements`. This makes the code shorter and
consistent with other MIR transform passes.
The tests require updating because there is a slight change in
MIR output:
- the old code replaced the original statement with twelve new
statements.
- the new code inserts converts the original statement to a `nop` and
then insert twelve new statements in front of it.
I.e. we now end up with an extra `nop`, which doesn't matter at all.
Emit MIR for each bit with on `dont_reset_cast_kind_without_updating_operand`
PR #136450 introduced a diff that includes a pointer-sized alloc. This doesn't cause any problems on the compiler test suite but it affects the test suite that ferrocene has for `aarch64-unknown-none` as the snapshot of the diff only includes a 32-bit alloc even though this should be a 64-bit alloc on `aarch64-unknown-none`.
r? ``@compiler-errors``