Inspired by <https://github.com/rust-lang/rust/pull/138759#discussion_r2156375342> where I noticed that we were nearly at this point, plus the comments I was writing in 143410 that reminded me a type-dependent `true` is fine.
This PR splits the `OperandRef::builder` logic out to a separate type, with the updates needed to handle SIMD as well. In doing so, that makes the existing `Aggregate` path in `codegen_rvalue_operand` capable of handing SIMD values just fine.
As a result, we no longer need to do layout calculations for aggregate result types when running the analysis to determine which things can be SSA in codegen.
Another refactor pulled out from 138759
The previous implementation I'd written here based on `index_by_increasing_offset` is complicated to follow and difficult to extend to non-structs.
This changes the implementation, without actually changing any codegen (thus no test changes either), to be more like the existing `extract_field` (<2b0274c71d/compiler/rustc_codegen_ssa/src/mir/operand.rs (L345-L425)>) in that it allows setting a particular field directly.
Notably I've found this one much easier to get right, in particular because having the `OperandRef<Result<V, Scalar>>` gives a really useful thing to include in ICE messages if something did happen to go wrong.
The initial naming of "Abi" was an awful mistake, conveying wrong ideas
about how psABIs worked and even more about what the enum meant.
It was only meant to represent the way the value would be described to
a codegen backend as it was lowered to that intermediate representation.
It was never meant to mean anything about the actual psABI handling!
The conflation is because LLVM typically will associate a certain form
with a certain ABI, but even that does not hold when the special cases
that actually exist arise, plus the IR annotations that modify the ABI.
Reframe `rustc_abi::Abi` as the `BackendRepr` of the type, and rename
`BackendRepr::Aggregate` as `BackendRepr::Memory`. Unfortunately, due to
the persistent misunderstandings, this too is now incorrect:
- Scattered ABI-relevant code is entangled with BackendRepr
- We do not always pre-compute a correct BackendRepr that reflects how
we "actually" want this value to be handled, so we leave the backend
interface to also inject various special-cases here
- In some cases `BackendRepr::Memory` is a "real" aggregate, but in
others it is in fact using memory, and in some cases it is a scalar!
Our rustc-to-backend lowering code handles this sort of thing right now.
That will eventually be addressed by lifting duplicated lowering code
to either rustc_codegen_ssa or rustc_target as appropriate.
Supertraits of `BuilderMethods` are all called `XyzBuilderMethods`.
Supertraits of `CodegenMethods` are all called `XyzMethods`. This commit
changes the latter to `XyzCodegenMethods`, for consistency.
I added `PlaceValue` in 123775, but kept that one line-by-line simple because it touched so many places.
This goes through to add more helpers & docs, and change some `PlaceRef` to `PlaceValue` where the type didn't need to be included.
No behaviour changes.
Avoid `alloca`s in codegen for simple `mir::Aggregate` statements
The core idea here is to remove the abstraction penalty of simple newtypes in codegen.
Even something simple like constructing a
```rust
#[repr(transparent)] struct Foo(u32);
```
forces an `alloca` to be generated in nightly right now.
Certainly LLVM can optimize that away, but it would be nice if it didn't have to.
Quick example:
```rust
#[repr(transparent)]
pub struct Transparent32(u32);
#[no_mangle]
pub fn make_transparent(x: u32) -> Transparent32 {
let a = Transparent32(x);
a
}
```
on nightly we produce <https://rust.godbolt.org/z/zcvoM79ae>
```llvm
define noundef i32 `@make_transparent(i32` noundef %x) unnamed_addr #0 {
%a = alloca i32, align 4
store i32 %x, ptr %a, align 4
%0 = load i32, ptr %a, align 4, !noundef !3
ret i32 %0
}
```
but after this PR we produce
```llvm
define noundef i32 `@make_transparent(i32` noundef %x) unnamed_addr #0 {
start:
ret i32 %x
}
```
(even before the optimizer runs).