3349 Commits

Author SHA1 Message Date
bors
5d22242a3a Auto merge of #144062 - bjorn3:lto_refactors2, r=davidtwco
Various refactors to the LTO handling code (part 2)

Continuing from https://github.com/rust-lang/rust/pull/143388 this removes a bit of dead code and moves the LTO symbol export calculation from individual backends to cg_ssa.
2025-07-24 12:50:26 +00:00
usamoi
e31876c143 generate elf symbol version in raw-dylib 2025-07-24 19:04:00 +08:00
Camille GILLOT
0460c92d52 Remove useless lifetime parameter. 2025-07-23 23:54:37 +00:00
Camille GILLOT
9ff071219b Give an AllocId to ConstValue::Slice. 2025-07-23 23:54:37 +00:00
Scott McMurray
a93a9aa2d5 Don't emit two assumes in transmutes when one is a subset of the other
For example, transmuting between `bool` and `Ordering` doesn't need two `assume`s because one range is a superset of the other.

Multiple are still used for things like `char` <-> `NonZero<u32>`, which overlap but where neither fully contains the other.
2025-07-23 09:16:32 -07:00
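The subset reasoning in this commit message can be sketched roughly as follows. This is only an illustration under stated assumptions, not the rustc implementation: `ValidRange`, `is_subset`, and `assumes_needed` are hypothetical names, and the ranges are written by hand (`bool` as `0..=1`, `Ordering` as the wrapping byte range `0xFF..=1`, `char` as `0..=0x10FFFF`, `NonZero<u32>` as `1..=u32::MAX`).

```rust
/// Inclusive range of valid values that may wrap around (`start > end`).
/// Hypothetical stand-in for illustration, not rustc's own type.
#[derive(Clone, Copy, Debug)]
struct ValidRange {
    start: u64,
    end: u64,
}

/// Returns true if every value in `a` is also in `b`, for integers whose
/// bits are selected by `mask` (0xFF for 8-bit, 0xFFFF_FFFF for 32-bit).
fn is_subset(a: ValidRange, b: ValidRange, mask: u64) -> bool {
    // Rotate so that `b` starts at zero; `b` then covers `0..=len_b`.
    let len_b = b.end.wrapping_sub(b.start) & mask;
    if len_b == mask {
        return true; // `b` allows every value
    }
    let s = a.start.wrapping_sub(b.start) & mask;
    let e = a.end.wrapping_sub(b.start) & mask;
    // `a` must not wrap in the rotated space and must end inside `b`.
    s <= e && e <= len_b
}

/// One assume suffices when either range contains the other (assumed rule).
fn assumes_needed(a: ValidRange, b: ValidRange, mask: u64) -> usize {
    if is_subset(a, b, mask) || is_subset(b, a, mask) { 1 } else { 2 }
}
```

With these definitions, the `bool`/`Ordering` pair needs a single assume, while the `char`/`NonZero<u32>` pair still needs two because neither range contains the other.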
Scott McMurray
b7e025cfb6 Remove rvalue_creates_operand entirely
Split to a separate commit so it could be reverted later if necessary, should we get new `Rvalue`s where we can't handle it this way.
2025-07-23 08:40:27 -07:00
Scott McMurray
ea0c7788c0 re-enable direct bitcasts for Int/Float vector transmutes (but not ones involving pointers) 2025-07-23 08:32:46 -07:00
Scott McMurray
231dddde3e Let codegen_transmute_operand just handle everything
When combined with 143720, this means `rvalue_creates_operand` can just return `true` for *every* `Rvalue`.  (A future PR could consider removing it, though just letting it optimize out is fine for now.)

It's nicer anyway, IMHO, because it avoids needing the layout checks to be consistent in the two places, and thus is an overall reduction in code.  Plus it's a more helpful building block when used in other places this way.
2025-07-23 08:25:13 -07:00
Scott McMurray
6a5c7e0415 No longer need allocas for consuming Result<!, i32> and similar
In optimized builds GVN gets rid of these already, but in `opt-level=0` we actually make `alloca`s for this, which particularly impacts `?`-style things that use actually-only-one-variant types like this.
2025-07-23 00:09:36 -07:00
Ralf Jung
de1b999ff6 atomicrmw on pointers: move integer-pointer cast hacks into backend 2025-07-23 08:32:55 +02:00
Alisa Sireneva
ed11a39643 Don't special-case llvm.* as nounwind
Certain LLVM intrinsics, such as `llvm.wasm.throw`, can unwind. Marking
them as nounwind causes us to skip cleanup of locals and optimize out
`catch_unwind` under inlining or when `llvm.wasm.throw` is used directly
by user code.

The motivation for forcibly marking llvm.* as nounwind is no longer
present: most intrinsics are linked as `extern "C"` or other
non-unwinding ABIs, so we won't codegen `invoke` for them anyway.
2025-07-23 02:17:54 +03:00
Guillaume Gomez
a27f3e3fd1 Rename tests/codegen into tests/codegen-llvm 2025-07-22 14:28:48 +02:00
许杰友 Jieyou Xu (Joe)
5e3eb25125 Rollup merge of #142097 - ZuseZ4:offload-host1, r=oli-obk
gpu offload host code generation

r? ghost

This will generate most of the host side code to use llvm's offload feature.
The first PR will only handle automatic mem-transfers to and from the device.
So if a user calls a kernel, we will copy inputs back and forth, but we won't do the actual kernel launch.
Before merging, we will use LLVM's Info infrastructure to verify that the memcopies match what OpenMP offload generates in C++. `LIBOMPTARGET_INFO=-1 ./my_rust_binary` should print that a memcpy to and later from the device is happening.

A follow-up PR will generate the actual device-side kernel which will then do computations on the GPU.
A third PR will implement manual host2device and device2host functionality, but the goal is to minimize cases where a user has to overwrite our default handling due to performance issues.

I'm trying to get a full MVP out first, so this just recognizes GPU functions based on magic names. The final frontend will obviously move this over to use proper macros, like I'm already doing it for the autodiff work.
This work will also be compatible with std::autodiff, so one can differentiate GPU kernels.

Tracking:
- https://github.com/rust-lang/rust/issues/131513
2025-07-22 00:54:24 +08:00
bjorn3
dadc4cae50 Remove each_linked_rlib_for_lto from CodegenContext 2025-07-21 07:58:44 +00:00
bjorn3
1a6f941d2b Move exported_symbols_for_lto out of CodegenContext 2025-07-21 07:58:44 +00:00
bjorn3
6e757354ad Merge exported_symbols computation into exported_symbols_for_lto
And move exported_symbols_for_lto call from backends to cg_ssa.
2025-07-21 07:58:44 +00:00
bjorn3
1c8dc6f440 Move LTO symbol export calculation from backends to cg_ssa 2025-07-21 07:58:44 +00:00
bjorn3
2ad7930b40 Remove worker id
It isn't used anywhere. Also inline free_worker into the only call site.
2025-07-21 07:58:44 +00:00
bjorn3
112799e637 Merge modules and cached_modules for fat LTO
The modules vec can already contain serialized modules and there is no
need to distinguish between cached and non-cached cgus at LTO time.
2025-07-21 07:58:44 +00:00
Scott McMurray
41ce1ed252 Ban projecting into SIMD types [MCP838] 2025-07-20 10:22:09 -07:00
Guillaume Gomez
1e6ef245cd Rollup merge of #144143 - Gelbpunkt:target-features-crt-static, r=RalfJung
Fix `-Ctarget-feature`s getting ignored after `crt-static`

The current behaviour introduced by commit a50a3b8e31 would discard any target features specified after `crt-static` (the only member of `RUSTC_SPECIFIC_FEATURES`). This is because it returned instead of continuing processing the next feature.

I wasn't entirely sure what the regression test should look like, but this one should do. If anyone has some suggestions, I'm happy to learn, it's my first test :)

I've confirmed that the test fails without the fix on `powerpc64le-unknown-linux-musl` and `x86_64-unknown-linux-gnu`.

cc `@RalfJung`
2025-07-20 15:34:07 +02:00
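The return-versus-continue bug described in this merge can be illustrated with a small sketch. It assumes a comma-separated feature string and uses hypothetical function names; only `RUSTC_SPECIFIC_FEATURES` and `crt-static` come from the message above, and the real code in rustc_codegen_ssa is structured differently.

```rust
/// Features handled by rustc itself rather than passed to the backend.
const RUSTC_SPECIFIC_FEATURES: &[&str] = &["crt-static"];

/// Strips the leading `+` or `-` from a feature such as `+avx2`.
fn feature_name(feature: &str) -> &str {
    feature.trim_start_matches(|c| c == '+' || c == '-')
}

/// Buggy shape: returning on the first rustc-specific feature silently
/// drops every feature listed after it.
fn backend_features_buggy(flag: &str) -> Vec<String> {
    let mut out = Vec::new();
    for feature in flag.split(',') {
        if RUSTC_SPECIFIC_FEATURES.contains(&feature_name(feature)) {
            return out;
        }
        out.push(feature.to_string());
    }
    out
}

/// Fixed shape: skip the rustc-specific feature and keep processing.
fn backend_features_fixed(flag: &str) -> Vec<String> {
    let mut out = Vec::new();
    for feature in flag.split(',') {
        if RUSTC_SPECIFIC_FEATURES.contains(&feature_name(feature)) {
            continue;
        }
        out.push(feature.to_string());
    }
    out
}
```

For an input such as `+sse2,+crt-static,+avx2`, the buggy variant keeps only `+sse2`, while the fixed variant keeps both `+sse2` and `+avx2`.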
Scott McMurray
0586c63e07 Allow Rvalue::Repeat to return true in rvalue_creates_operand too
The conversation in 143502 made me realize how easy this is to handle, since the only possibility is ZSTs -- everything else ends up with the destination being `LocalKind::Memory` and thus doesn't call `codegen_rvalue_operand` at all.

This gets us perilously close to a world where `rvalue_creates_operand` only ever returns true.  I'll try out such a world next :)
2025-07-19 20:50:02 -07:00
bors
83825dd277 Auto merge of #143784 - scottmcm:enums-again-new-ex2, r=dianqk
Simplify discriminant codegen for niche-encoded variants which don't wrap across an integer boundary

Inspired by rust-lang/rust#139729, this attempts to be a much-simpler and more-localized change while still making a difference.  (Specifically, this does not try to solve the problem with select-sinking, leaving that to be fixed by https://github.com/llvm/llvm-project/issues/134024 -- once it gets released -- instead of in rustc's codegen.)

What this *does* improve is checking for the variant in a 3+ variant enum when that variant is the type providing the niche.  Something like `if let Foo::WithBool(_) = ...` previously compiled to `ugt(add(x, -2), 2)`, which is non-trivial to think about because it's depending on the unsigned wrapping to shift the 0/1 up above 2.  With this PR it compiles to just `ult(x, 2)`, which is probably what you'd have written yourself if you were doing it by hand to look for "is this byte a bool?".

That's done by leaving most of the codegen alone, but adding a couple new special cases to the `is_niche` check.  The default looks at the relative discriminant, but in the common cases where there's no wraparound involved, we can just check the original value, rather than the offsetted one.

The first commit just adds some tests, so the best way to see the effect of this change is to look at the second commit and how it updates the test expectations.
2025-07-19 08:03:40 +00:00
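The two comparisons quoted in this message can be checked against each other with a short sketch. It assumes a one-byte tag for an enum shaped like `enum Foo { WithBool(bool), A, B, C }`, where the `bool` payload uses tags 0 and 1 and the three dataless variants use the niche values 2, 3, and 4. The function names are hypothetical, and the layout is an assumption made for illustration.

```rust
/// Old check: `ugt(add(x, -2), 2)`, relying on unsigned wraparound to push
/// the tags 0 and 1 above the niche variants.
fn is_with_bool_old(tag: u8) -> bool {
    tag.wrapping_sub(2) > 2
}

/// New check: `ult(x, 2)`, asking directly whether the byte is a `bool`.
fn is_with_bool_new(tag: u8) -> bool {
    tag < 2
}

/// The two checks agree on every tag that is valid under the assumed layout.
fn checks_agree_on_valid_tags() -> bool {
    (0u8..=4).all(|tag| is_with_bool_old(tag) == is_with_bool_new(tag))
}
```

The checks only need to agree on valid tags. For a byte such as 5, which the assumed layout never produces, they give different answers, and that is acceptable because such a value cannot occur.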
Manuel Drehwald
634016478e add -Zoffload=Enable flag behind -Zunstable-options, to enable gpu (host) code generation 2025-07-18 16:24:00 -07:00
Jens Reidel
664d742933 rustc_codegen_ssa: Don't skip target-features after crt-static
The current behaviour introduced by commit
a50a3b8e31 would discard any
target features specified after crt-static (the only member of
RUSTC_SPECIFIC_FEATURES). This is because it returned instead of
continuing processing the next flag.

Signed-off-by: Jens Reidel <adrian@travitia.xyz>
2025-07-18 18:59:13 +02:00
Matthias Krüger
79c8f90460 Rollup merge of #143846 - usamoi:gc, r=bjorn3
pass --gc-sections if -Zexport-executable-symbols is enabled and improve tests

Exported symbols are added as GC roots in linking, so `--gc-sections` won't hurt `-Zexport-executable-symbols`.

Fixes the run-make test to work on Linux. Enable the ui test on more targets.

cc rust-lang/rust#84161
2025-07-18 04:27:52 +02:00
Matthias Krüger
accf61dd42 Rollup merge of #143293 - folkertdev:naked-function-kcfi, r=compiler-errors
fix `-Zsanitizer=kcfi` on `#[naked]` functions

fixes https://github.com/rust-lang/rust/issues/143266

With `-Zsanitizer=kcfi`, indirect calls happen via a generated intermediate shim that forwards the call. The generated shim preserves the attributes of the original, including `#[unsafe(naked)]`. The shim is not a naked function though, and violates its invariants (like having a body that consists of a single `naked_asm!` call).

My fix here is to match on the `InstanceKind`, and only use `codegen_naked_asm` when the instance is not a `ReifyShim`. That does beg the question whether there are other `InstanceKind`s that could come up. As far as I can tell the answer is no: calling via `dyn` seems to work fine, and `#[track_caller]` is disallowed in combination with `#[naked]`.

r? codegen
`@rustbot` label +A-naked
cc `@maurer` `@rcvalle`
2025-07-18 04:27:51 +02:00
usamoi
5bb6b9db30 remove no_gc_sections 2025-07-17 14:54:52 +08:00
León Orell Valerian Liehr
be5f8f299d Rollup merge of #143388 - bjorn3:lto_refactors, r=compiler-errors
Various refactors to the LTO handling code

In particular reducing the sharing of code paths between fat and thin-LTO and making the fat LTO implementation more self-contained. This also moves some autodiff handling out of cg_ssa into cg_llvm given that Enzyme only works with LLVM anyway and an implementation for another backend may do things entirely differently. This will also make it a bit easier to split LTO handling out of the coordinator thread main loop into a separate loop, which should reduce the complexity of the coordinator thread.
2025-07-17 03:58:28 +02:00
Folkert de Vries
9c8ab89187 use codegen_instance_attrs where an instance is (easily) available 2025-07-16 23:24:32 +02:00
Folkert de Vries
ec0ff720d1 add codegen_instance_attrs query
and use it for naked functions
2025-07-16 21:38:58 +02:00
Folkert de Vries
f100767dce fix -Zsanitizer=kcfi on #[naked] functions
And more broadly only codegen `InstanceKind::Item` using the naked
function codegen code. Other instance kinds should follow the normal
path.
2025-07-16 21:38:48 +02:00
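The dispatch rule in this commit message can be sketched as a small match. The enum and function below are simplified, hypothetical stand-ins (the real `InstanceKind` in rustc has more variants and carries data), so this shows only the shape of the decision, not the actual code.

```rust
/// Simplified stand-in for rustc's `InstanceKind` (assumption: three cases
/// are enough to show the idea).
#[derive(Debug, PartialEq)]
enum InstanceKind {
    Item,
    ReifyShim,
    Other,
}

/// Which codegen path an instance takes in this sketch.
#[derive(Debug, PartialEq)]
enum CodegenPath {
    NakedAsm,
    Normal,
}

/// Only a plain item marked `#[naked]` goes through the naked-asm path;
/// shims that inherit the attribute follow the normal path.
fn choose_path(kind: &InstanceKind, has_naked_attr: bool) -> CodegenPath {
    match (kind, has_naked_attr) {
        (InstanceKind::Item, true) => CodegenPath::NakedAsm,
        _ => CodegenPath::Normal,
    }
}
```

Under this rule, a `ReifyShim` generated for a naked function is compiled like any other function, so it no longer has to satisfy the single-`naked_asm!` body invariant.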
Samuel Tardieu
b564ecf04b Rollup merge of #143920 - oli-obk:cg-llvm-safety, r=jieyouxu
Make more of codegen_llvm safe

Best reviewed commit-by-commit.
2025-07-16 17:06:40 +02:00
Scott McMurray
4fa23d96bc Improve comments inside codegen_get_discr 2025-07-15 22:30:46 -07:00
Oli Scherer
7f95f04267 Eliminate all direct uses of LLVMMDStringInContext2 2025-07-14 08:27:08 +00:00
Anne Stijns
75561c446a Port #[link_ordinal] to the new attribute parsing infrastructure. 2025-07-13 11:51:01 +02:00
usamoi
f58accb8f3 pass --gc-sections if -Zexport-executable-symbols is enabled and improve tests 2025-07-13 16:27:47 +08:00
Scott McMurray
d5bcfb334b Simplify codegen for niche-encoded variant tests 2025-07-12 04:53:24 -07:00
bors
915e535244 Auto merge of #143810 - matthiaskrgr:rollup-iw7a23z, r=matthiaskrgr
Rollup of 9 pull requests

Successful merges:

 - rust-lang/rust#143403 (Port several trait/coherence-related attributes the new attribute system)
 - rust-lang/rust#143633 (fix: correct assertion to check for 'noinline' attribute presence before removal)
 - rust-lang/rust#143647 (Clarify and expand documentation for std::sys_common dependency structure)
 - rust-lang/rust#143716 (compiler: doc/comment some codegen-for-functions interfaces)
 - rust-lang/rust#143747 (Add target maintainer information for aarch64-unknown-linux-musl)
 - rust-lang/rust#143759 (Fix typos in function names in the `target_feature` test)
 - rust-lang/rust#143767 (Bump `src/tools/x` to Edition 2024 and some cleanups)
 - rust-lang/rust#143769 (Remove support for SwitchInt edge effects in backward dataflow)
 - rust-lang/rust#143770 (build-helper: clippy fixes)

r? `@ghost`
`@rustbot` modify labels: rollup
2025-07-12 10:46:43 +00:00
bors
2f9c9cede6 Auto merge of #143766 - matthiaskrgr:rollup-0x7t69s, r=matthiaskrgr
Rollup of 8 pull requests

Successful merges:

 - rust-lang/rust#142391 (rust: library: Add `setsid` method to `CommandExt` trait)
 - rust-lang/rust#143302 (`tests/ui`: A New Order [27/N])
 - rust-lang/rust#143303 (`tests/ui`: A New Order [28/28] FINAL PART)
 - rust-lang/rust#143568 (std: sys: net: uefi: tcp4: Add timeout support)
 - rust-lang/rust#143611 (Mention more APIs in `ParseIntError` docs)
 - rust-lang/rust#143661 (chore: Improve how the other suggestions message gets rendered)
 - rust-lang/rust#143708 (fix: Include frontmatter in -Zunpretty output )
 - rust-lang/rust#143718 (Make UB transmutes really UB in LLVM)

r? `@ghost`
`@rustbot` modify labels: rollup

try-job: i686-gnu-nopt-1
try-job: test-various
2025-07-12 07:44:04 +00:00
Matthias Krüger
b18064f2f2 Rollup merge of #143716 - workingjubilee:document-some-codegen-backend-stuff, r=bjorn3,fee1-dead
compiler: doc/comment some codegen-for-functions interfaces

An out-of-date comment gets updated and some underdocumented functions get documented.
2025-07-11 19:45:24 +02:00
Jubilee Young
39f7707fea compiler: comment on some call-related codegen fn in cg_ssa
Partially documents the situation due to LLVM CFI.
2025-07-11 01:08:21 -07:00
Matthias Krüger
e43481e362 Rollup merge of #143718 - scottmcm:ub-transmute-is-ub, r=WaffleLapkin
Make UB transmutes really UB in LLVM

Ralf suggested in <https://github.com/rust-lang/rust/pull/143410#discussion_r2184928123> that UB transmutes shouldn't be trapping, which happened for the one path *that* PR was changing, but there's another path as well, so *this* PR changes that other path to match.

r? codegen
2025-07-11 07:35:22 +02:00
bors
855e0fe46e Auto merge of #142911 - mejrs:unsized, r=compiler-errors
Remove support for dynamic allocas

Followup to rust-lang/rust#141811
2025-07-11 05:27:32 +00:00
Matthias Krüger
b11e9e31dd Rollup merge of #143446 - usamoi:export-executable-symbols, r=bjorn3,oli-obk
use `--dynamic-list` for exporting executable symbols

closes rust-lang/rust#101610
cc rust-lang/rust#84161

https://sourceware.org/binutils/docs-2.39/ld/VERSION.html:

> --dynamic-list=dynamic-list-file
> Specify the name of a dynamic list file to the linker. This is typically used when creating shared libraries to specify a list of global symbols whose references shouldn’t be bound to the definition within the shared library, or creating dynamically linked executables to specify a list of symbols which should be added to the symbol table in the executable. This option is only meaningful on ELF platforms which support shared libraries.

`ld.lld --help`:

>   --dynamic-list=<file>: Similar to --export-dynamic-symbol-list. When creating a shared object, this additionally implies -Bsymbolic but does not set DF_SYMBOLIC

>  --export-dynamic-symbol-list=file: Read a list of dynamic symbol patterns. Apply --export-dynamic-symbol on each pattern

>  --export-dynamic-symbol=glob: (executable) Put matched symbols in the dynamic symbol table. (shared object) References to matched non-local STV_DEFAULT symbols shouldn't be bound to definitions within the shared object. Does not imply -Bsymbolic.

>  --export-dynamic: Put symbols in the dynamic symbol table

Use `--dynamic-list` because it's older than `--export-dynamic-symbol-list` (binutils 2.35)

try-job: dist-i586-gnu-i586-i686-musl
2025-07-10 20:28:46 +02:00
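For readers unfamiliar with the file that `--dynamic-list` consumes, here is a minimal sketch that renders one. It assumes the GNU ld dynamic-list syntax (a brace-enclosed list of symbol names, each terminated by a semicolon), and `render_dynamic_list` is a hypothetical helper, not the function rustc uses.

```rust
/// Renders a linker dynamic list such as:
///
/// {
///     foo;
///     bar;
/// };
fn render_dynamic_list(symbols: &[&str]) -> String {
    let mut out = String::from("{\n");
    for symbol in symbols {
        // One exported symbol per line, terminated by a semicolon.
        out.push_str("    ");
        out.push_str(symbol);
        out.push_str(";\n");
    }
    out.push_str("};\n");
    out
}
```

The resulting text would be written to a file and passed to the linker as `--dynamic-list=<file>`, so that only the listed symbols are added to the executable's dynamic symbol table.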
Scott McMurray
f5fc8727db Add BuilderMethods::unreachable_nonterminator
So places that need `unreachable` but in the middle of a basic block can call that instead of figuring out the best way to do it.
2025-07-10 09:17:28 -07:00
bors
cf3fb768db Auto merge of #143696 - oli-obk:constable-type-id2, r=RalfJung
Add opaque TypeId handles for CTFE

Reopen of https://github.com/rust-lang/rust/pull/142789#issuecomment-3053155043 after some bors insta-merge chaos

r? `@RalfJung`
2025-07-10 07:04:03 +00:00
Scott McMurray
58d7c2d5a7 Make UB transmutes really UB in LLVM
Ralf suggested in <https://github.com/rust-lang/rust/pull/143410#discussion_r2184928123> that UB transmutes shouldn't be trapping, which happened for the one path that PR was changing, but there's another path as well, so this PR changes that other path to match.
2025-07-09 22:30:15 -07:00
bors
e3fccdd4a1 Auto merge of #143502 - scottmcm:aggregate-simd, r=oli-obk
Let `rvalue_creates_operand` return true for *all* `Rvalue::Aggregate`s

~~Draft for now because it's built on Ralf's rust-lang/rust#143291~~

Inspired by https://github.com/rust-lang/rust/pull/138759#discussion_r2156375342 where I noticed that we were nearly at this point, plus the comments I was writing in rust-lang/rust#143410 that reminded me a type-dependent `true` is fine.

This PR splits the `OperandRef::builder` logic out to a separate type, with the updates needed to handle SIMD as well.  In doing so, that makes the existing `Aggregate` path in `codegen_rvalue_operand` capable of handling SIMD values just fine.

As a result, we no longer need to do layout calculations for aggregate result types when running the analysis to determine which things can be SSA in codegen.
2025-07-09 16:37:20 +00:00
Oli Scherer
486ffda9dc Add opaque TypeId handles for CTFE 2025-07-09 16:37:11 +00:00