Commit Graph

275 Commits

Author SHA1 Message Date
Zalathar
ffeed2b94e Extract helper method module_add_named_metadata_node 2025-10-02 18:04:24 +10:00
Zalathar
ecb831dcf4 Extract helper method set_metadata_node 2025-10-02 18:04:23 +10:00
Zalathar
cc6329a9bc Replace MetadataType with the MetadataKindId constants 2025-09-30 20:10:30 +10:00
Matthias Krüger
35f443d318 Rollup merge of #146778 - nikic:allocator-shim-attributes, r=jackh726
Use standard attribute logic for allocator shim

Use llfn_attrs_from_instance() to generate the attributes for the allocator shim. This ensures that we generate all the usual attributes (and don't get to find out one-by-one that a certain attribute is important for a certain target). Additionally this will enable emitting the allocator-specific attributes (not included here).

This change is quite awkward because the allocator shim uses SimpleCx, while llfn_attrs_from_instance uses CodegenCx. I've switched it to use SimpleCx plus tcx/sess arguments where necessary. If there's a simpler way to do this, I'd love to know about it...
2025-09-26 18:11:11 +02:00
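For context on the commit above: the allocator shim is the small piece of code rustc generates to route `__rust_alloc` and friends to whichever global allocator the crate selects. A minimal program that exercises it (a sketch for context only, not tied to this change):

```rust
use std::alloc::System;

// Selecting a global allocator is what causes rustc to emit the allocator
// shim whose attributes this commit now generates via the shared helper.
#[global_allocator]
static GLOBAL: System = System;

fn main() {
    let v = vec![1, 2, 3]; // allocation goes through the shim
    println!("{}", v.iter().sum::<i32>());
}
```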
Stuart Cook
fab06469ee Rollup merge of #146667 - calebzulawski:simd-mono-lane-limit, r=lcnr,RalfJung
Add an attribute to check the number of lanes in a SIMD vector after monomorphization

Allows std::simd to drop the `LaneCount<N>: SupportedLaneCount` trait and maintain good error messages.

Also, extends rust-lang/rust#145967 by including spans in layout errors for all ADTs.

r? ``@RalfJung``

cc ``@workingjubilee`` ``@programmerjake``
2025-09-25 20:31:53 +10:00
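For context, this is the bound that std::simd currently has to carry on every generic SIMD function and that a post-monomorphization lane-count check would let it drop; a minimal nightly sketch using `portable_simd`:

```rust
#![feature(portable_simd)]
use std::simd::{LaneCount, Simd, SupportedLaneCount};

// Today every generic SIMD helper must repeat this bound so that invalid
// lane counts are rejected before monomorphization.
fn double<const N: usize>(v: Simd<f32, N>) -> Simd<f32, N>
where
    LaneCount<N>: SupportedLaneCount,
{
    v + v
}

fn main() {
    println!("{:?}", double(Simd::<f32, 4>::splat(1.5)));
}
```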
Nikita Popov
d226e7aa93 Use standard attribute logic for allocator shim
Use llfn_attrs_from_instance() to generate the attributes for the
allocator shim. This ensures that we generate all the usual
attributes (and don't get to find out one-by-one that a certain
attribute is important for a certain target). Additionally this
will enable emitting the allocator-specific attributes (not
included here).

This change is quite awkward because the allocator shim uses
SimpleCx, while llfn_attrs_from_instance uses CodegenCx. I've
switched it to use SimpleCx plus tcx/sess arguments where necessary.
If there's a simpler way to do this, I'd love to know about it...
2025-09-25 10:04:40 +02:00
Caleb Zulawski
f5c6c9542e Add an attribute to check the number of lanes in a SIMD vector after monomorphization
Unify zero-length and oversized SIMD errors
2025-09-23 20:47:34 -04:00
Reuben Cruise
6f813e887a Adds AArch64 GCS support
- Adds option to rustc config to enable GCS
- Passes `guarded-control-stack` flag to llvm if enabled
2025-09-17 14:16:31 +01:00
Stuart Cook
6ad98750e0 Rollup merge of #145660 - jbatez:darwin_objc, r=jdonszelmann,madsmtm,tmandry
initial implementation of the darwin_objc unstable feature

Tracking issue: https://github.com/rust-lang/rust/issues/145496

This feature makes it possible to reference Objective-C classes and selectors using the same ABI used by native Objective-C on Apple/Darwin platforms. Without it, Rust code interacting with Objective-C must resort to loading classes and selectors using costly string-based lookups at runtime. With it, these references can be loaded efficiently at dynamic load time.

r? ```@tmandry```

try-job: `*apple*`
try-job: `x86_64-gnu-nopt`
2025-09-17 14:56:44 +10:00
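For context on the commit above, this is the kind of string-based runtime lookup the feature is meant to replace on Apple platforms (a sketch using the plain Objective-C runtime API; Apple targets only, links against libobjc):

```rust
use std::ffi::{c_char, c_void};

// objc_getClass / sel_registerName resolve classes and selectors by name at
// runtime; darwin_objc lets these references be bound at dynamic load time.
#[link(name = "objc")]
unsafe extern "C" {
    fn objc_getClass(name: *const c_char) -> *mut c_void;
    fn sel_registerName(name: *const c_char) -> *mut c_void;
}

fn main() {
    unsafe {
        let class = objc_getClass(c"NSObject".as_ptr());
        let sel = sel_registerName(c"description".as_ptr());
        println!("class: {class:p}, selector: {sel:p}");
    }
}
```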
Josh Stone
580b4891aa Update the minimum external LLVM to 20 2025-09-16 11:49:20 -07:00
Jo Bates
1ebf69d1b1 initial implementation of the darwin_objc unstable feature 2025-09-13 16:06:22 -07:00
WANG Rui
923b892b67 compiler: Apply target features to the entry function 2025-09-04 21:40:25 +08:00
Matthew Maurer
5d9f8fcd3e llvm: nvptx: Layout update to match LLVM
LLVM upstream switched layouts to support 256-bit vector load/store.
2025-09-02 18:47:59 +00:00
Zalathar
b4e97e5d86 Rename llvm::Bool aliases to standard const case
This avoids the need for `#![allow(non_upper_case_globals)]`.
2025-08-24 23:09:54 +10:00
许杰友 Jieyou Xu (Joe)
df01a87de2 Rollup merge of #140740 - ojeda:indirect-branch-cs-prefix, r=davidtwco
Add `-Zindirect-branch-cs-prefix`

Cc: ``@azhogin`` ``@Darksonn``

This goes on top of https://github.com/rust-lang/rust/pull/135927, i.e. please skip the first commit here. Please feel free to inherit it there.

In fact, I am not sure if there is any use case for the flag without `-Zretpoline*`. GCC and Clang allow it, though.

There is a `FIXME` for two `ignore`s in the test, which I took from another test I wrote in the past -- they may or may not be needed here since I didn't run the full CI. Either way, it is not critical.

Tracking issue: https://github.com/rust-lang/rust/issues/116852.
MCP: https://github.com/rust-lang/compiler-team/issues/868.
2025-08-19 19:42:01 +08:00
Stuart Cook
1e454c64b2 Rollup merge of #145309 - winstonallo:issue-145271-fix, r=tgross35
Fix `-Zregparm` for LLVM builtins

This fixes the issue where `-Zregparm=N` was not working correctly when calling LLVM intrinsics

By default on `x86-32`, arguments are passed on the stack. The `-Zregparm=N` flag allows the first `N` arguments to be passed in registers instead.

When calling intrinsics like `memset`, LLVM still passes parameters on the stack, which prevents optimizations like tail calls.

As proposed by ``@tgross35``, I fixed this by setting the `NumRegisterParameters` LLVM module flag to `N` when `-Zregparm=N` is set.

```rust
// compiler/rustc_codegen_llvm/src/context.rs#375-382
if let Some(regparm_count) = sess.opts.unstable_opts.regparm {
    llvm::add_module_flag_u32(
        llmod,
        llvm::ModuleFlagMergeBehavior::Error,
        "NumRegisterParameters",
        regparm_count,
    );
}
```
[Here](https://rust.godbolt.org/z/YMezreo48) is a before/after compiler explorer.

Here is the final result for the code snippet in the original issue:
```asm
entrypoint:
        push    esi
        mov     esi, eax
        mov     eax, ecx
        mov     ecx, esi
        pop     esi
        jmp     memset   ; Tail call parameters in registers
```

Fixes: https://github.com/rust-lang/rust/issues/145271
2025-08-18 15:31:11 +10:00
Alice Ryhl
1cd7080c3a Add -Zindirect-branch-cs-prefix (from draft PR) 2025-08-17 16:50:23 +02:00
Marcelo Domínguez
e1d79b9aad Remove lto inline logic 2025-08-14 16:30:16 +00:00
Marcelo Domínguez
250d77e5d7 Complete functionality and general cleanup 2025-08-14 16:30:15 +00:00
Marcelo Domínguez
5c631041aa Basic implementation of autodiff intrinsic 2025-08-14 16:29:58 +00:00
winstonallo
04ff1444bb Set NumRegisterParameters LLVM module flag to N when -Zregparm=N is set

* Enforce the `-Zregparm=N` flag by setting the NumRegisterParameters LLVM module flag
* Add assembly tests verifying that the parameters are passed in registers for regparm values 1, 2, and 3, for both LLVM intrinsics and non-builtin functions
* Add c_void type to minicore
2025-08-13 17:37:30 +02:00
Tom Vijlbrief
2563e4a7ff [AVR] Changed data_layout 2025-08-12 08:33:27 +02:00
许杰友 Jieyou Xu (Joe)
5e3eb25125 Rollup merge of #142097 - ZuseZ4:offload-host1, r=oli-obk
gpu offload host code generation

r? ghost

This will generate most of the host-side code needed to use LLVM's offload feature.
The first PR will only handle automatic mem-transfers to and from the device.
So if a user calls a kernel, we will copy inputs back and forth, but we won't do the actual kernel launch.
Before merging, we will use LLVM's Info infrastructure to verify that the memcopies match what OpenMP offload generates in C++. `LIBOMPTARGET_INFO=-1 ./my_rust_binary` should print that a memcpy to and later from the device is happening.

A follow-up PR will generate the actual device-side kernel which will then do computations on the GPU.
A third PR will implement manual host2device and device2host functionality, but the goal is to minimize cases where a user has to overwrite our default handling due to performance issues.

I'm trying to get a full MVP out first, so this just recognizes GPU functions based on magic names. The final frontend will obviously move this over to use proper macros, like I'm already doing for the autodiff work.
This work will also be compatible with std::autodiff, so one can differentiate GPU kernels.

Tracking:
- https://github.com/rust-lang/rust/issues/131513
2025-07-22 00:54:24 +08:00
Manuel Drehwald
5958ebe829 add various wrappers for gpu code generation 2025-07-18 16:24:12 -07:00
Nikita Popov
63e1074c97 Update AMDGPU data layout 2025-07-18 09:35:11 +02:00
Oli Scherer
7f95f04267 Eliminate all direct uses of LLVMMDStringInContext2 2025-07-14 08:27:08 +00:00
Oli Scherer
56d22cd29f Use context methods instead of directly calling FFI 2025-07-14 08:27:08 +00:00
Oli Scherer
b9baf63f99 Merge typeid_metadata and create_metadata 2025-07-14 08:27:08 +00:00
Edoardo Marangoni
93f1201c06 compiler: Parse p- specs in datalayout string, allow definition of custom default data address space 2025-07-07 09:04:53 +02:00
Jacob Pratt
0eb8a66130 Rollup merge of #142588 - ZuseZ4:generic-ctx-imprv, r=oli-obk
Generic ctx imprv

Cleanup work for my gpu pr

r? `@oli-obk`
2025-06-17 23:19:36 +02:00
Manuel Drehwald
6359123d25 add and use generic get_const_int function 2025-06-16 14:23:06 -07:00
sayantn
9415f3d8a6 Use LLVMIntrinsicGetDeclaration to completely remove the hardcoded intrinsics list 2025-06-15 22:15:16 +05:30
sayantn
d56fcd968d Simplify implementation of Rust intrinsics by using type parameters in the cache 2025-06-12 00:32:42 +05:30
bors
15825b7161 Auto merge of #139385 - joboet:threadlocal_address, r=nikic
rustc_codegen_llvm: use `threadlocal.address` intrinsic to access TLS

Fixes #136044
r? `@nikic`
2025-05-30 15:39:56 +00:00
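For context, a minimal nightly example of the TLS accesses this change affects; with this PR each access is lowered through the `llvm.threadlocal.address` intrinsic rather than taking the address of the TLS global directly:

```rust
#![feature(thread_local)]
use std::cell::Cell;

#[thread_local]
static COUNTER: Cell<u32> = Cell::new(0);

fn bump() -> u32 {
    // Every read/write of COUNTER is a TLS access in codegen.
    COUNTER.set(COUNTER.get() + 1);
    COUNTER.get()
}

fn main() {
    println!("{}", bump());
}
```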
joboet
e4d9b06cc8 rustc_codegen_llvm: use threadlocal.address intrinsic to access TLS 2025-05-29 16:07:43 +02:00
bjorn3
0fd257d66c Remove a couple of uses of interior mutability around statics 2025-05-28 20:55:00 +00:00
bjorn3
c593c01703 Remove codegen_unit from MiscCodegenMethods 2025-05-28 20:55:00 +00:00
Urgau
7f0ae5e3ad Use the fallback body for {minimum,maximum}f128 on LLVM as well. 2025-05-10 17:34:54 +02:00
Urgau
e7247df590 Use intrinsics for {f16,f32,f64,f128}::{minimum,maximum} operations 2025-05-09 17:11:23 +02:00
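For context, `minimum`/`maximum` differ from `min`/`max` in that NaN propagates and `-0.0` orders below `+0.0`; a small sketch (the methods may still be gated behind `float_minimum_maximum` depending on the toolchain):

```rust
#![feature(float_minimum_maximum)] // remove if already stable on your toolchain

fn main() {
    assert_eq!(1.0_f32.minimum(2.0), 1.0);
    assert!(f32::NAN.minimum(1.0).is_nan()); // NaN propagates, unlike `min`
    assert!((-0.0_f32).maximum(0.0).is_sign_positive()); // -0.0 orders below +0.0
    println!("ok");
}
```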
bit-aloo
7018392337 remove noinline attribute and add alwaysinline after AD pass 2025-04-28 21:10:32 +05:30
Josh Stone
12167d7064 Update the minimum external LLVM to 19 2025-04-05 11:44:38 -07:00
Matthias Krüger
543160dd62 Rollup merge of #138368 - rcvalle:rust-kcfi-arity, r=davidtwco
KCFI: Add KCFI arity indicator support

Adds KCFI arity indicator support to the Rust compiler (see https://github.com/rust-lang/rust/issues/138311, https://github.com/llvm/llvm-project/pull/121070, and https://lore.kernel.org/lkml/CANiq72=3ghFxy8E=AU9p+0imFxKr5iU3sd0hVUXed5BA+KjdNQ@mail.gmail.com/).
2025-04-05 10:18:03 +02:00
Ramon de C Valle
a98546b961 KCFI: Add KCFI arity indicator support
Adds KCFI arity indicator support to the Rust compiler (see rust-lang/rust#138311,
https://github.com/llvm/llvm-project/pull/121070, and
https://lore.kernel.org/lkml/CANiq72=3ghFxy8E=AU9p+0imFxKr5iU3sd0hVUXed5BA+KjdNQ@mail.gmail.com/).
2025-04-05 04:05:04 +00:00
Stuart Cook
c6bf3a01ef Rollup merge of #137880 - EnzymeAD:autodiff-batching, r=oli-obk
Autodiff batching

Enzyme supports batching, which is especially known from the ML side when training neural networks.
There we would normally have a training loop, where in each iteration we would pass in some data (e.g. an image) and a target vector. Based on how close our prediction is, we compute our loss, and then use backpropagation to compute the gradients and update our weights.
That's quite inefficient, so what you normally do is pass in a batch of 8/16/... images and targets and compute the gradients for them all at once, allowing better optimizations.

Enzyme supports batching in two ways. The first one (which I implemented here) just accepts a batch size,
and then each Dual/Duplicated argument has not one but N shadow arguments. So instead of
```rs
for i in 0..100 {
   df(x[i], y[i], 1234);
}
```
You can now do
```rs
for i in (0..100).step_by(4) {
   df(x[i+0], x[i+1], x[i+2], x[i+3], y[i+0], y[i+1], y[i+2], y[i+3], 1234);
}
```
which will give the same results, but allows better compiler optimizations. See the testcase for details.

There is a second variant, where we can mark certain arguments so that instead of having to pass N shadow arguments, Enzyme assumes that the argument is N times longer. That is, instead of accepting 4 slices with 12 floats each, we would accept one slice with 48 floats. I'll implement this over the next few days.

I will also add more tests for both modes.

For any one preferring some more interactive explanation, here's a video of Tim's llvm dev talk, where he presents his work. https://www.youtube.com/watch?v=edvaLAL5RqU
I'll also add some other docs to the dev guide and user docs in another PR.

r? ghost

Tracking:

- https://github.com/rust-lang/rust/issues/124509
- https://github.com/rust-lang/rust/issues/135283
2025-04-05 13:18:13 +11:00
Manuel Drehwald
b7c63a973f add autodiff batching backend 2025-04-04 14:24:23 -04:00
bors
1df5affaca Auto merge of #133984 - DaniPopes:scmp-ucmp, r=scottmcm
Lower BinOp::Cmp to llvm.{s,u}cmp.* intrinsics

Lowers `mir::BinOp::Cmp` (`three_way_compare` intrinsic) to the corresponding LLVM `llvm.{s,u}cmp.i8.*` intrinsics.

These are the intrinsics mentioned in https://github.com/rust-lang/rust/pull/118310, which are now available in LLVM 19.

I couldn't find any follow-up PRs/discussions about this, please let me know if I missed something.

r? `@scottmcm`
2025-03-24 22:53:12 +00:00
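For context, integer `Ord::cmp` is what goes through `BinOp::Cmp` / `three_way_compare` in MIR and now lowers to the new intrinsics; a trivial example:

```rust
use std::cmp::Ordering;

// Signed comparisons lower to `llvm.scmp.*`, unsigned ones to `llvm.ucmp.*`.
fn signed(a: i32, b: i32) -> Ordering {
    a.cmp(&b)
}

fn unsigned(a: u32, b: u32) -> Ordering {
    a.cmp(&b)
}

fn main() {
    assert_eq!(signed(-1, 1), Ordering::Less);
    assert_eq!(unsigned(2, 1), Ordering::Greater);
}
```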
bjorn3
b754ef727c Remove implicit #[no_mangle] for #[rustc_std_internal_symbol] 2025-03-17 14:08:09 +00:00
Matthias Krüger
63c548d82c Rollup merge of #137549 - oli-obk:llvm-ffi, r=davidtwco
Clean up various LLVM FFI things in codegen_llvm

cc ```@ZuseZ4``` I touched some autodiff parts

The major change of this PR is [bfd88ce](bfd88cead0), which makes `CodegenCx` generic just like `GenericBuilder`.

The other commits mostly took advantage of the new feature of making extern functions safe, but also just used some wrappers that were already there and shrunk unsafe blocks.

best reviewed commit-by-commit
2025-03-07 19:15:34 +01:00
DaniPopes
58c10c66c1 Lower BinOp::Cmp to llvm.{s,u}cmp.* intrinsics
Lowers `mir::BinOp::Cmp` (`three_way_compare` intrinsic) to the corresponding
LLVM `llvm.{s,u}cmp.i8.*` intrinsics, added in LLVM 19.
2025-03-06 22:29:05 +08:00
Oli Scherer
553828c6f4 Mark more LLVM FFI as safe 2025-02-24 15:11:29 +00:00