rust-lang/rust - rust

Author	SHA1	Message	Date
Zalathar	ffeed2b94e	Extract helper method `module_add_named_metadata_node`	2025-10-02 18:04:24 +10:00
Zalathar	ecb831dcf4	Extract helper method `set_metadata_node`	2025-10-02 18:04:23 +10:00
Zalathar	cc6329a9bc	Replace `MetadataType` with the `MetadataKindId` constants	2025-09-30 20:10:30 +10:00
Matthias Krüger	35f443d318	Rollup merge of #146778 - nikic:allocator-shim-attributes, r=jackh726 Use standard attribute logic for allocator shim Use llfn_attrs_from_instance() to generate the attributes for the allocator shim. This ensures that we generate all the usual attributes (and don't get to find out one-by-one that a certain attribute is important for a certain target). Additionally this will enable emitting the allocator-specific attributes (not included here). This change is quite awkward because the allocator shim uses SimpleCx, while llfn_attrs_from_instance uses CodegenCx. I've switched it to use SimpleCx plus tcx/sess arguments where necessary. If there's a simpler way to do this, I'd love to know about it...	2025-09-26 18:11:11 +02:00
Stuart Cook	fab06469ee	Rollup merge of #146667 - calebzulawski:simd-mono-lane-limit, r=lcnr,RalfJung Add an attribute to check the number of lanes in a SIMD vector after monomorphization Allows std::simd to drop the `LaneCount<N>: SupportedLaneCount` trait and maintain good error messages. Also, extends rust-lang/rust#145967 by including spans in layout errors for all ADTs. r? ``@RalfJung`` cc ``@workingjubilee`` ``@programmerjake``	2025-09-25 20:31:53 +10:00
Nikita Popov	d226e7aa93	Use standard attribute logic for allocator shim Use llfn_attrs_from_instance() to generate the attributes for the allocator shim. This ensures that we generate all the usual attributes (and don't get to find out one-by-one that a certain attribute is important for a certain target). Additionally this will enable emitting the allocator-specific attributes (not included here). This change is quite awkward because the allocator shim uses SimpleCx, while llfn_attrs_from_instance uses CodegenCx. I've switched it to use SimpleCx plus tcx/sess arguments where necessary. If there's a simpler way to do this, I'd love to know about it...	2025-09-25 10:04:40 +02:00
Caleb Zulawski	f5c6c9542e	Add an attribute to check the number of lanes in a SIMD vector after monomorphization Unify zero-length and oversized SIMD errors	2025-09-23 20:47:34 -04:00
Reuben Cruise	6f813e887a	Adds AArch64 GCS support - Adds option to rustc config to enable GCS - Passes `guarded-control-stack` flag to llvm if enabled	2025-09-17 14:16:31 +01:00
Stuart Cook	6ad98750e0	Rollup merge of #145660 - jbatez:darwin_objc, r=jdonszelmann,madsmtm,tmandry initial implementation of the darwin_objc unstable feature Tracking issue: https://github.com/rust-lang/rust/issues/145496 This feature makes it possible to reference Objective-C classes and selectors using the same ABI used by native Objective-C on Apple/Darwin platforms. Without it, Rust code interacting with Objective-C must resort to loading classes and selectors using costly string-based lookups at runtime. With it, these references can be loaded efficiently at dynamic load time. r? ```@tmandry``` try-job: `apple` try-job: `x86_64-gnu-nopt`	2025-09-17 14:56:44 +10:00
Josh Stone	580b4891aa	Update the minimum external LLVM to 20	2025-09-16 11:49:20 -07:00
Jo Bates	1ebf69d1b1	initial implementation of the darwin_objc unstable feature	2025-09-13 16:06:22 -07:00
WANG Rui	923b892b67	compiler: Apply target features to the entry function	2025-09-04 21:40:25 +08:00
Matthew Maurer	5d9f8fcd3e	llvm: nvptx: Layout update to match LLVM LLVM upstream switched layouts to support 256-bit vector load/store.	2025-09-02 18:47:59 +00:00
Zalathar	b4e97e5d86	Rename `llvm::Bool` aliases to standard const case This avoids the need for `#![allow(non_upper_case_globals)]`.	2025-08-24 23:09:54 +10:00
许杰友 Jieyou Xu (Joe)	df01a87de2	Rollup merge of #140740 - ojeda:indirect-branch-cs-prefix, r=davidtwco Add `-Zindirect-branch-cs-prefix` Cc: ``@azhogin`` ``@Darksonn`` This goes on top of https://github.com/rust-lang/rust/pull/135927, i.e. please skip the first commit here. Please feel free to inherit it there. In fact, I am not sure if there is any use case for the flag without `-Zretpoline*`. GCC and Clang allow it, though. There is a `FIXME` for two `ignore`s in the test that I took from another test I did in the past -- they may be needed or not here since I didn't run the full CI. Either way, it is not critical. Tracking issue: https://github.com/rust-lang/rust/issues/116852. MCP: https://github.com/rust-lang/compiler-team/issues/868.	2025-08-19 19:42:01 +08:00
Stuart Cook	1e454c64b2	Rollup merge of #145309 - winstonallo:issue-145271-fix, r=tgross35 Fix `-Zregparm` for LLVM builtins This fixes the issue where `-Zregparm=N` was not working correctly when calling LLVM intrinsics By default on `x86-32`, arguments are passed on the stack. The `-Zregparm=N` flag allows the first `N` arguments to be passed in registers instead. When calling intrinsics like `memset`, LLVM still passes parameters on the stack, which prevents optimizations like tail calls. As proposed by ````@tgross35,```` I fixed this by setting the `NumRegisterParameters` LLVM module flag to `N` when the `-Zregparm=N` is set. ```rust // compiler/rust_codegen_llvm/src/context.rs#375-382 if let Some(regparm_count) = sess.opts.unstable_opts.regparm { llvm::add_module_flag_u32( llmod, llvm::ModuleFlagMergeBehavior::Error, "NumRegisterParameters", regparm_count, ); } ``` [Here](https://rust.godbolt.org/z/YMezreo48) is a before/after compiler explorer. Here is the final result for the code snippet in the original issue: ```asm entrypoint: push esi mov esi, eax mov eax, ecx mov ecx, esi pop esi jmp memset ; Tail call parameters in registers ``` Fixes: https://github.com/rust-lang/rust/issues/145271	2025-08-18 15:31:11 +10:00
Alice Ryhl	1cd7080c3a	Add -Zindirect-branch-cs-prefix (from draft PR)	2025-08-17 16:50:23 +02:00
Marcelo Domínguez	e1d79b9aad	Remove lto inline logic	2025-08-14 16:30:16 +00:00
Marcelo Domínguez	250d77e5d7	Complete functionality and general cleanup	2025-08-14 16:30:15 +00:00
Marcelo Domínguez	5c631041aa	Basic implementation of `autodiff` intrinsic	2025-08-14 16:29:58 +00:00
winstonallo	04ff1444bb	Set NumRegisterParameters LLVM module flag to `N` when `-Zregparm=N` is set * Enforce the `-Zregparm=N` flag by setting the NumRegisterParameters LLVM module flag * Add assembly tests verifying that the parameters are passed in registers for regparm values 1, 2, and 3, for both LLVM intrinsics and non-builtin functions * Add c_void type to minicore	2025-08-13 17:37:30 +02:00
Tom Vijlbrief	2563e4a7ff	[AVR] Changed data_layout	2025-08-12 08:33:27 +02:00
许杰友 Jieyou Xu (Joe)	5e3eb25125	Rollup merge of #142097 - ZuseZ4:offload-host1, r=oli-obk gpu offload host code generation r? ghost This will generate most of the host side code to use llvm's offload feature. The first PR will only handle automatic mem-transfers to and from the device. So if a user calls a kernel, we will copy inputs back and forth, but we won't do the actual kernel launch. Before merging, we will use LLVM's Info infrastructure to verify that the memcopies match what openmp offload generates in C++. `LIBOMPTARGET_INFO=-1 ./my_rust_binary` should print that a memcpy to and later from the device is happening. A follow-up PR will generate the actual device-side kernel which will then do computations on the GPU. A third PR will implement manual host2device and device2host functionality, but the goal is to minimize cases where a user has to overwrite our default handling due to performance issues. I'm trying to get a full MVP out first, so this just recognizes GPU functions based on magic names. The final frontend will obviously move this over to use proper macros, like I'm already doing it for the autodiff work. This work will also be compatible with std::autodiff, so one can differentiate GPU kernels. Tracking: - https://github.com/rust-lang/rust/issues/131513	2025-07-22 00:54:24 +08:00
Manuel Drehwald	5958ebe829	add various wrappers for gpu code generation	2025-07-18 16:24:12 -07:00
Nikita Popov	63e1074c97	Update AMDGPU data layout	2025-07-18 09:35:11 +02:00
Oli Scherer	7f95f04267	Eliminate all direct uses of LLVMMDStringInContext2	2025-07-14 08:27:08 +00:00
Oli Scherer	56d22cd29f	Use context methods instead of directly calling FFI	2025-07-14 08:27:08 +00:00
Oli Scherer	b9baf63f99	Merge `typeid_metadata` and `create_metadata`	2025-07-14 08:27:08 +00:00
Edoardo Marangoni	93f1201c06	compiler: Parse `p-` specs in datalayout string, allow definition of custom default data address space	2025-07-07 09:04:53 +02:00
Jacob Pratt	0eb8a66130	Rollup merge of #142588 - ZuseZ4:generic-ctx-imprv, r=oli-obk Generic ctx imprv Cleanup work for my gpu pr r? `@oli-obk`	2025-06-17 23:19:36 +02:00
Manuel Drehwald	6359123d25	add and use generic get_const_int function	2025-06-16 14:23:06 -07:00
sayantn	9415f3d8a6	Use `LLVMIntrinsicGetDeclaration` to completely remove the hardcoded intrinsics list	2025-06-15 22:15:16 +05:30
sayantn	d56fcd968d	Simplify implementation of Rust intrinsics by using type parameters in the cache	2025-06-12 00:32:42 +05:30
bors	15825b7161	Auto merge of #139385 - joboet:threadlocal_address, r=nikic rustc_codegen_llvm: use `threadlocal.address` intrinsic to access TLS Fixes #136044 r? `@nikic`	2025-05-30 15:39:56 +00:00
joboet	e4d9b06cc8	rustc_codegen_llvm: use `threadlocal.address` intrinsic to access TLS	2025-05-29 16:07:43 +02:00
bjorn3	0fd257d66c	Remove a couple of uses of interior mutability around statics	2025-05-28 20:55:00 +00:00
bjorn3	c593c01703	Remove codegen_unit from MiscCodegenMethods	2025-05-28 20:55:00 +00:00
Urgau	7f0ae5e3ad	Use the fallback body for `{minimum,maximum}f128` on LLVM as well.	2025-05-10 17:34:54 +02:00
Urgau	e7247df590	Use intrinsics for `{f16,f32,f64,f128}::{minimum,maximum}` operations	2025-05-09 17:11:23 +02:00
bit-aloo	7018392337	remove noinline attribute and add alwaysinline after AD pass	2025-04-28 21:10:32 +05:30
Josh Stone	12167d7064	Update the minimum external LLVM to 19	2025-04-05 11:44:38 -07:00
Matthias Krüger	543160dd62	Rollup merge of #138368 - rcvalle:rust-kcfi-arity, r=davidtwco KCFI: Add KCFI arity indicator support Adds KCFI arity indicator support to the Rust compiler (see https://github.com/rust-lang/rust/issues/138311, https://github.com/llvm/llvm-project/pull/121070, and https://lore.kernel.org/lkml/CANiq72=3ghFxy8E=AU9p+0imFxKr5iU3sd0hVUXed5BA+KjdNQ@mail.gmail.com/).	2025-04-05 10:18:03 +02:00
Ramon de C Valle	a98546b961	KCFI: Add KCFI arity indicator support Adds KCFI arity indicator support to the Rust compiler (see rust-lang/rust#138311, https://github.com/llvm/llvm-project/pull/121070, and https://lore.kernel.org/lkml/CANiq72=3ghFxy8E=AU9p+0imFxKr5iU3sd0hVUXed5BA+KjdNQ@mail.gmail.com/).	2025-04-05 04:05:04 +00:00
Stuart Cook	c6bf3a01ef	Rollup merge of #137880 - EnzymeAD:autodiff-batching, r=oli-obk Autodiff batching Enzyme supports batching, which is especially known from the ML side when training neural networks. There we would normally have a training loop, where in each iteration we would pass in some data (e.g. an image), and a target vector. Based on how close we are with our prediction we compute our loss, and then use backpropagation to compute the gradients and update our weights. That's quite inefficient, so what you normally do is passing in a batch of 8/16/.. images and targets, and compute the gradients for those all at once, allowing better optimizations. Enzyme supports batching in two ways, the first one (which I implemented here) just accepts a Batch size, and then each Dual/Duplicated argument has not one, but N shadow arguments. So instead of ```rs for i in 0..100 { df(x[i], y[i], 1234); } ``` You can now do ```rs for i in 0..100.step_by(4) { df(x[i+0],x[i+1],x[i+2],x[i+3], y[i+0], y[i+1], y[i+2], y[i+3], 1234); } ``` which will give the same results, but allows better compiler optimizations. See the testcase for details. There is a second variant, where we can mark certain arguments and instead of having to pass in N shadow arguments, Enzyme assumes that the argument is N times longer. I.e. instead of accepting 4 slices with 12 floats each, we would accept one slice with 48 floats. I'll implement this over the next days. I will also add more tests for both modes. For any one preferring some more interactive explanation, here's a video of Tim's llvm dev talk, where he presents his work. https://www.youtube.com/watch?v=edvaLAL5RqU I'll also add some other docs to the dev guide and user docs in another PR. r? ghost Tracking: - https://github.com/rust-lang/rust/issues/124509 - https://github.com/rust-lang/rust/issues/135283	2025-04-05 13:18:13 +11:00
Manuel Drehwald	b7c63a973f	add autodiff batching backend	2025-04-04 14:24:23 -04:00
bors	1df5affaca	Auto merge of #133984 - DaniPopes:scmp-ucmp, r=scottmcm Lower BinOp::Cmp to llvm.{s,u}cmp.* intrinsics Lowers `mir::BinOp::Cmp` (`three_way_compare` intrinsic) to the corresponding LLVM `llvm.{s,u}cmp.i8.*` intrinsics. These are the intrinsics mentioned in https://github.com/rust-lang/rust/pull/118310, which are now available in LLVM 19. I couldn't find any follow-up PRs/discussions about this, please let me know if I missed something. r? `@scottmcm`	2025-03-24 22:53:12 +00:00
bjorn3	b754ef727c	Remove implicit #[no_mangle] for #[rustc_std_internal_symbol]	2025-03-17 14:08:09 +00:00
Matthias Krüger	63c548d82c	Rollup merge of #137549 - oli-obk:llvm-ffi, r=davidtwco Clean up various LLVM FFI things in codegen_llvm cc ```@ZuseZ4``` I touched some autodiff parts The major change of this PR is [`bfd88ce`](`bfd88cead0`) which makes `CodegenCx` generic just like `GenericBuilder` The other commits mostly took advantage of the new feature of making extern functions safe, but also just used some wrappers that were already there and shrunk unsafe blocks. best reviewed commit-by-commit	2025-03-07 19:15:34 +01:00
DaniPopes	58c10c66c1	Lower BinOp::Cmp to llvm.{s,u}cmp.* intrinsics Lowers `mir::BinOp::Cmp` (`three_way_compare` intrinsic) to the corresponding LLVM `llvm.{s,u}cmp.i8.*` intrinsics, added in LLVM 19.	2025-03-06 22:29:05 +08:00
Oli Scherer	553828c6f4	Mark more LLVM FFI as safe	2025-02-24 15:11:29 +00:00


275 Commits