Allow custom default address spaces and parse `p-` specifications in the datalayout string
Some targets, such as CHERI, use as default an address space different from the "normal" default address space `0` (in the case of CHERI, [200 is used](https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-877.pdf)). Currently, `rustc` does not allow to specify custom address spaces and does not take into consideration [`p-` specifications in the datalayout string](https://llvm.org/docs/LangRef.html#langref-datalayout).
This patch tries to mitigate these problems by allowing targets to define a custom default address space (while keeping the default value to address space `0`) and adding the code to parse the `p-` specifications in `rustc_abi`. The main changes are that `TargetDataLayout` now uses functions to refer to pointer-related informations, instead of having specific fields for the size and alignment of pointers in the default address space; furthermore, the two `pointer_size` and `pointer_align` fields in `TargetDataLayout` are replaced with an `FxHashMap` that holds info for all the possible address spaces, as parsed by the `p-` specifications.
The potential performance drawbacks of not having ad-hoc fields for the default address space will be tested in this PR's CI run.
r? workingjubilee
Fix some comments and related types and locals where it is obvious, e.g.
- bare_fn -> fn_ptr
- LifetimeBinderKind::BareFnType -> LifetimeBinderKind::FnPtrType
Co-authored-by: León Orell Valerian Liehr <me@fmease.dev>
It's like `Symbol` but for byte strings. The interner is now used for
both `Symbol` and `ByteSymbol`. E.g. if you intern `"dog"` and `b"dog"`
you'll get a `Symbol` and a `ByteSymbol` with the same index and the
characters will only be stored once.
The motivation for this is to eliminate the `Arc`s in `ast::LitKind`, to
make `ast::LitKind` impl `Copy`, and to avoid the need to arena-allocate
`ast::LitKind` in HIR. The latter change reduces peak memory by a
non-trivial amount on literal-heavy benchmarks such as `deep-vector` and
`tuple-stress`.
`Encoder`, `Decoder`, `SpanEncoder`, and `SpanDecoder` all get some
changes so that they can handle normal strings and byte strings.
This change does slow down compilation of programs that use
`include_bytes!` on large files, because the contents of those files are
now interned (hashed). This makes `include_bytes!` more similar to
`include_str!`, though `include_bytes!` contents still aren't escaped,
and hashing is still much cheaper than escaping.
Taint body on invalid call ABI
Fixes https://github.com/rust-lang/rust/issues/142969
I'm not certain if there are any other paths that should be tainted, but they would operate similarly. Perhaps pointer coercion.
Introduces `extern "rust-invalid"` for testing purposes.
r? ```@workingjubilee``` or ```@oli-obk``` (or anyone)
We modify rustc_ast_lowering to prevent all unsupported ABIs
from leaking through the HIR without being checked for target support.
Previously ad-hoc checking on various HIR items required making sure
we check every HIR item which could contain an `extern "{abi}"` string.
This is a losing proposition compared to gating the lowering itself.
As a consequence, unsupported ABI strings will now hard-error instead of
triggering the FCW `unsupported_fn_ptr_calling_conventions`.
This FCW was upgraded to warn in dependencies in Rust 1.87 which was
released on 2025 May 17, and it is now 2025 June, so it has become
active within a stable Rust version.
As we already had errored on these ABIs in most other positions, and
have warned for fn ptrs, this breakage has had reasonable foreshadowing.
However, this does cause errors for usages of `extern "{abi}"` that were
theoretically writeable within source but could not actually be applied
in any useful way by Rust programmers without either warning or error.
For instance, trait declarations without impls were never checked.
These are the exact kinds of leakages that this new approach prevents.
A deprecation cycle is not useful for these marginal cases as upon impl,
even default impls within traits, different HIR objects would be used.
Details of our HIR analysis meant that those objects did get checked.
We choose to error twice if an ABI is also barred by a feature gate
on the presumption that usage of a target-incorrect ABI is intentional.
Co-authored-by: Ralf Jung <post@ralfj.de>
assert more in release in `rustc_ast_lowering`
My understanding of the compiler's architecture is that in the `ast_lowering` crate, we are constructing the HIR as a one-time thing per crate. This is after tokenizing, parsing, resolution, expansion, possible reparsing, reresolution, reexpansion, and so on. In other words, there are many reasons that perf-focused PRs spend a lot of time touching `rustc_parse`, `rustc_expand`, `rustc_ast`, and then `rustc_hir` and "onwards", but `ast_lowering` is a little bit of an odd duck.
In this crate, we have a number of debug assertions. Some are clearly expensive checks that seem like they are prohibitive to run in actual optimized compiler builds, but then there are a number that are simple asserts on integer equalities, `is_empty`, or the like. I believe we should do some of them even in release builds, because the correctness gain is worth the performance cost: almost zero.
add `extern "custom"` functions
tracking issue: rust-lang/rust#140829
previous discussion: https://github.com/rust-lang/rust/issues/140566
In short, an `extern "custom"` function is a function with a custom ABI, that rust does not know about. Therefore, such functions can only be defined with `#[unsafe(naked)]` and `naked_asm!`, or via an `extern "C" { /* ... */ }` block. These functions cannot be called using normal rust syntax: calling them can only be done from inline assembly.
The motivation is low-level scenarios where a custom calling convention is used. Currently, we often pick `extern "C"`, but that is a lie because the function does not actually respect the C calling convention.
At the moment `"custom"` seems to be the name with the most support. That name is not final, but we need to pick something to actually implement this.
r? `@traviscross`
cc `@tgross35`
try-job: x86_64-apple-2
Replace some `Option<Span>` with `Span` and use DUMMY_SP instead of None
Turns out many locations actually have a span available that we could use, so I used it
Add a new `mismatched-lifetime-syntaxes` lint
The lang-team [discussed this](https://hackmd.io/nf4ZUYd7Rp6rq-1svJZSaQ) and I attempted to [summarize](https://github.com/rust-lang/rust/pull/120808#issuecomment-2701863833) their decision. The summary-of-the-summary is:
- Using two different kinds of syntax for elided lifetimes is confusing. In rare cases, it may even [lead to unsound code](https://github.com/rust-lang/rust/issues/48686)! Some examples:
```rust
// Lint will warn about these
fn(v: ContainsLifetime) -> ContainsLifetime<'_>;
fn(&'static u8) -> &u8;
```
- Matching up references with no lifetime syntax, references with anonymous lifetime syntax, and paths with anonymous lifetime syntax is an exception to the simplest possible rule:
```rust
// Lint will not warn about these
fn(&u8) -> &'_ u8;
fn(&'_ u8) -> &u8;
fn(&u8) -> ContainsLifetime<'_>;
```
- Having a lint for consistent syntax of elided lifetimes will make the [future goal](https://github.com/rust-lang/rust/issues/91639) of warning-by-default for paths participating in elision much simpler.
---
This new lint attempts to accomplish the goal of enforcing consistent syntax. In the process, it supersedes and replaces the existing `elided-named-lifetimes` lint, which means it starts out life as warn-by-default.
This adds an `iter!` macro that can be used to create movable
generators.
This also adds a yield_expr feature so the `yield` keyword can be used
within iter! macro bodies. This was needed because several unstable
features each need `yield` expressions, so this allows us to stabilize
them separately from any individual feature.
Co-authored-by: Oli Scherer <github35764891676564198441@oli-obk.de>
Co-authored-by: Jieyou Xu <jieyouxu@outlook.com>
Co-authored-by: Travis Cross <tc@traviscross.com>
Rollup of 9 pull requests
Successful merges:
- rust-lang/rust#141554 (Improve documentation for codegen options)
- rust-lang/rust#141817 (rustc_llvm: add Windows system libs only when cross-compiling from Wi…)
- rust-lang/rust#141843 (Add `visit_id` to ast `Visitor`)
- rust-lang/rust#141881 (Subtree update of `rust-analyzer`)
- rust-lang/rust#141898 ([rustdoc-json] Implement PartialOrd and Ord for rustdoc_types::Id)
- rust-lang/rust#141921 (Disable f64 minimum/maximum tests for arm 32)
- rust-lang/rust#141930 (Enable triagebot `[concern]` functionality)
- rust-lang/rust#141936 (Decouple "reporting in deps" from `FutureIncompatibilityReason`)
- rust-lang/rust#141949 (move `test-float-parse` tool into `src/tools` dir)
r? `@ghost`
`@rustbot` modify labels: rollup
Add `visit_id` to ast `Visitor`
This helps with efforts to deduplicate the `MutVisitor` and the `Visitor` code. All users of `Visitor`'s methods that have extra `NodeId` as parameters really just want to visit the id on its own.
Also includes some methods deduplicated and cleaned up as a result of this change.
r? oli-obk
`UsePath` contains a `SmallVec<[Res; 3]>`. This holds up to three `Res`
results, one per namespace (type, value, or macro). `lower_import_res`
takes a `PerNS<Option<Res<NodeId>>>` result and lowers it into the
`SmallVec`. This is pretty weird. The input `PerNS` makes it clear which
`Res` belongs to which namespace, but the `SmallVec` throws that
information away.
And code that operates on the `SmallVec` tends to use iteration (or even
just grabbing the first entry!) without knowing which namespace the
`Res` belongs to. Even weirder! Also, `SmallVec` is an overly flexible
type to use here, because it can contain any number of elements (even
though it's optimized for 3 in this case).
This commit changes `UsePath` so it also contains a
`PerNS<Option<Res<HirId>>>`. This type preserves more information and is
more self-documenting. The commit also changes a lot of the use sites to
access the result for a particular namespace. E.g. if you're looking up
a trait, it will be in the `Res` for the type namespace if it's present;
it's silly to look in the `Res` for the value namespace or macro
namespace. Overall I find the new code much easier to understand.
However, some use sites still iterate. These now use `present_items`
because that filters out the `None` results.
Also, `redundant_pub_crate.rs` gets a bigger change. A
`UseKind:ListStem` item gets no `Res` results, which means the old `all`
call in `is_not_macro_export` would succeed (because `all` succeeds on
an empty iterator) and the `ListStem` would be ignored. This is what we
want, but was more by luck than design. The new code detects `ListStem`
explicitly. The commit generalizes the name of that function
accordingly.
Finally, the commit also removes the `use_path` arena, because
`PerNS<Option<Res>>` impls `Copy` (unlike `SmallVec`) and it can be
allocated in the arena shared by all `Copy` types.
This helps with efforts to deduplicate the `MutVisitor` and the
`Visitor` code. All users of `Visitor`'s methods that have extra
`NodeId` as parameters really just want to visit the id on its
own.
Also includes some methods deduplicated and cleaned up as
a result of this change.
Specifically `TyAlias`, `Enum`, `Struct`, `Union`. So the fields match
the textual order in the source code.
The interesting part of the change is in
`compiler/rustc_hir/src/hir.rs`. The rest is extremely mechanical
refactoring.
So they match the order of the parts in the source code, e.g.:
```
struct Foo<T, U> { t: T, u: U }
<-><----> <------------>
/ | \
ident generics variant_data
```