mono collector: Reduce # of locking while walking the graph
While profiling Zed's dev build I've noticed that while most of the time `upstream_monomorphizations` takes a lot of time in monomorpization_collector, in some cases (e.g. build of `editor` itself) the rest of monomorphization_collector_graph_walk dominates it. Most of the time is spent in collect_items_rec.
This PR aims to reduce the number of locks taking place; instead of locking output MonoItems once per children of current node, we do so once per *current node*. We also get to reuse locks for mentioned and used items. While this commit does not reduce Wall time of Zed's build, it does shave off CPU time (measured with `cargo build -j1`) from 48s to 47s. I've also tested it with parallel frontend against Zed and ripgrep and found no regressions.
Rollup of 9 pull requests
Successful merges:
- #140485 (Optimize the codegen for `Span::from_expansion`)
- #140509 (transmutability: merge contiguous runs with a common destination)
- #140519 (Use select in projection lookup in `report_projection_error`)
- #140521 (interpret: better error message for out-of-bounds pointer arithmetic and accesses)
- #140536 (Rename `*Guard::try_map` to `filter_map`.)
- #140550 (Stabilize `select_unpredictable`)
- #140563 (extend the list of registered dylibs on `test::prepare_cargo_test`)
- #140572 (Add useful comments on `ExprKind::If` variants.)
- #140574 (Add regression test for 133065)
r? `@ghost`
`@rustbot` modify labels: rollup
Use select in projection lookup in `report_projection_error`
Using `for_each_relevant_impl` doesn't actually select the correct impl; we can use `select` here to actually get the correct impl with certainty. Follow-up to https://github.com/rust-lang/rust/pull/140278.
r? oli-obk
Optimize the codegen for `Span::from_expansion`
See https://godbolt.org/z/bq65Y6bc4 for the difference. the new version is less than half the number of instructions.
Also tried fully writing the function by hand:
```rust
sp.ctxt_or_parent_or_marker != 0
&& (
sp.len_with_tag_or_marker == BASE_LEN_INTERNED_MARKER
|| sp.len_with_tag_or_marker & PARENT_TAG == 0
)
```
But that was no better than this PR's current use of `match_span_kind`.
Remove `avx512dq` and `avx512vl` implication for `avx512fp16`
According to Intel, `avx512fp16` requires only `avx512bw`, but LLVM also enables `avx512vl` and `avx512dq` when `avx512fp16` is active. This is relic code, and will be fixed in LLVM soon. We should remove this from Rust too asap, especially before the stabilization of AVX512
Related:
- llvm/llvm-project#136209
- #138940
- rust-lang/stdarch#1781
- #111137
``@rustbot`` label O-x86_64 O-x86_32 A-SIMD A-target-feature T-compiler -T-libs
r? ``@Amanieu``
**Update: the LLVM fix has been merged**
cc ``@rust-lang/wg-llvm`` will it be possible to update the rustc llvm version to something after llvm/llvm-project#137450
rustc_target: RISC-V `Zfinx` is incompatible with `{ILP32,LP64}[FD]` ABIs
Because RISC-V Calling Conventions note that:
> This means code targeting the `Zfinx` extension always uses the ILP32, ILP32E or LP64 integer calling-convention only ABIs as there is no dedicated hardware floating-point register file.
`{ILP32,LP64}[FD]` ABIs with hardware floating-point calling conventions are incompatible with the `Zfinx` extension.
This commit adds `"zfinx"` to the incompatible feature list to those ABIs and tests whether trying to add `"zdinx"` (that is analogous to `"zfinx"` but in double-precision) on a LP64D ABI configuration results in an error (it also tests extension implication; `Zdinx` requires `Zfinx` extension).
Links: RISC-V psABI specification version 1.0
<https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/v1.0/riscv-cc.adoc#named-abis>
<https://github.com/riscv-non-isa/riscv-elf-psabi-doc/releases/tag/v1.0>
handle paren in macro expand for let-init-else expr
Fixes#131655
This PR modifies the codegen logic of the macro expansion within `let-init-else` expression:
- Before: The expression `let xxx = (mac! {}) else {}` expands to `let xxx = (expanded_ast) else {}`.
- After: The same expression expands to `let xxx = expanded_ast else {}`.
An alternative solution to this issue could involve handling the source code directly when encountering unused parentheses in `let-init-else` expressions. However, this approach might be more cumbersome due to the absence of the necessary data structure.
r? `@petrochenkov`
This commit does the following:
- Replaces use of rustc_type_ir by rustc_middle in rustc_infer.
- The DelayedMap type is exposed by rustc_middle so everything can be
accessed through rustc_middle in a coherent manner.
- API-layer traits, like InferCtxtLike, Interner or inherent::* must be
accessed via rustc_type_ir, not rustc_middle::ty. For this reason
these are not reexported by rustc_middle::ty.
- Replaces use of ty::Interner by rustc_type_ir::Interner in
rustc_trait_selection
allow `#[rustc_std_internal_symbol]` in combination with `#[naked]`
The need for this came up in https://github.com/rust-lang/compiler-builtins/pull/897, but in general this seems useful and valid to allow.
Based on a quick scan, I don't think changes to the generated assembly are needed.
cc ``@bjorn3``
Clean up "const" situation in format_args!().
This cleans up the "const" situation in the format_args!() expansion/lowering.
Rather than marking the Argument::new_display etc. functions as non-const, this marks the Arguments::new_v1 functions as non-const.
Example expansion/lowering of format_args!() in const:
```rust
// Error: cannot call non-const formatting macro in constant functions
const {
fmt::Arguments::new_v1( // Now the error is produced here.
&["Hello, ", "!\n"],
&[
fmt::Argument::new_display(&world) // The error used to be produced here.
],
)
}
```
In LLVM 21 PR https://github.com/llvm/llvm-project/pull/130940
`TargetRegistry::createTargetMachine` was changed to take a `const
Triple&` and has deprecated the old `StringRef` method.
@rustbot label llvm-main
Decouple SCC annotations from SCCs
This rewires SCC annotations to have them be a separate, visitor-type data structure. It was broken out of #130227, which needed them to be able to remove unused annotations after computation without recomputing the SCCs themselves.
As a drive-by it also removes some redundant code from the hot loop in SCC construction for a performance improvement.
r? lcnr
shared-generics: Do not share instantiations that contain local-only types
In Zed shared-generics loading takes up a significant chunk of time in incremental build, as rustc deserializes rmeta of all dependencies of a crate. I've recently realized that shared-generics includes all instantiations of some_generic_function in the following snippet:
```rs
pub fn some_generic_function(_: impl Fn()) {}
pub fn non_generic_function() {
some_generic_function(|| {});
some_generic_function(|| {});
some_generic_function(|| {});
some_generic_function(|| {});
some_generic_function(|| {});
some_generic_function(|| {});
some_generic_function(|| {});
}
```
even though none of these instantiations can actually be created from outside of `non_generic_function`. This is a dummy example, but we do rely on invoking callbacks with FnOnce a lot in our codebase.
This PR makes shared-generics account for visibilities of generic arguments; an item is only considered for exporting if it is reachable from the outside or if all of it's arguments are visible outside of the local crate.
This PR reduces incremental build time for Zed (touch editor.rs scenario) from 12.4s to 10.4s. I'd love to see a perf run if possible; per my checks this PR does not incur new instantiations in downstream crates, so if there'd be perf regressions, I'd expect them to come from newly-introduced visibility checks.