Always inline check in `assert_unsafe_precondition` with cfg(debug_assertions)
The current complexities in `assert_unsafe_precondition` are delicately balancing several concerns, among them compile times for the cases where there are no debug assertions. This comes at a large runtime cost when the assertions are enabled, making the debug assertion compiler a lot slower, which is very annoying.
To avoid this, we always inline the check when building with debug assertions.
Numbers (compiling stage1 library after touching core):
- master: 80s
- just adding `#[inline(always)]` to the `cfg(bootstrap)` `debug_assertions` (equivalent to a bootstrap bump (uhh, i just realized that i was on a slightly outdated master so this bump might have happened already), (#121112)): 67s
- this: 54s
So this seems like a good solution. I think we can still get the same run-time perf improvements for other users too by massaging this code further (see my other PR about adding `#[rustc_no_mir_inline]` #121114) but this is a simpler step that solves the imminent problem of "holy shit my rustc is sooo slow".
Funny consequence: This now means compiling the standard library with dbeug assertions makes it faster (than without, when using debug assertions downstream)!
r? ```@saethlin``` (or anyone else if someone wants to review this)
fixes#121110, supposedly
Use intrinsics::debug_assertions in debug_assert_nounwind
This is the first item in https://github.com/rust-lang/rust/issues/120848.
Based on the benchmarking in this PR, it looks like, for the programs in our benchmark suite, enabling all these additional checks does not introduce significant compile-time overhead, with the single exception of `Alignment::new_unchecked`. Therefore, I've added `#[cfg(debug_assertions)]` to that one call site, so that it remains compiled out in the distributed standard library.
The trailing commas in the previous calls to `debug_assert_nounwind!` were causing the macro to expand to `panic_nouwnind_fmt`, which requires more work to set up its arguments, and that overhead alone is measured between this perf run and the next: https://github.com/rust-lang/rust/pull/120863#issuecomment-1937423502
This change was prompted by the stage1 compiler spending 4% of its time
when compiling the polymorphic-recursion MIR opt test in `unlikely`.
Intrinsic fallback bodies like `unlikely` should always be inlined, it's
very silly if they are not. To do this, we enable the fallback bodies to
be cross-crate inlineable. Not that this matters for our workloads since
the compiler never actually _uses_ the "fallback bodies", it just uses
whatever was cfg(bootstrap)ped, so I've also added `#[inline]` to those.
The current complexities in `assert_unsafe_precondition` are delicately
balancing several concerns, among them compile times for the cases where
there are no debug assertions. This comes at a large runtime cost when
the assertions are enabled, making the debug assertion compiler a lot
slower, which is very annoying.
To avoid this, we always inline the check when building with debug
assertions.
Numbers (compiling stage1 library after touching core):
- master: 80s
- just adding `#[inline(always)]` to the `cfg(bootstrap)`
`debug_assertions`: 67s
- this: 54s
So this seems like a good solution. I think we can still get
the same run-time perf improvements for other users too by
massaging this code further (see my other PR about adding
`#[rustc_no_mir_inline]`) but this is a simpler step that
solves the imminent problem of "holy shit my rustc is sooo slow".
Funny consequence: This now means compiling the standard library with
dbeug assertions makes it faster (than without, when using debug
assertions downstream)!
Store core::str::CharSearcher::utf8_size as u8
This is already relied on being smaller than u8 due to the `safety invariant: utf8_size must be less than 5`, so this helps LLVM optimize and maybe improve copies due to padding instead of unused bytes.
Add examples to document the return type of quickselect functions
Currently, `select_nth_unstable`, `select_nth_unstable_by`, and `select_nth_unstable_by_key`'s examples do not show how to use the return values of the functions in an example, so this PR adds that in.
Note: I didn't know what to call the parameters, so I settled on lesser, median, greater because the example is used for median finding so I retained that naming for the pivot, but lesser and greater are poor names for the example that sorts in descending order, because lesser and greater are then flipped.
I think it's common to say "lo" and "hi" for low and high respectively, but that's also not great when the comparator flips the elements. Otherwise, "left" and "right" are also commonly used but I think that's poor naming because some languages read right to left so those names are also unintuitive.
Lesser and greater are also not that great but I found a test that used `less`, `equal`, `greater` so I took that: dfa88b328f/library/core/tests/slice.rs (L1962)
Make `io::BorrowedCursor::advance` safe
This also keeps the old `advance` method under `advance_unchecked` name.
This makes pattern like `std::io::default_read_buf` safe to write.
Rename MaybeUninit::write_slice
A step to push #79995 forward.
https://github.com/rust-lang/libs-team/issues/122 also suggested to make them inherent methods, but they can't be — they'd conflict with slice's regular methods.
Implement intrinsics with fallback bodies
fixes#93145 (though we can port many more intrinsics)
cc #63585
The way this works is that the backend logic for generating custom code for intrinsics has been made fallible. The only failure path is "this intrinsic is unknown". The `Instance` (that was `InstanceDef::Intrinsic`) then gets converted to `InstanceDef::Item`, which represents the fallback body. A regular function call to that body is then codegenned. This is currently implemented for
* codegen_ssa (so llvm and gcc)
* codegen_cranelift
other backends will need to adjust, but they can just keep doing what they were doing if they prefer (though adding new intrinsics to the compiler will then require them to implement them, instead of getting the fallback body).
cc `@scottmcm` `@WaffleLapkin`
### todo
* [ ] miri support
* [x] default intrinsic name to name of function instead of requiring it to be specified in attribute
* [x] make sure that the bodies are always available (must be collected for metadata)
doc: add note about panicking examples for strict_overflow_ops
The first commit adds a note before the panicking examples for strict_overflow_ops to make it clearer that the following examples should panic and why, without needing the reader to hover the mouse over the information icon.
The second commit adds panicking examples for division by zero operations for strict division operations on unsigned numbers. The signed numbers already have two panicking examples each: one for division by zero and one for overflowing division (`MIN/-1`); this commit includes the division by zero examples for the unsigned numbers.
Waker::will_wake: Compare vtable address instead of its content
Optimize will_wake implementation by comparing vtable address instead of its content.
The existing best practice to avoid false negatives from will_wake is to define a waker vtable as a static item. That approach continues to works with the new implementation.
While this potentially changes the observable behaviour, the function is documented to work on a best-effort basis. The PartialEq impl for RawWaker remains as it was.
Clarified docs on non-atomic oprations on owned/mut refs to atomics
I originally misinterpreted the documentation to mean that the compiler can/will automatically optimise away atomic operations whenever the data is owned or mutably referenced.
On re-reading I think it is not technically incorrect, but specifically mentioning _how_ the atomic operations can be avoided also prevents this misunderstanding.
implement `Default` for `AsciiChar`
This implements `Default` for `AsciiChar` in order to match `char`'s implementation.
From all the different possible ways to do this I think the clearest one is to have both `char` and `AsciiChar` impls together.
I've also updated the doc-comment of the default variant since rustdoc doesn't seem to indicate it otherwise. Probably the text could be improved, though. I couldn't find any similar examples in the codebase and suggestions are welcomed.
r? `@scottmcm`
Clarify the lifetimes of allocations returned by the `Allocator` trait
The previous definition (accidentally) disallowed the implementation of stack-based allocators whose memory would become invalid once the lifetime of the allocator type ended.
This also ensures the validity of the following blanket implementation:
```rust
impl<A: Allocator> Allocator for &'_ A {}
```
Additional doc links and explanation of `Wake`.
This is intended to clarify:
* That `Wake` exists and can be used instead of `RawWaker`.
* How to construct a `Waker` when you are looking at `Wake` (which was previously only documented in the example).