tidy: Fix paths to `coretests` and `alloctests`
Following `#135937` and `#136642`, tests for `core` and `alloc` now live in `coretests` and `alloctests`. Fix tidy to lint for the new paths, and update comments referring to the old locations.
Some context for changes which don't match that pattern:
- `library/std/src/thread/local/dynamic_tests.rs` and `library/std/src/sync/mpsc/sync_tests.rs` were moved under `library/std/tests/` in 332fb7e6f1 (Move std::thread_local unit tests to integration tests, 2025-01-17) and b8ae372e48 (Move std::sync unit tests to integration tests, 2025-01-17), respectively, so are no longer special cases.
- There never was a `library/core/tests/fmt.rs` file. That comment previously referred to `src/test/ui/ifmt.rs`, which was folded into `library/alloc/tests/fmt.rs` in 949c96660c (move format! interface tests, 2020-09-08).
Now, the only remaining matches for `(alloc|core)/tests` are in `compiler/rustc_codegen_{cranelift,gcc}/patches`. I don't know why CI hasn't broken, since those patches presumably can no longer apply. Or maybe they somehow still do?
r? `@bjorn3`
Fix missing const for inherent pointer `replace` methods
`ptr::replace` (the free fn) is already const-stable. However, there are inherent convenience methods on `*mut T` and `NonNull<T>`, allowing you to write e.g. `unsafe { foo.replace(bar) }` where `foo` is a `*mut T` or a `NonNull<T>`.
It seems const was never added to the inherent method (likely oversight), so this PR adds it.
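For illustration, a minimal sketch of what this enables, assuming the inherent method is const as this PR proposes (the `replace_ptr` wrapper is purely illustrative):

```rust
// Hypothetical const fn wrapping the inherent `replace` on `*mut T`.
const fn replace_ptr(p: *mut i32, new: i32) -> i32 {
    // SAFETY: the caller must guarantee that `p` is valid for reads and
    // writes and properly aligned.
    unsafe { p.replace(new) }
}
```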
I don't believe this needs another[^1] FCP as the inherent methods are already stable and `ptr::replace` is already const stable, so this adds no new API.
Original tracking issue: #83164
`ptr::replace` constified in #83091
`ptr::replace` const stabilized in #130954
[^1]: `const_replace` FCP completed: https://github.com/rust-lang/rust/issues/83164#issuecomment-2385670050
Implement `SliceIndex` for `ByteStr`
Implement `Index` and `IndexMut` for `ByteStr` in terms of `SliceIndex`. Implement it for the same types that `&[u8]` supports (a superset of those supported for `&str`, which does not have `usize` and `ops::IndexRange`).
At the same time, move compare and index traits to a separate file in the `bstr` module, to give it more space to grow as more functionality is added (e.g., iterators and string-like ops). Order the items in `bstr/traits.rs` similarly to `str/traits.rs`.
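For illustration, a sketch of the intended usage, assuming `usize` indexing yields a `u8` and range indexing yields a `ByteStr` slice, mirroring `&[u8]`:

```rust
#![feature(bstr)]
use std::bstr::ByteStr;

fn main() {
    let s = ByteStr::new("hello, world");
    let first: u8 = s[0];          // usize index yields a byte
    let hello: &ByteStr = &s[..5]; // range index yields a ByteStr slice
    assert_eq!(first, b'h');
    assert_eq!(hello, ByteStr::new("hello"));
}
```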
cc `@joshtriplett`
`ByteStr`/`ByteString` tracking issue: https://github.com/rust-lang/rust/issues/134915
Expose algebraic floating point intrinsics
# Problem
A stable Rust implementation of a simple dot product is 8x slower than C++ on modern x86-64 CPUs. The root cause is an inability to let the compiler reorder floating point operations for better vectorization.
See https://github.com/calder/dot-bench for benchmarks. Measurements below were performed on an i7-10875H.
### C++: 10us ✅
With Clang 18.1.3 and `-O2 -march=haswell`:
<table>
<tr>
<th>C++</th>
<th>Assembly</th>
</tr>
<tr>
<td>
<pre lang="cc">
float dot(float *a, float *b, size_t len) {
#pragma clang fp reassociate(on)
float sum = 0.0;
for (size_t i = 0; i < len; ++i) {
sum += a[i] * b[i];
}
return sum;
}
</pre>
</td>
<td>
<img src="https://github.com/user-attachments/assets/739573c0-380a-4d84-9fd9-141343ce7e68" />
</td>
</tr>
</table>
### Nightly Rust: 10us ✅
With rustc 1.86.0-nightly (8239a37f9) and `-C opt-level=3 -C target-feature=+avx2,+fma`:
<table>
<tr>
<th>Rust</th>
<th>Assembly</th>
</tr>
<tr>
<td>
<pre lang="rust">
fn dot(a: &[f32], b: &[f32]) -> f32 {
let mut sum = 0.0;
for i in 0..a.len() {
sum = fadd_algebraic(sum, fmul_algebraic(a[i], b[i]));
}
sum
}
</pre>
</td>
<td>
<img src="https://github.com/user-attachments/assets/9dcf953a-2cd7-42f3-bc34-7117de4c5fb9" />
</td>
</tr>
</table>
### Stable Rust: 84us ❌
With rustc 1.84.1 (e71f9a9a9) and `-C opt-level=3 -C target-feature=+avx2,+fma`:
<table>
<tr>
<th>Rust</th>
<th>Assembly</th>
</tr>
<tr>
<td>
<pre lang="rust">
fn dot(a: &[f32], b: &[f32]) -> f32 {
let mut sum = 0.0;
for i in 0..a.len() {
sum += a[i] * b[i];
}
sum
}
</pre>
</td>
<td>
<img src="https://github.com/user-attachments/assets/936a1f7e-33e4-4ff8-a732-c3cdfe068dca" />
</td>
</tr>
</table>
# Proposed Change
Add `core::intrinsics::f*_algebraic` wrappers to `f16`, `f32`, `f64`, and `f128` gated on a new `float_algebraic` feature.
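A sketch of the proposed API, assuming the intrinsic wrappers surface as `algebraic_*` methods on the float types (exact names are up to the PR):

```rust
#![feature(float_algebraic)]

fn dot(a: &[f32], b: &[f32]) -> f32 {
    let mut sum = 0.0f32;
    for i in 0..a.len() {
        // Algebraic ops let the backend reassociate, enabling vectorization.
        sum = sum.algebraic_add(a[i].algebraic_mul(b[i]));
    }
    sum
}
```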
# Alternatives Considered
https://github.com/rust-lang/rust/issues/21690 has a lot of good discussion of various options for supporting fast math in Rust, but is still open a decade later because any choice that opts in more than individual operations is ultimately contrary to Rust's design principles.
In the meantime, processors have evolved, and we're leaving major performance on the table by not supporting vectorization. We shouldn't make users choose between an unstable compiler and an 8x performance hit.
# References
* https://github.com/rust-lang/rust/issues/21690
* https://github.com/rust-lang/libs-team/issues/532
* https://github.com/rust-lang/rust/issues/136469
* https://github.com/calder/dot-bench
* https://www.felixcloutier.com/x86/vfmadd132ps:vfmadd213ps:vfmadd231ps
try-job: x86_64-gnu-nopt
try-job: x86_64-gnu-aux
slice: Remove some uses of unsafe in first/last chunk methods
Remove unsafe `split_at_unchecked` and `split_at_mut_unchecked` in some slice `split_first_chunk`/`split_last_chunk` methods.
Replace those calls with the safe `split_at` and `split_at_checked` where applicable.
Add codegen tests to check for no panics when calculating the last chunk index using `checked_sub` and `split_at`.
Better viewed with whitespace disabled in diff view
---
The unchecked calls are mostly manual implementations of the safe methods, but with the safety condition negated from `mid <= len` to `len < mid`.
```rust
if self.len() < N {
    None
} else {
    // SAFETY: We manually verified the bounds of the split.
    let (first, tail) = unsafe { self.split_at_unchecked(N) };

    // Or, for the last_chunk methods:
    let (init, last) = unsafe { self.split_at_unchecked(self.len() - N) };
    // ...
}
```
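For comparison, a minimal sketch (not the actual std code) of the safe shape: `split_at_checked` already returns `None` when the index is out of bounds, so the explicit length check and the `unsafe` block disappear.

```rust
// Hypothetical free function illustrating the safe pattern.
fn first_chunk_split<const N: usize>(s: &[u8]) -> Option<(&[u8], &[u8])> {
    // Returns None when N > s.len(), replacing the manual bounds check.
    s.split_at_checked(N)
}
```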
Unsafe is still needed for the pointer array casts. Their safety comments are unmodified.
Remove mention of `exhaustive_patterns` from `never` docs
The example shows an exhaustive match:
```rust
#![feature(exhaustive_patterns)]
use std::str::FromStr;
let Ok(s) = String::from_str("hello");
```
But https://github.com/rust-lang/rust/issues/119612 moved this functionality to `#![feature(min_exhaustive_patterns)]`, which has since been stabilized.
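After that stabilization, a sketch of the same example compiles without any feature gate:

```rust
use std::str::FromStr;

fn main() {
    // `String::from_str` is infallible (its error type is `Infallible`),
    // so this `let` pattern is exhaustive on stable Rust.
    let Ok(s) = String::from_str("hello");
    println!("{s}");
}
```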
Switch some `rustc_on_unimplemented` uses to `diagnostic::on_unimplemented`
The use on the `SliceIndex` impl appears to be unreachable: there is no mention of "vector indices" in any test output, and I could not get it to show up in error messages.
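For context, a generic illustration of the attribute being switched to (not the specific std impls touched by this PR; the trait here is made up):

```rust
#[diagnostic::on_unimplemented(
    message = "`{Self}` cannot be frobnicated",
    label = "needs an implementation of `Frobnicate`"
)]
trait Frobnicate {
    fn frobnicate(&self);
}
```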
Simplify expansion for format_args!().
Instead of calling `Placeholder::new()`, we can just use a struct expression directly.
Before:
```rust
Placeholder::new(…, …, …, …)
```
After:
```rust
Placeholder {
position: …,
flags: …,
width: …,
precision: …,
}
```
(I originally avoided the struct expression, because `Placeholder` had a lot of fields. But now that https://github.com/rust-lang/rust/pull/136974 is merged, it only has four fields left.)
This will make the `fmt` argument to `fmt::Arguments::new_v1_formatted()` a candidate for const promotion, which is important if we ever hope to tackle https://github.com/rust-lang/rust/issues/92698. (It doesn't change anything yet, though, because the `args` argument to `fmt::Arguments::new_v1_formatted()` is not const-promotable.)
stabilize const_cell
`@rust-lang/libs-api` `@rust-lang/wg-const-eval` I see no reason to wait any longer, so I propose we stabilize the use of `Cell` in `const fn` -- specifically the APIs listed here:
```rust
// core::cell
impl<T> Cell<T> {
pub const fn replace(&self, val: T) -> T;
}
impl<T: Copy> Cell<T> {
pub const fn get(&self) -> T;
}
impl<T: ?Sized> Cell<T> {
pub const fn get_mut(&mut self) -> &mut T;
pub const fn from_mut(t: &mut T) -> &Cell<T>;
}
impl<T> Cell<[T]> {
pub const fn as_slice_of_cells(&self) -> &[Cell<T>];
}
```
Unfortunately, `set` cannot be made `const fn` yet, as it drops the old contents, and dropping a value of a generic type is not currently possible in a `const fn`.
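A minimal sketch of what this stabilization allows, assuming the APIs listed above become const-stable:

```rust
use std::cell::Cell;

// `replace` works in a const fn; `set` would not, since it drops the old value.
const fn take_flag(flag: &Cell<bool>) -> bool {
    flag.replace(false)
}
```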
Fixes https://github.com/rust-lang/rust/issues/131283
Override PartialOrd methods for bool
I noticed that the `PartialOrd` implementation for `bool` does not override the individual operator methods, unlike other primitive types such as `char` and the integers.
This commit extracts the `PartialOrd` overrides shared by the other primitive types into a macro and applies that macro to `bool` as well.
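For illustration, a sketch of the kind of operator overrides this adds, written against a newtype since coherence forbids re-implementing `PartialOrd for bool` outside `core` (the actual std macro may differ):

```rust
use std::cmp::Ordering;

#[derive(PartialEq, Eq)]
struct Flag(bool);

impl PartialOrd for Flag {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.0.cmp(&other.0))
    }
    // `false < true`, so each operator reduces to a boolean expression
    // instead of going through `partial_cmp` and matching on the result.
    fn lt(&self, other: &Self) -> bool { !self.0 & other.0 }
    fn le(&self, other: &Self) -> bool { self.0 <= other.0 }
    fn gt(&self, other: &Self) -> bool { self.0 & !other.0 }
    fn ge(&self, other: &Self) -> bool { self.0 >= other.0 }
}
```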
CC `@scottmcm` for our recent adventures in `PartialOrd` land
Simplify `PartialOrd` on tuples containing primitives
We noticed in https://github.com/rust-lang/rust/pull/133984#issuecomment-2704011800 that currently the tuple comparison code, while it [does optimize down](https://github.com/rust-lang/rust/blob/master/tests/codegen/comparison-operators-2-tuple.rs) today, is kinda huge: <https://rust.godbolt.org/z/xqMoeYbhE>
This PR changes the tuple code to go through an overridable "chaining" version of the comparison functions, so that for simple things like `(i16, u16)` and `(f32, f32)` (as seen in the new MIR pre-codegen test) we just directly get the
```rust
if lhs.0 == rhs.0 { lhs.1 OP rhs.1 }
else { lhs.0 OP rhs.0 }
```
version in MIR, rather than emitting a mess for LLVM to have to clean up.
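For illustration, a hypothetical sketch of the "chaining" shape (the actual trait methods added in core may be named and typed differently): `Break` means the first field already decided the answer, `Continue` means fall through to the next field.

```rust
use std::ops::ControlFlow;

fn chain_lt<T: PartialOrd>(lhs: &T, rhs: &T) -> ControlFlow<bool> {
    if lhs == rhs {
        ControlFlow::Continue(())
    } else {
        ControlFlow::Break(lhs < rhs)
    }
}

fn tuple_lt(a: (i16, u16), b: (i16, u16)) -> bool {
    match chain_lt(&a.0, &b.0) {
        ControlFlow::Break(decided) => decided,
        ControlFlow::Continue(()) => a.1 < b.1,
    }
}
```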
Test added in the first commit, so you can see the MIR diff in the second one.
core: optimize `RepeatN`
...by adding optimized implementations of `try_fold` and `fold`, and by replacing some unnecessary `mem::replace` calls with `MaybeUninit` helper methods.
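As a rough illustration of the `fold` side (a hand-written sketch, not the actual `core` code, which manages the element with `MaybeUninit`): a repeat-n iterator can clone for all but the final element and hand out the element itself last.

```rust
// Hypothetical specialized fold for a "repeat n times" iterator.
fn fold_repeat<T: Clone, B, F: FnMut(B, T) -> B>(elem: T, n: usize, init: B, mut f: F) -> B {
    let mut acc = init;
    if n == 0 {
        return acc;
    }
    // Clone for all but the last iteration...
    for _ in 1..n {
        acc = f(acc, elem.clone());
    }
    // ...then move the element itself, saving one clone.
    f(acc, elem)
}
```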
Reduce FormattingOptions to 64 bits
This is part of https://github.com/rust-lang/rust/issues/99012
This reduces `FormattingOptions` from 6-7 machine words (384 bits on 64-bit platforms, 224 bits on 32-bit platforms) to just 64 bits (a single register on 64-bit platforms).
Before:
```rust
pub struct FormattingOptions {
flags: u32, // only 6 bits used
fill: char,
align: Option<Alignment>,
width: Option<usize>,
precision: Option<usize>,
}
```
After:
```rust
pub struct FormattingOptions {
/// Bits:
/// - 0-20: fill character (21 bits, a full `char`)
/// - 21: `+` flag
/// - 22: `-` flag
/// - 23: `#` flag
/// - 24: `0` flag
/// - 25: `x?` flag
/// - 26: `X?` flag
/// - 27: Width flag (if set, the width field below is used)
/// - 28: Precision flag (if set, the precision field below is used)
/// - 29-30: Alignment (0: Left, 1: Right, 2: Center, 3: Unknown)
/// - 31: Always set to 1
flags: u32,
/// Width if width flag above is set. Otherwise, always 0.
width: u16,
/// Precision if precision flag above is set. Otherwise, always 0.
precision: u16,
}
```
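For illustration, a sketch of how the fill character and a flag bit can be packed into and read back out of the single `flags` word (the constants and helper names here are hypothetical, not the actual accessors):

```rust
// Hypothetical constants mirroring the bit layout documented above.
const FILL_MASK: u32 = (1 << 21) - 1; // bits 0-20 hold the fill `char`
const FLAG_SIGN_PLUS: u32 = 1 << 21;  // bit 21: the `+` flag

fn set_fill(flags: u32, fill: char) -> u32 {
    (flags & !FILL_MASK) | (fill as u32)
}

fn get_fill(flags: u32) -> char {
    // Bits 0-20 always hold a valid `char` by construction in this sketch.
    char::from_u32(flags & FILL_MASK).unwrap()
}

fn has_sign_plus(flags: u32) -> bool {
    flags & FLAG_SIGN_PLUS != 0
}
```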
Add an attribute that makes the spans from a macro edition 2021, and fix pin on edition 2024 with it
Fixes a regression; see the issue below. This is a temporary fix; `super let` is the real solution.
Closes #138596
This commit adds the `5f00::/16` range defined by RFC 9602 to those ranges which `Ipv6Addr::is_global` recognises as non-global. This range is used for Segment Routing (SRv6) SIDs.
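A sketch of the observable effect on nightly, where `is_global` is still gated on `#![feature(ip)]`:

```rust
#![feature(ip)]
use std::net::Ipv6Addr;

fn main() {
    // An SRv6 SID from the 5f00::/16 block reserved by RFC 9602.
    let sid: Ipv6Addr = "5f00::1".parse().unwrap();
    assert!(!sid.is_global());
}
```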
Optimize `io::Write::write_fmt` for constant strings
When the formatting args to `fmt::Write::write_fmt` are a statically known string, it simplifies to only calling `write_str` without a runtime branch. Do the same in `io::Write::write_fmt` with `write_all`.
Also, rename the argument to `args` to match the convention of `fmt::Write`.
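A minimal sketch of the idea (not the exact std code): when the `format_args!` input is a plain string literal, `Arguments::as_str()` returns `Some`, and the formatting machinery can be skipped entirely.

```rust
use std::io::{self, Write};

// Hypothetical helper showing the fast path this PR adds to write_fmt.
fn write_fmt_fast<W: Write>(writer: &mut W, args: std::fmt::Arguments<'_>) -> io::Result<()> {
    if let Some(s) = args.as_str() {
        // Statically known string: a single write_all, no formatting branch.
        writer.write_all(s.as_bytes())
    } else {
        // Fall back to the full formatting path.
        writer.write_fmt(args)
    }
}
```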