Try not to use verbatim paths in `Command::current_dir`
If possible, we should try not to use verbatim paths in `Command::current_dir`. It might work but it might also break code in the subprocess that assume the current directory isn't verbatim (including Windows APIs). cc ``@ehuss``
Side note: we now have a lot of ad-hoc fixes like this spread about the place. It'd be good to make a proper `WindowsPath` type that handles all this in one place. But that's a bigger job for another PR.
If possible, we should try not to use verbatim paths in Command::current_dir. It might work but it might also break code (including some Windows APIs) that assume the current directory isn't verbatim.
Trusty: Implement `write_vectored` for stdio
Currently, `write` for stdout and stderr on Trusty is implemented with the semantics of `write_all`. Instead, call the underlying syscall only once in `write` and use the default implementation of `write_all` like other platforms. Also, implement `write_vectored` by adding support for `IoSlice`.
Refactor stdin to reuse the unsupported type like https://github.com/rust-lang/rust/pull/136769.
It requires #138875 to fix the build for Trusty, though they do not conflict and can merge in either order.
cc `@randomPoison`
Fix std build for all NuttX targets. It is the single largest set of
failures on <https://does-it-build.noratrieb.dev/>. Although, ESP-IDF
also requires these same gates, there are other issues for those
targets.
This can verified be running `x check library/std --target=` for all
NuttX targets.
Rename internal module from `statik` to `no_threads`
This module is named in reference to the keyword, but the term is somewhat overloaded. Rename it to more clearly describe it and avoid the misspelling.
Move `fd` into `std::sys`
Move platform definitions of `fd` into `std::sys`, as part of https://github.com/rust-lang/rust/issues/117276.
Unlike other modules directly under `std::sys`, this is only available on some platforms and I have not provided a fallback abstraction for unsupported platforms. That is similar to how `std::os::fd` is gated to only supported platforms.
Also, fix the `unsafe_op_in_unsafe_fn` lint, which was allowed for the Unix fd impl. Since macro expansions from `std::sys::pal::unix::weak` trigger this lint, fix it there too.
cc `@joboet,` `@ChrisDenton`
try-job: x86_64-gnu-aux
This module is named in reference to the keyword, but the term is
somewhat overloaded. Rename it to more clearly describe it and avoid the
misspelling.
Expose algebraic floating point intrinsics
# Problem
A stable Rust implementation of a simple dot product is 8x slower than C++ on modern x86-64 CPUs. The root cause is an inability to let the compiler reorder floating point operations for better vectorization.
See https://github.com/calder/dot-bench for benchmarks. Measurements below were performed on a i7-10875H.
### C++: 10us ✅
With Clang 18.1.3 and `-O2 -march=haswell`:
<table>
<tr>
<th>C++</th>
<th>Assembly</th>
</tr>
<tr>
<td>
<pre lang="cc">
float dot(float *a, float *b, size_t len) {
#pragma clang fp reassociate(on)
float sum = 0.0;
for (size_t i = 0; i < len; ++i) {
sum += a[i] * b[i];
}
return sum;
}
</pre>
</td>
<td>
<img src="https://github.com/user-attachments/assets/739573c0-380a-4d84-9fd9-141343ce7e68" />
</td>
</tr>
</table>
### Nightly Rust: 10us ✅
With rustc 1.86.0-nightly (8239a37f9) and `-C opt-level=3 -C target-feature=+avx2,+fma`:
<table>
<tr>
<th>Rust</th>
<th>Assembly</th>
</tr>
<tr>
<td>
<pre lang="rust">
fn dot(a: &[f32], b: &[f32]) -> f32 {
let mut sum = 0.0;
for i in 0..a.len() {
sum = fadd_algebraic(sum, fmul_algebraic(a[i], b[i]));
}
sum
}
</pre>
</td>
<td>
<img src="https://github.com/user-attachments/assets/9dcf953a-2cd7-42f3-bc34-7117de4c5fb9" />
</td>
</tr>
</table>
### Stable Rust: 84us ❌
With rustc 1.84.1 (e71f9a9a9) and `-C opt-level=3 -C target-feature=+avx2,+fma`:
<table>
<tr>
<th>Rust</th>
<th>Assembly</th>
</tr>
<tr>
<td>
<pre lang="rust">
fn dot(a: &[f32], b: &[f32]) -> f32 {
let mut sum = 0.0;
for i in 0..a.len() {
sum += a[i] * b[i];
}
sum
}
</pre>
</td>
<td>
<img src="https://github.com/user-attachments/assets/936a1f7e-33e4-4ff8-a732-c3cdfe068dca" />
</td>
</tr>
</table>
# Proposed Change
Add `core::intrinsics::f*_algebraic` wrappers to `f16`, `f32`, `f64`, and `f128` gated on a new `float_algebraic` feature.
# Alternatives Considered
https://github.com/rust-lang/rust/issues/21690 has a lot of good discussion of various options for supporting fast math in Rust, but is still open a decade later because any choice that opts in more than individual operations is ultimately contrary to Rust's design principles.
In the mean time, processors have evolved and we're leaving major performance on the table by not supporting vectorization. We shouldn't make users choose between an unstable compiler and an 8x performance hit.
# References
* https://github.com/rust-lang/rust/issues/21690
* https://github.com/rust-lang/libs-team/issues/532
* https://github.com/rust-lang/rust/issues/136469
* https://github.com/calder/dot-bench
* https://www.felixcloutier.com/x86/vfmadd132ps:vfmadd213ps:vfmadd231ps
try-job: x86_64-gnu-nopt
try-job: x86_64-gnu-aux
ToSocketAddrs: fix typo
It's "a function", never "an function".
I noticed the same typo somewhere in the compiler sources so figured I'd fix it there as well.
Remove creation of duplicate `AnonPipe`
The `File` is unwrapped to a `Handle` into an `AnonPipe`, and then that `AnonPipe` was unwrapped to a `Handle` into another `AnonPipe`. The second operation is entirely redundant.
io: Avoid marking some bytes as uninit
These bytes were marked as uninit, which would cause them to be initialized multiple times even though it was not necessary.
The File is unwrapped to a Handle into an AnonPipe, and then that AnonPipe was unwrapped to a Handle into another AnonPipe. The second operation is entirely redundant.
Getting the address of `errno` should be just as cheap as `pthread_self()` and avoids having to use the expensive `Mutex` logic because it always results in a pointer.
std: deduplicate `errno` accesses
By marking `__errno_location` as `#[ffi_const]` and `std::sys::os::errno` as `#[inline]`, this PR allows merging multiple calls to `io::Error::last_os_error()` into one.
Start using `with_native_path` in `std::sys::fs`
Ideally, each platform should use their own native path type internally. This will, for example, allow passing a `CStr` directly to `std::fs::File::open` and therefore avoid the need for allocating a new null-terminated C string.
However, doing that for every function and platform all at once makes for a large PR that is way too prone to breaking. So this PR does some minimal refactoring which should help progress towards that goal. The changes are Unix-only and even then I avoided functions that require more changes so that this PR is just moving things around.
r? joboet
Change the syntax of the internal `weak!` macro
Change the syntax to include parameter names and a trailing semicolon.
Motivation:
- Mirror the `syscall!` macro.
- Allow rustfmt to format it (when wrapped in parentheses, and when not inside `cfg_if!`).
- For better documentation (having the parameter names available in the source code is a bit nicer).
- Allow a future improvement to this macro where we can sometimes use the symbol directly when it's statically known to be available (and thus need the parameter names to be available), see https://github.com/rust-lang/rust/pull/136868.
r? libs
wasm: increase default thread stack size to 1 MB
The default stack size for the [main thread is 1 MB as specified by linker options](38cf49dde8/compiler/rustc_target/src/spec/base/wasm.rs (L14)).
However, the default stack size for threads was only 64 kB.
This is surprisingly small and thus we increase it to 1 MB to match the main thread.
By marking `__errno_location` as `#[ffi_const]` and `std::sys::os::errno` as `#[inline]`, this PR allows merging multiple calls to `io::Error::last_os_error()` into one.
Currently, `write` for stdout and stderr on Trusty is implemented with
the semantics of `write_all`. Instead, call the underlying syscall only
once in `write` and use the default implementation of `write_all` like
other platforms. Also, implement `write_vectored` by adding support for
`IoSlice`.
Refactor stdin to reuse the unsupported type like #136769.