std: Enable usage of `thread_local!` through imports
The `thread_local!` macro delegated to an internal macro but it didn't
do so in a macros-and-the-module-system compatible fashion, meaning if a
`#![no_std]` crate imported `std` and tried to use `thread_local!` it
would fail due to missing a lookup of an internal macro.
This commit switches the macro to instead use `$crate` to invoke other
macros, ensuring that it'll work when `thread_local!` is imported alone.
std: Improve codegen size of accessing TLS
Some code in the TLS implementation in libstd stores `Some(val)` into an
`&mut Option<T>` (effectively) and then pulls out `&T`, but it currently
uses `.unwrap()` which can codegen into a panic even though it can never
panic. With sufficient optimizations enabled (like LTO) the compiler can
see through this but this commit helps it along in normal mode
(`--release` with Cargo by default) to avoid codegen'ing the panic path.
This ends up improving the optimized codegen on wasm by ensuring that a
call to panic pulling in more file size doesn't stick around.
This commit removes all jemalloc related submodules, configuration, etc,
from the bootstrap, from the standard library, and from the compiler.
This will be followed up with a change to use jemalloc specifically as
part of rustc on blessed platforms.
Fixes#46775 -- don't mutate the process's environment in Command::exec
Instead, pass the environment to execvpe, so the kernel can apply it directly to the new process. This avoids a use-after-free in the case where exec'ing the new process fails for any reason, as well as a race condition if there are other threads alive during the exec.
Fixes#46775
The `thread_local!` macro delegated to an internal macro but it didn't
do so in a macros-and-the-module-system compatible fashion, meaning if a
`#![no_std]` crate imported `std` and tried to use `thread_local!` it
would fail due to missing a lookup of an internal macro.
This commit switches the macro to instead use `$crate` to invoke other
macros, ensuring that it'll work when `thread_local!` is imported alone.
Some code in the TLS implementation in libstd stores `Some(val)` into an
`&mut Option<T>` (effectively) and then pulls out `&T`, but it currently
uses `.unwrap()` which can codegen into a panic even though it can never
panic. With sufficient optimizations enabled (like LTO) the compiler can
see through this but this commit helps it along in normal mode
(`--release` with Cargo by default) to avoid codegen'ing the panic path.
This ends up improving the optimized codegen on wasm by ensuring that a
call to panic pulling in more file size doesn't stick around.
Instead, pass the environment to execvpe, so the kernel can apply it directly to the new process. This avoids a use-after-free in the case where exec'ing the new process fails for any reason, as well as a race condition if there are other threads alive during the exec.
This means when the other thread wakes it can continue right away
instead of having to wait for the mutex.
Also add some comments explaining why the mutex needs to be locked in
the first place.
Unchecked thread spawning
# Summary
Add an unsafe interface for spawning lifetime-unrestricted threads for
library authors to build less-contrived, less-hacky safe abstractions
on.
# Motivation
So a few years back scoped threads were entirely removed from the Rust
stdlib, the reason being that it was possible to leak the scoped thread's
join guards without resorting to unsafe code, which meant the concept
was not completely safe, either.
Only a maximally-restrictive safe API for thread spawning was kept in the
stdlib, that requires `'static` lifetime bounds on both the thread closure
and its return type.
A number of 3rd party libraries sprung up to offer their implementations
for safe scoped threads implementations.
These work by essentially hiding the join guards from the user, thus
forcing them to join at the end of an (internal) function scope.
However, since these libraries have to use the maximally restrictive
thread spawning API, they have to resort to some very contrived manipulations
and subversions of Rust's type system to basically achieve what this commit does
with some minimal restructuring of the current code and exposing a new unsafe
function signature for spawning threads without lifetime restrictions.
Obviously this is unsafe, but its main use would be to allow library authors
to write safe abstractions with and around it.
To further illustrate my point, here's a quick summary of the hoops that,
for instance `crossbeam`, has to jump through to spawn a lifetime unrestricted
thread, all of which would not be necessary if an unsafe API existed as part
of the stdlib:
1. Allocate an `Arc<Option<T>>` on the heap where the result with type
`T: 'a` will go (in practice requires `Mutex` or `UnsafeCell` as well).
2. Wrap the desired thread closure with lifetime bound `'a` into another
closure (also `..: 'a`) that returns `()`, executes the inner closure and
writes its result into the pre-allocated `Option<T>`.
3. Box the wrapping closure, cast it to a trait object (`FnBox`) and
(unsafely) transmute its lifetime bound from `'a` to `'static`.
So while this new `spawn_unchecked` function is certainly not very relevant
for general use, since scoped threads are so common I think it makes sense
to expose an interface for libraries implementing these to build on.
The changes implemented are also very minimal: The current `spawn` function
(which internally contains unsafe code) is moved into an unsafe `spawn_unchecked`
function, which the safe function then wraps around.
# Issues
- ~~so far, no documentation for the new function (yet)~~
- the name of the function might be controversial, as `*_unchecked` more commonly
indicates that some sort of runtime check is omitted (`unrestricted` may be
more fitting)
- if accepted, it might make sense to add a freestanding `thread::spawn_unchecked`
function similar to the current `thread::spawn` for convenience.
Clarified code example in char primitive doc
The example was not as clear as it could be because it was making an assumption about the structure of the data in order to multiply the number of elements in the slice by the item size. This change demonstrates the idea more straightforwardly, without needing a calculation, by just comparing the size of the slices.
Documents `From` implementations for `Stdio`
This PR solves part of #51430 by adding a basic summary and an example to each `impl From` inside `process` module (`ChildStdin`, `ChildStdout`, `ChildStderr`, `File`).
It does not document if the conversions allocate memory and how expensive they are.
Gradually expanding libstd's keyword documentation
I'm working on adding new keywords to the documentation and refreshing the incomplete older ones, and I'm hoping that I can eventually add all the standalone-usable keywords after a bunch of incremental work. It would be cool to see the keywords section of std's docs be a definitive reference as to what each keyword means when you see it, and that's what I'm aiming towards with this work.
I'm far from a Rust expert so there will inevitably be things to fix in this, also I'm not sure if this should be a bunch of quickly-merged PRs or one gradually-updated PR that gets merged once it's done.
The example was not as clear as it could be because it was making an assumption about the structure of the data in order to multiply the number of collection elements by the item size. This change demonstrates the idea more straightforwardly, without the calculation.
Prefer unwrap_or_else to unwrap_or in case of function calls/allocations
The contents of `unwrap_or` are evaluated eagerly, so it's not a good pick in case of function calls and allocations. This PR also changes a few `unwrap_or`s with `unwrap_or_default`.
An added bonus is that in some cases this change also reveals if the object it's called on is an `Option` or a `Result` (based on whether the closure takes an argument).
Add a `copysign` function to f32 and f64
This patch adds a `copysign` function to the float primitive types. It is an exceptionally useful function for writing efficient numeric code, as it often avoids branches, is auto-vectorizable, and there are efficient intrinsics for most platforms.
I think this might work as-is, as the relevant `copysign` intrinsic is already used internally for the implementation of `signum`. It's possible that an implementation might be needed in japaric/libm for portability across all platforms, in which case I'll do that also.
Part of the work towards #55107