Windows: Add experimental support for linking std-required system DLLs using raw-dylib
For Windows, this allows std to define system imports without needing the user to have import libraries. It's intended for this to become the default.
For now it's an experimental feature so it can be tested using build-std.
fix interleaved output in the default panic hook when multiple threads panic simultaneously
previously, we only held a lock for printing the backtrace itself. since all threads were printing to the same file descriptor, that meant random output in the default panic hook from one thread would be interleaved with the backtrace from another. now, we hold the lock for the full duration of the hook, and the output is ordered.
---
i noticed some odd things while working on this you may or may not already be aware of.
- libbacktrace is included as a submodule instead of a normal rustc crate, and as a result uses `cfg(backtrace_in_std)` instead of a more normal `cfg(feature = "rustc-dep-of-std")`. probably this is left over from before rust used a cargo-based build system?
- the default panic handler uses `trace_unsynchronized`, etc, in `sys::backtrace::print`. as a result, the lock only applies to concurrent *panic handlers*, not concurrent *threads*. in other words, if another, non-panicking, thread tried to print a backtrace at the same time as the panic handler, we may have UB, especially on windows.
- we have the option of changing backtrace to enable locking when `backtrace_in_std` is set so we can reuse their lock instead of trying to add our own.
Guard against calling `libc::exit` multiple times on Linux.
Mitigates (but does not fix) #126600 by ensuring only one thread which calls Rust `exit` actually calls `libc::exit`, and all other callers of Rust `exit` block.
previously, we only held a lock for printing the backtrace itself. since all threads were printing to the same file descriptor, that meant random output in the default panic hook would be interleaved with the backtrace. now, we hold the lock for the full duration of the hook, and the output is ordered.
Use pidfd_spawn for faster process spawning when a PidFd is requested
glibc 2.39 added `pidfd_spawnp` and `pidfd_getpid` which makes it possible to get pidfds while staying on the CLONE_VFORK path.
verified that vfork gets used with strace:
```
$ strace -ff -e pidfd_open,clone3,openat,execve,waitid,close ./x test std --no-doc -- pidfd
[...]
[pid 2820532] clone3({flags=CLONE_VM|CLONE_PIDFD|CLONE_VFORK|CLONE_CLEAR_SIGHAND, pidfd=0x7b7f885fec6c, exit_signal=SIGCHLD, stack=0x7b7f88aff000, stack_size=0x9000}strace: Process 2820533 attached
<unfinished ...>
[pid 2820533] execve("/home/the8472/bin/sleep", ["sleep", "1000"], 0x7ffdd0e268d8 /* 107 vars */) = -1 ENOENT (No such file or directory)
[pid 2820533] execve("/home/the8472/.cargo/bin/sleep", ["sleep", "1000"], 0x7ffdd0e268d8 /* 107 vars */) = -1 ENOENT (No such file or directory)
[pid 2820533] execve("/usr/local/bin/sleep", ["sleep", "1000"], 0x7ffdd0e268d8 /* 107 vars */) = -1 ENOENT (No such file or directory)
[pid 2820533] execve("/usr/bin/sleep", ["sleep", "1000"], 0x7ffdd0e268d8 /* 107 vars */ <unfinished ...>
[pid 2820532] <... clone3 resumed> => {pidfd=[3]}, 88) = 2820533
[pid 2820533] <... execve resumed>) = 0
[pid 2820532] openat(AT_FDCWD, "/proc/self/fdinfo/3", O_RDONLY|O_CLOEXEC) = 4
[pid 2820532] close(4) = 0
```
Tracking issue: #82971
Update windows-bindgen to 0.58.0
This also switches from the bespoke `std` generated bindings to the normal `sys` ones everyone else uses.
This has almost no difference except that the `sys` bindings use the `windows_targets::links!` macro for FFI imports, which we implement manually. This does cause the diff to look much larger than it really is but the bulk of the changes are mostly contained to the generated code.
Remove unqualified form import of io::Error in process_vxworks.rs and fallback on remove_dir_impl for vxworks
Hi all,
This is to address issue #127084. On inspections it was found that io::Error refrences were all of qualified form and there was no need to add a unqualified form import. Also to successfully build rust for vxworks, we need to fallback on the remove_impl_dir implementations.
Thank you.
std: separate TLS key creation from TLS access
Currently, `std` performs an atomic load to get the OS key on every access to `StaticKey` even when the key is already known. This PR thus replaces `StaticKey` with the platform-specific `get` and `set` function and a new `LazyKey` type that acts as a `LazyLock<Key>`, allowing the reuse of the retreived key for multiple accesses.
Related to #110897.
Currently, `std` performs an atomic load to get the OS key on every access to `StaticKey` even when the key is already known. This PR thus replaces `StaticKey` with the platform-specific `get` and `set` function and a new `LazyKey` type that acts as a `LazyLock<Key>`, allowing the reuse of the retreived key for multiple accesses.
Remove `MaybeUninit::uninit_array()` and replace it with inline const blocks.
\[This PR originally contained the changes in #125995 too. See edit history for the original PR description.]
The documentation of `MaybeUninit::uninit_array()` says:
> Note: in a future Rust version this method may become unnecessary when Rust allows [inline const expressions](https://github.com/rust-lang/rust/issues/76001). The example below could then use `let mut buf = [const { MaybeUninit::<u8>::uninit() }; 32];`.
The PR adding it also said: <https://github.com/rust-lang/rust/pull/65580#issuecomment-544200681>
> if it’s stabilized soon enough maybe it’s not worth having a standard library method that will be replaceable with `let buffer = [MaybeUninit::<T>::uninit(); $N];`
That time has come to pass — inline const expressions are stable — so `MaybeUninit::uninit_array()` is now unnecessary. The only remaining question is whether it is an important enough *convenience* to keep it around.
I believe it is net good to remove this function, on the principle that it is better to compose two orthogonal features (`MaybeUninit` and array construction) than to have a specific function for the specific combination, now that that is possible.
This is possible now that inline const blocks are stable; the idea was
even mentioned as an alternative when `uninit_array()` was added:
<https://github.com/rust-lang/rust/pull/65580#issuecomment-544200681>
> if it’s stabilized soon enough maybe it’s not worth having a
> standard library method that will be replaceable with
> `let buffer = [MaybeUninit::<T>::uninit(); $N];`
Const array repetition and inline const blocks are now stable (in the
next release), so that circumstance has come to pass, and we no longer
have reason to want `uninit_array()` other than convenience. Therefore,
let’s evaluate the inconvenience by not using `uninit_array()` in
the standard library, before potentially deleting it entirely.
std: refactor the TLS implementation
As discovered by Mara in #110897, our TLS implementation is a total mess. In the past months, I have simplified the actual macros and their expansions, but the majority of the complexity comes from the platform-specific support code needed to create keys and register destructors. In keeping with #117276, I have therefore moved all of the `thread_local_key`/`thread_local_dtor` modules to the `thread_local` module in `sys` and merged them into a new structure, so that future porters of `std` can simply mix-and-match the existing code instead of having to copy the same (bad) implementation everywhere. The new structure should become obvious when looking at `sys/thread_local/mod.rs`.
Unfortunately, the documentation changes associated with the refactoring have made this PR rather large. That said, this contains no functional changes except for two small ones:
* the key-based destructor fallback now, by virtue of sharing the implementation used by macOS and others, stores its list in a `#[thread_local]` static instead of in the key, eliminating one indirection layer and drastically simplifying its code.
* I've switched over ZKVM (tier 3) to use the same implementation as WebAssembly, as the implementation was just a way worse version of that
Please let me know if I can make this easier to review! I know these large PRs aren't optimal, but I couldn't think of any good intermediate steps.
`@rustbot` label +A-thread-locals