The posix_spawnattr_init & posix_spawn_file_actions_init might fail,
but their return code is not checked.
Check for non-zero return code and destroy only succesfully initialized
objects.
The cvt function compares the argument with -1 and when equal returns a new
io::Error constructed from errno. It is used together posix_spawn_* functions.
This is incorrect. Those functions do not set errno. Instead they return
non-zero error code directly.
Check for non-zero return code and use it to construct a new io::Error.
Unbox mutexes and condvars on some platforms
Both mutexes and condition variables contained a Box containing the actual os-specific object. This was done because moving these objects may cause undefined behaviour on some platforms.
However, this is not needed on Windows[1], Wasm[2], cloudabi[2], and 'unsupported'[3], were the box was only needlessly making them less efficient.
This change gets rid of the box on those platforms.
On those platforms, `Condvar` can no longer verify it is only used with one `Mutex`, as mutexes no longer have a stable address. This was addressed and considered acceptable in #76932.
[1]\: https://docs.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-initializesrwlock
[2]\: These are just a single atomic integer together with futex wait/wake calls/instructions.
[3]\: The `unsupported` platform doesn't support multiple threads at all.
Use posix_spawn on musl targets
The posix_spawn had been available in a form suitable for use in a
Command implementation since musl 0.9.12. Use it in a preference to a
fork when possible, to benefit from CLONE_VM|CLONE_VFORK used there.
Previously `Command::spawn` would fall back to the non-posix_spawn based
implementation if the `PATH` environment variable was possibly changed.
On systems with a modern (g)libc `posix_spawn()` can be significantly
faster. If program is a path itself the `PATH` environment variable is
not used for the lookup and it should be safe to use the
`posix_spawnp()` method. [1]
We found this, because we have a cli application that effectively runs a
lot of subprocesses. It would sometimes noticeably hang while printing
output. Profiling showed that the process was spending the majority of
time in the kernel's `copy_page_range` function while spawning
subprocesses. During this time the process is completely blocked from
running, explaining why users were reporting the cli app hanging.
Through this we discovered that `std::process::Command` has a fast and
slow path for process execution. The fast path is backed by
`posix_spawnp()` and the slow path by fork/exec syscalls being called
explicitly. Using fork for process creation is supposed to be fast, but
it slows down as your process uses more memory. It's not because the
kernel copies the actual memory from the parent, but it does need to
copy the references to it (see `copy_page_range` above!). We ended up
using the slow path, because the command spawn implementation in falls
back to the slow path if it suspects the PATH environment variable was
changed.
Here is a smallish program demonstrating the slowdown before this code
change:
```
use std::process::Command;
use std::time::Instant;
fn main() {
let mut args = std::env::args().skip(1);
if let Some(size) = args.next() {
// Allocate some memory
let _xs: Vec<_> = std::iter::repeat(0)
.take(size.parse().expect("valid number"))
.collect();
let mut command = Command::new("/bin/sh");
command
.arg("-c")
.arg("echo hello");
if args.next().is_some() {
println!("Overriding PATH");
command.env("PATH", std::env::var("PATH").expect("PATH env var"));
}
let now = Instant::now();
let child = command
.spawn()
.expect("failed to execute process");
println!("Spawn took: {:?}", now.elapsed());
let output = child.wait_with_output().expect("failed to wait on process");
println!("Output: {:?}", output);
} else {
eprintln!("Usage: prog [size]");
std::process::exit(1);
}
()
}
```
Running it and passing different amounts of elements to use to allocate
memory shows that the time taken for `spawn()` can differ quite
significantly. In latter case the `posix_spawnp()` implementation is 30x
faster:
```
$ cargo run --release 10000000
...
Spawn took: 324.275µs
hello
$ cargo run --release 10000000 changepath
...
Overriding PATH
Spawn took: 2.346809ms
hello
$ cargo run --release 100000000
...
Spawn took: 387.842µs
hello
$ cargo run --release 100000000 changepath
...
Overriding PATH
Spawn took: 13.434677ms
hello
```
[1]: 5f72f9800b/posix/execvpe.c (L81)
Add accessors to Command.
This adds some accessor methods to `Command` to provide a way to access the values set when building the `Command`. An example where this can be useful is to display the command to be executed. This is roughly based on the [`ProcessBuilder`](13b73cdaf7/src/cargo/util/process_builder.rs (L105-L134)) in Cargo.
Possible concerns about the API:
- Values with NULs on Unix will be returned as `"<string-with-nul>"`. I don't think it is practical to avoid this, since otherwise a whole separate copy of all the values would need to be kept in `Command`.
- Does not handle `arg0` on Unix. This can be awkward to support in `get_args` and is rarely used. I figure if someone really wants it, it can be added to `CommandExt` as a separate method.
- Does not offer a way to detect `env_clear`. I'm uncertain if it would be useful for anyone.
- Does not offer a way to get an environment variable by name (`get_env`). I figure this can be added later if anyone really wants it. I think the motivation for this is weak, though. Also, the API could be a little awkward (return a `Option<Option<&OsStr>>`?).
- `get_envs` could skip "cleared" entries and just return `&OsStr` values instead of `Option<&OsStr>`. I'm on the fence here. My use case is to display a shell command, and I only intend it to be roughly equivalent to the actual execution, and I probably won't display `None` entries. I erred on the side of providing extra information, but I suspect many situations will just filter out the `None`s.
- Could implement more iterator stuff (like `DoubleEndedIterator`).
I have not implemented new std items before, so I'm uncertain if the existing issue should be reused, or if a new tracking issue is needed.
cc #44434
Split sys_common::Mutex in StaticMutex and MovableMutex.
The (unsafe) `Mutex` from `sys_common` had a rather complicated interface. You were supposed to call `init()` manually, unless you could guarantee it was neither moved nor used reentrantly.
Calling `destroy()` was also optional, although it was unclear if 1) resources might be leaked or not, and 2) if `destroy()` should only be called when `init()` was called.
This allowed for a number of interesting (confusing?) different ways to use this `Mutex`, all captured in a single type.
In practice, this type was only ever used in two ways:
1. As a static variable. In this case, neither `init()` nor `destroy()` are called. The variable is never moved, and it is never used reentrantly. It is only ever locked using the `LockGuard`, never with `raw_lock`.
2. As a `Box`ed variable. In this case, both `init()` and `destroy()` are called, it will be moved and possibly used reentrantly.
No other combinations are used anywhere in `std`.
This change simplifies things by splitting this `Mutex` type into two types matching the two use cases: `StaticMutex` and `MovableMutex`.
The interface of both new types is now both safer and simpler. The first one does not call nor expose `init`/`destroy`, and the second one calls those automatically in its `new()` and `Drop` functions. Also, the locking functions of `MovableMutex` are no longer unsafe.
---
This will also make it easier to conditionally box mutexes later, by moving that decision into sys/sys_common. Some of the mutex implementations (at least those of Wasm and 'sys/unsupported') are safe to move, so wouldn't need a box. ~~(But that's blocked on #76932 for now.)~~ (See #77380.)
Improve std::sys::windows::compat
Improves the compat_fn macro in sys::windows, which is used for conditionally loading APIs that might not be available.
- The module (dll) name can now be any string, not just an ident. (Not all Windows api modules are valid Rust identifiers. E.g. `WaitOnAddress` comes from `API-MS-Win-Core-Synch-l1-2-0.dll`.)
- Adds `FuncName::is_available()` for checking if a function is really available without having to do a duplicate lookup.
- Add comment explaining the lack of locking.
- Use `$_:block` to simplify the macro_rules.
- Apply `allow(unused_variables)` only to the fallback instead of everything.
---
The second point (`is_available()`) simplifies code that needs to pick an implementation depening on what is available, like `sys/windows/mutex.rs`. Before this change, it'd do its own lookup and keep its own `AtomicUsize` to track the result. Now it can just use `c::AcquireSRWLockExclusive::is_available()` directly.
This will also be useful when park/unpark/CondVar/etc. get improved implementations (e.g. from parking_lot or something else), as the best APIs for those are not available before Windows 8.
Make RawFd implement the RawFd traits
This PR makes `RawFd` implement `AsRawFd`, `IntoRawFd` and `FromRawFd`, so it can be passed to interfaces that use one of those traits as a bound.
- Module name can now be any string, not just an ident.
(Not all Windows api modules are valid Rust identifiers.)
- Adds c::FuncName::is_available() for checking if a function is really
available without having to do a duplicate lookup.
- Add comment explaining the lack of locking.
- Use `$_:block` to simplify the macro_rules.
- Apply allow(unused_variables) only to the fallback instead of
everything.
Use futex-based thread::park/unpark on Linux.
This moves the parking/unparking logic out of `thread/mod.rs` into a module named `thread_parker` in `sys_common`. The current implementation is moved to `sys_common/thread_parker/generic.rs` and the new implementation using futexes is added in `sys_common/thread_parker/futex.rs`.
The posix_spawn had been available in a form suitable for use in a
Command implementation since musl 0.9.12. Use it in a preference to a
fork when possible, to benefit from CLONE_VM|CLONE_VFORK used there.
Use `rtassert!` instead of `assert!` from the child process after fork() in std::sys::unix::process::Command::spawn()
As discussed in #73894, `assert!` panics on failure, which is not signal-safe, and `rtassert!` is a suitable replacement.
Fixes#73894.
r? @Amanieu @cuviper @joshtriplett
The syscalls returning a new file descriptors generally use
lowest-numbered file descriptor not currently opened, without any
exceptions for those corresponding to the standard streams.
Previously when any of standard streams has been closed before starting
the application, operations on std::io::{stderr,stdin,stdout} objects
were likely to operate on other logically unrelated file resources
opened afterwards.
Avoid the issue by reopening the standard streams when they are closed.