Commit Graph

299 Commits

Author SHA1 Message Date
Dan Gohman
23a5c21415 Fix a typo in a comment. 2020-10-18 07:21:41 -07:00
LinkTed
79273fa30c Fix cannot find type ucred for MacOs by using fake definitions 2020-10-17 19:36:11 +02:00
bors
4cb7ef0f94 Auto merge of #77455 - asm89:faster-spawn, r=kennytm
Use posix_spawn() on unix if program is a path

Previously `Command::spawn` would fall back to the non-posix_spawn based
implementation if the `PATH` environment variable was possibly changed.
On systems with a modern (g)libc `posix_spawn()` can be significantly
faster. If program is a path itself the `PATH` environment variable is
not used for the lookup and it should be safe to use the
`posix_spawnp()` method. [1]

We found this, because we have a cli application that effectively runs a
lot of subprocesses. It would sometimes noticeably hang while printing
output. Profiling showed that the process was spending the majority of
time in the kernel's `copy_page_range` function while spawning
subprocesses. During this time the process is completely blocked from
running, explaining why users were reporting the cli app hanging.

Through this we discovered that `std::process::Command` has a fast and
slow path for process execution. The fast path is backed by
`posix_spawnp()` and the slow path by fork/exec syscalls being called
explicitly. Using fork for process creation is supposed to be fast, but
it slows down as your process uses more memory.  It's not because the
kernel copies the actual memory from the parent, but it does need to
copy the references to it (see `copy_page_range` above!).  We ended up
using the slow path, because the command spawn implementation in falls
back to the slow path if it suspects the PATH environment variable was
changed.

Here is a smallish program demonstrating the slowdown before this code
change:

```
use std::process::Command;
use std::time::Instant;

fn main() {
    let mut args = std::env::args().skip(1);
    if let Some(size) = args.next() {
        // Allocate some memory
        let _xs: Vec<_> = std::iter::repeat(0)
            .take(size.parse().expect("valid number"))
            .collect();

        let mut command = Command::new("/bin/sh");
        command
            .arg("-c")
            .arg("echo hello");

        if args.next().is_some() {
            println!("Overriding PATH");
            command.env("PATH", std::env::var("PATH").expect("PATH env var"));
        }

        let now = Instant::now();
        let child = command
            .spawn()
            .expect("failed to execute process");

        println!("Spawn took: {:?}", now.elapsed());

        let output = child.wait_with_output().expect("failed to wait on process");
        println!("Output: {:?}", output);
    } else {
        eprintln!("Usage: prog [size]");
        std::process::exit(1);
    }
    ()
}
```

Running it and passing different amounts of elements to use to allocate
memory shows that the time taken for `spawn()` can differ quite
significantly. In latter case the `posix_spawnp()` implementation is 30x
faster:

```
$ cargo run --release 10000000
...
Spawn took: 324.275µs
hello
$ cargo run --release 10000000 changepath
...
Overriding PATH
Spawn took: 2.346809ms
hello
$ cargo run --release 100000000
...
Spawn took: 387.842µs
hello
$ cargo run --release 100000000 changepath
...
Overriding PATH
Spawn took: 13.434677ms
hello
```

[1]: 5f72f9800b/posix/execvpe.c (L81)
2020-10-17 06:16:00 +00:00
Yuki Okushi
9abf81afa8 Rollup merge of #77900 - Thomasdezeeuw:fdatasync, r=dtolnay
Use fdatasync for File::sync_data on more OSes

Add support for the following OSes:
 * Android
 * FreeBSD: https://www.freebsd.org/cgi/man.cgi?query=fdatasync&sektion=2
 * OpenBSD: https://man.openbsd.org/OpenBSD-5.8/fsync.2
 * NetBSD: https://man.netbsd.org/fdatasync.2
 * illumos: https://illumos.org/man/3c/fdatasync
2020-10-17 05:36:45 +09:00
Dan Gohman
91a9f83dd1 Define fs::hard_link to not follow symlinks.
POSIX leaves it implementation-defined whether `link` follows symlinks.
In practice, for example, on Linux it does not and on FreeBSD it does.
So, switch to `linkat`, so that we can pick a behavior rather than
depending on OS defaults.

Pick the option to not follow symlinks. This is somewhat arbitrary, but
seems the less surprising choice because hard linking is a very
low-level feature which requires the source and destination to be on
the same mounted filesystem, and following a symbolic link could end
up in a different mounted filesystem.
2020-10-16 12:05:49 -07:00
Mara Bos
0f0257be10 Take some of sys/vxworks/process/* from sys/unix instead. 2020-10-16 06:22:05 +02:00
Mara Bos
408db0da85 Take sys/vxworks/{os,path,pipe} from sys/unix instead. 2020-10-16 06:22:00 +02:00
Mara Bos
71bb1dc2a0 Take sys/vxworks/{fd,fs,io} from sys/unix instead. 2020-10-16 06:19:00 +02:00
Mara Bos
ba483c51df Take sys/vxworks/args from sys/unix instead. 2020-10-16 06:19:00 +02:00
Mara Bos
dce405ae3d Take sys/vxworks/net from sys/unix instead. 2020-10-16 06:19:00 +02:00
Mara Bos
5d526f6eee Take sys/vxworks/thread from sys/unix instead. 2020-10-16 06:19:00 +02:00
Mara Bos
c8628f43bf Take sys/vxworks/stack_overflow from sys/unix instead. 2020-10-16 06:18:59 +02:00
Mara Bos
44a2af32cc Remove lifetime from StaticMutex and assume 'static.
StaticMutex is only ever used with as a static (as the name already
suggests). So it doesn't have to be generic over a lifetime, but can
simply assume 'static.

This 'static lifetime guarantees the object is never moved, so this is
no longer a manually checked requirement for unsafe calls to lock().
2020-10-14 09:52:03 +02:00
Thomas de Zeeuw
8c0c7ec4ec Use fdatasync for File::sync_data on more OSes
Add support for the following OSes:
 * Android
 * FreeBSD: https://www.freebsd.org/cgi/man.cgi?query=fdatasync&sektion=2
 * OpenBSD: https://man.openbsd.org/OpenBSD-5.8/fsync.2
 * NetBSD: https://man.netbsd.org/fdatasync.2
 * illumos: https://illumos.org/man/3c/fdatasync
2020-10-13 15:57:31 +02:00
LinkTed
d8c75d9f91 Fix unresolved imports for recv_vectored_with_ancillary_from, send_vectored_with_ancillary_to and SocketAncillary 2020-10-11 19:23:41 +02:00
bors
bc74dd711f Auto merge of #77727 - thomcc:mach-info-order, r=Amanieu
Avoid SeqCst or static mut in mach_timebase_info and QueryPerformanceFrequency caches

This patch went through a couple iterations but the end result is replacing a pattern where an `AtomicUsize` (updated with many SeqCst ops) guards a `static mut` with a single `AtomicU64` that is known to use 0 as a value indicating that it is not initialized.

The code in both places exists to cache values used in the conversion of Instants to Durations on macOS, iOS, and Windows.

I have no numbers to prove that this improves performance (It seems a little futile to benchmark something like this), but it's much simpler, safer, and in practice we'd expect it to be faster everywhere where Relaxed operations on AtomicU64 are cheaper than SeqCst operations on AtomicUsize, which is a lot of places.

Anyway, it also removes a bunch of unsafe code and greatly simplifies the logic, so IMO that alone would be worth it unless it was a regression.

If you want to take a look at the assembly output though, see https://godbolt.org/z/rbr6vn for x86_64, https://godbolt.org/z/cqcbqv for aarch64 (Note that this just the output of the mac side, but i'd expect the windows part to be the same and don't feel like doing another godbolt for it). There are several versions of this function in the godbolt:

- `info_new`: version in the current patch
- `info_less_new`: version in initial PR
- `info_original`: version currently in the tree
- `info_orig_but_better_orderings`: a version that just tries to change the original code's orderings from SeqCst to the (probably) minimal orderings required for soundness/correctness.

The biggest concern I have here is if we can use AtomicU64, or if there are targets that dont have it that this code supports. AFAICT: no. (If that changes in the future, it's easy enough to do something different for them)

r? `@Amanieu` because he caught a couple issues last time I tried to do a patch reducing orderings 😅

---

<details>
<summary>I rewrote this whole message so the original is inside here</summary>

I happened to notice the code we use for caching the result of mach_timebase_info uses SeqCst exclusively.

However, thinking a little more, it's actually pretty easy to avoid the static mut by packing the timebase info into an AtomicU64.

This entirely avoids needing to do the compare_exchange. The AtomicU64 can be read/written using Relaxed ops, which on current macos/ios platforms (x86_64/aarch64) have no overhead compared to direct loads/stores. This simplifies the code and makes it a lot safer too.

I have no numbers to prove that this improves performance (It seems a little futile to benchmark something like this), although it should do that on both targets it applies to.

That said, it also removes a bunch of unsafe code and simplifies the logic (arguably at least — there are only two states now, initialized or not), so I think it's a net win even without concrete numbers.

If you want to take a look at the assembly output though, see below. It has the new version, the original, and a version of the original with lower Orderings (which is still worse than the version in this PR)

- godbolt.org/z/obfqf9 x86_64-apple-darwin

- godbolt.org/z/Wz5cWc aarch64-unknown-linux-gnu (godbolt can't do aarch64-apple-ios but that doesn't matter here)

A different (and more efficient) option than this would be to just use the AtomicU64 and use the knowledge that after initialization the denominator should be nonzero... That felt like it's relying on too many things I'm not confident in, so I didn't want to do that.
</details>
2020-10-11 14:06:04 +00:00
LinkTed
64facfef51 Fix unresolved link to SocketAncillary 2020-10-10 15:19:13 +02:00
LinkTed
7b596f2e13 Fix libc is ambiguous for Windows 2020-10-10 15:19:13 +02:00
LinkTed
fc65f6a0ce Fix import errors for #[cfg(doc)] target 2020-10-10 15:19:13 +02:00
LinkTed
a81764731c Add fake definitions for Windows 2020-10-10 15:19:13 +02:00
LinkTed
d0069a0cc5 Fix imports for MacOs 2020-10-10 15:19:13 +02:00
LinkTed
1ae54e560a Change imports for cfg(doc) 2020-10-10 15:19:13 +02:00
LinkTed
e9bf69954c Remove passcred for emscripten 2020-10-10 15:19:13 +02:00
LinkTed
6b0c3dfe00 Remove unnecessary trailing semicolon 2020-10-10 15:19:13 +02:00
LinkTed
ce167f8be7 Fix type mismatching for different OSes. 2020-10-10 15:19:13 +02:00
LinkTed
889c9272cb Remove SocketCred for emscripten 2020-10-10 15:19:13 +02:00
LinkTed
0fcb834832 Fix unused import for IoSliceMut for macos 2020-10-10 15:19:13 +02:00
LinkTed
31e6e3896d Fix SO_PASSCRED for macos 2020-10-10 15:19:13 +02:00
LinkTed
b01ce2cfd0 Fix MSG_CMSG_CLOEXEC for macos 2020-10-10 15:19:13 +02:00
LinkTed
c2a1b50140 Add conditional compilation for import 2020-10-10 15:19:13 +02:00
LinkTed
e0cedba63e Fix cfg condition for test 2020-10-10 15:19:13 +02:00
LinkTed
d30508f95c Remove target_os, which does not have SO_PASSCRED constant in libc 2020-10-10 15:19:13 +02:00
LinkTed
1f6d7dcc0a Remove target_os, which does not have cmsghdr struct in libc 2020-10-10 15:19:13 +02:00
LinkTed
7b476d87fb Remove target_os, which does not have MSG_CMSG_CLOEXEC constant in libc 2020-10-10 15:19:12 +02:00
LinkTed
db902bca3a Add the code of the tracking issue 2020-10-10 15:19:12 +02:00
LinkTed
cc085e9170 Replace assert with unreachable 2020-10-10 15:19:12 +02:00
LinkTed
e61148f98a Cast boolean into int directly in function set_passcred 2020-10-10 15:19:12 +02:00
LinkTed
d0b133cdc6 Remove unsupported target_os for SocketCred 2020-10-10 15:19:12 +02:00
LinkTed
5964d599ac Change standard types to libc types 2020-10-10 15:19:12 +02:00
LinkTed
1902711f38 Change name of struct to SocketCred 2020-10-10 15:19:12 +02:00
LinkTed
eeea5c23b4 Change API to unsafe and add doc comments 2020-10-10 15:19:12 +02:00
LinkTed
686964f0f5 Add set_passcred and passcred methods to UnixStream and UnixDatagram 2020-10-10 15:19:12 +02:00
LinkTed
19c5fdda7c Rename test.rs to tests.rs 2020-10-10 15:19:12 +02:00
LinkTed
a91fd7328c Add doc comments 2020-10-10 15:19:12 +02:00
LinkTed
46764d48bb Add doc(cfg(...)) 2020-10-10 15:19:12 +02:00
LinkTed
1869141e54 Reduce impl trait by using macro in raw_fd.rs 2020-10-10 15:19:12 +02:00
LinkTed
53791b3ff4 Move conditional compilation to the upper module and sort the target OS list alphabetically 2020-10-10 15:19:12 +02:00
LinkTed
07ed6afc6d Remove unnecessary path 2020-10-10 15:19:12 +02:00
LinkTed
6ed9bface6 Use fill instead of memset 2020-10-10 15:19:12 +02:00
LinkTed
1f3195a5df Remove inner function in bind, connect and send_to 2020-10-10 15:19:12 +02:00