Commit Graph

116 Commits

Author SHA1 Message Date
Ralf Jung
20ab57e582 library: allow some unused things in Miri 2022-10-26 09:48:47 +02:00
bors
7fcf850d79 Auto merge of #103137 - dtolnay:readdir, r=Mark-Simulacrum
Eliminate 280-byte memset from ReadDir iterator

This guy:

1536ab1b38/library/std/src/sys/unix/fs.rs (L589)

It turns out `libc::dirent64` is quite big—https://docs.rs/libc/0.2.135/libc/struct.dirent64.html. In #103135 this memset accounted for 0.9% of the runtime of iterating a big directory.

Almost none of the big zeroed value is ever used. We memcpy a tiny prefix (19 bytes) into it, and then read just 9 bytes (`d_ino` and `d_type`) back out. We can read exactly those 9 bytes we need directly from the original entry_ptr instead.

## History

This code got added in #93459 and tweaked in #94272 and #94750.

Prior to #93459, there was no memset but a full 280 bytes were being copied from the entry_ptr.

<table><tr><td>copy 280 bytes</td></tr></table>

This was not legal because not all of those bytes might be initialized, or even allocated, depending on the length of the directory entry's name, leading to a segfault. That PR fixed the segfault by creating a new zeroed dirent64 and copying just the guaranteed initialized prefix into it.

<table><tr><td>memset 280 bytes</td><td>copy 19 bytes</td></tr></table>

However this was still buggy because it used `addr_of!((*entry_ptr).d_name)`, which is considered UB by Miri in the case that the full extent of entry_ptr is not in bounds of the same allocation. (Arguably this shouldn't be a requirement, but here we are.)

The UB got fixed by #94272 by replacing `addr_of` with some pointer manipulation based on `offset_from`, but still fundamentally the same operation.

<table><tr><td>memset 280 bytes</td><td>copy 19 bytes</td></tr></table>

Then #94750 noticed that only 9 of those 19 bytes were even being used, so we could pick out only those 9 to put in the ReadDir value.

<table><tr><td>memset 280 bytes</td><td>copy 19 bytes</td><td>copy 9 bytes</td></tr></table>

After my PR we just grab the 9 needed bytes directly from entry_ptr.

<table><tr><td>copy 9 bytes</td></tr></table>

The resulting code is more complex but I believe still worthwhile to land for the following reason. This is an extremely straightforward thing to accomplish in C and clearly libc assumes that; literally just `entry_ptr->d_name`. The extra work in comparison to accomplish it in Rust is not an example of any actual safety being provided by Rust. I believe it's useful to have uncovered that and think about what could be done in the standard library or language to support this obvious operation better.

## References

- https://man7.org/linux/man-pages/man3/readdir.3.html
2022-10-23 18:55:40 +00:00
David Tolnay
0bb6eb1526 Eliminate 280-byte memset from ReadDir iterator 2022-10-16 23:43:35 -07:00
Alex Saveau
727335878d Support DirEntry metadata calls in miri
Signed-off-by: Alex Saveau <saveau.alexandre@gmail.com>
2022-10-16 12:14:27 -07:00
Matthias Krüger
51320b3a16 Rollup merge of #102227 - devnexen:solarish_get_path, r=m-ou-se
fs::get_path solarish version.

similar to linux, albeit there is no /proc/self notion on solaris
 based system thus flattening the difference for simplification sake.
2022-10-11 18:59:47 +02:00
bors
81f3919303 Auto merge of #102850 - JohnTitor:rollup-lze1w03, r=JohnTitor
Rollup of 8 pull requests

Successful merges:

 - #101118 (fs::get_mode enable getting the data via fcntl/F_GETFL on major BSD)
 - #102072 (Add `ptr::Alignment` type)
 - #102799 (rustdoc: remove hover gap in file picker)
 - #102820 (Show let-else suggestion on stable.)
 - #102829 (rename `ImplItemKind::TyAlias` to `ImplItemKind::Type`)
 - #102831 (Don't use unnormalized type in `Ty::fn_sig` call in rustdoc `clean_middle_ty`)
 - #102834 (Remove unnecessary `lift`/`lift_to_tcx` calls from rustdoc)
 - #102838 (remove cfg(bootstrap) from Miri)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
2022-10-09 18:15:26 +00:00
Yuki Okushi
d0f1cf5de7 Rollup merge of #101118 - devnexen:fs_getmode_bsd, r=Mark-Simulacrum
fs::get_mode enable getting the data via fcntl/F_GETFL on major BSD

supporting this flag.
2022-10-10 00:09:39 +09:00
Alex Saveau
86974b83af Reduce CString allocations in std as much as possible
Signed-off-by: Alex Saveau <saveau.alexandre@gmail.com>
2022-10-03 11:13:17 -07:00
beetrees
39c0b00cf9 Error instead of panicking when setting file times if the passed SystemTime doesn't fit into the required type 2022-10-01 03:22:55 +01:00
David Carlier
2ea770d067 fs::get_path solarish version. 2022-09-26 06:41:27 +01:00
David Carlier
c8f73e79b3 fs::get_mode enable getting the data via fcntl/F_GETFL on major BSD
supporting this flag.
2022-08-28 10:43:30 +01:00
Matthias Krüger
b9306c231a Rollup merge of #97015 - nrc:read-buf-cursor, r=Mark-Simulacrum
std::io: migrate ReadBuf to BorrowBuf/BorrowCursor

This PR replaces `ReadBuf` (used by the `Read::read_buf` family of methods) with `BorrowBuf` and `BorrowCursor`.

The general idea is to split `ReadBuf` because its API is large and confusing. `BorrowBuf` represents a borrowed buffer which is mostly read-only and (other than for construction) deals only with filled vs unfilled segments. a `BorrowCursor` is a mostly write-only view of the unfilled part of a `BorrowBuf` which distinguishes between initialized and uninitialized segments. For `Read::read_buf`, the caller would create a `BorrowBuf`, then pass a `BorrowCursor` to `read_buf`.

In addition to the major API split, I've made the following smaller changes:

* Removed some methods entirely from the API (mostly the functionality can be replicated with two calls rather than a single one)
* Unified naming, e.g., by replacing initialized with init and assume_init with set_init
* Added an easy way to get the number of bytes written to a cursor (`written` method)

As well as simplifying the API (IMO), this approach has the following advantages:

* Since we pass the cursor by value, we remove the 'unsoundness footgun' where a malicious `read_buf` could swap out the `ReadBuf`.
* Since `read_buf` cannot write into the filled part of the buffer, we prevent the filled part shrinking or changing which could cause underflow for the caller or unexpected behaviour.

## Outline

```rust
pub struct BorrowBuf<'a>

impl Debug for BorrowBuf<'_>

impl<'a> From<&'a mut [u8]> for BorrowBuf<'a>
impl<'a> From<&'a mut [MaybeUninit<u8>]> for BorrowBuf<'a>

impl<'a> BorrowBuf<'a> {
    pub fn capacity(&self) -> usize
    pub fn len(&self) -> usize
    pub fn init_len(&self) -> usize
    pub fn filled(&self) -> &[u8]
    pub fn unfilled<'this>(&'this mut self) -> BorrowCursor<'this, 'a>
    pub fn clear(&mut self) -> &mut Self
    pub unsafe fn set_init(&mut self, n: usize) -> &mut Self
}

pub struct BorrowCursor<'buf, 'data>

impl<'buf, 'data> BorrowCursor<'buf, 'data> {
    pub fn clone<'this>(&'this mut self) -> BorrowCursor<'this, 'data>
    pub fn capacity(&self) -> usize
    pub fn written(&self) -> usize
    pub fn init_ref(&self) -> &[u8]
    pub fn init_mut(&mut self) -> &mut [u8]
    pub fn uninit_mut(&mut self) -> &mut [MaybeUninit<u8>]
    pub unsafe fn as_mut(&mut self) -> &mut [MaybeUninit<u8>]
    pub unsafe fn advance(&mut self, n: usize) -> &mut Self
    pub fn ensure_init(&mut self) -> &mut Self
    pub unsafe fn set_init(&mut self, n: usize) -> &mut Self
    pub fn append(&mut self, buf: &[u8])
}
```

## TODO

* ~~Migrate non-unix libs and tests~~
* ~~Naming~~
  * ~~`BorrowBuf` or `BorrowedBuf` or `SliceBuf`? (We might want an owned equivalent for the async IO traits)~~
  * ~~Should we rename the `readbuf` module? We might keep the name indicate it includes both the buf and cursor variations and someday the owned version too. Or we could change it. It is not publicly exposed, so it is not that important~~.
  * ~~`read_buf` method: we read into the cursor now, so the `_buf` suffix is a bit weird.~~
* ~~Documentation~~
* Tests are incomplete (I adjusted existing tests, but did not add new ones).

cc https://github.com/rust-lang/rust/issues/78485, https://github.com/rust-lang/rust/issues/94741
supersedes: https://github.com/rust-lang/rust/pull/95770, https://github.com/rust-lang/rust/pull/93359
fixes #93305
2022-08-28 09:35:11 +02:00
Ralf Jung
438e49c1cb silence some unused-fn warnings in miri std builds 2022-08-18 18:07:39 -04:00
Nick Cameron
ac70aea985 Address reviewer comments
Signed-off-by: Nick Cameron <nrc@ncameron.org>
2022-08-18 10:34:40 +01:00
Matthias Krüger
b8b3ead67a Rollup merge of #100249 - Meziu:master, r=joshtriplett
Fix HorizonOS regression in FileTimes

The changes in #98246 caused a regression for multiple Newlib-based systems. This is just a fix including HorizonOS to the list of  targets which require a workaround.

``@AzureMarker`` ``@ian-h-chamberlain``
r? ``@nagisa``
2022-08-14 20:16:00 +02:00
Vincenzo Palazzo
d91dff3c1b promote debug_assert to assert
Signed-off-by: Vincenzo Palazzo <vincenzopalazzodev@gmail.com>
2022-08-11 01:18:45 +00:00
Andrea Ciliberti
926f58745e Fix HorizonOS regression in FileTimes 2022-08-07 19:30:05 +02:00
Nick Cameron
1a2122fff0 non-linux platforms
Signed-off-by: Nick Cameron <nrc@ncameron.org>
2022-08-05 17:18:51 +01:00
Nick Cameron
c1aae4d279 std::io: migrate ReadBuf to BorrowBuf/BorrowCursor
Signed-off-by: Nick Cameron <nrc@ncameron.org>
2022-08-04 15:29:32 +01:00
Ivan Markov
e86c128aa3 FilesTimes support does not build for ESP-IDF 2022-08-03 19:30:23 +00:00
bors
1f5d8d49eb Auto merge of #98246 - joshtriplett:times, r=m-ou-se
Support setting file accessed/modified timestamps

Add `struct FileTimes` to contain the relevant file timestamps, since
most platforms require setting all of them at once. (This also allows
for future platform-specific extensions such as setting creation time.)

Add `File::set_file_time` to set the timestamps for a `File`.

Implement the `sys` backends for UNIX, macOS (which needs to fall back
to `futimes` before macOS 10.13 because it lacks `futimens`), Windows,
and WASI.
2022-08-01 06:44:43 +00:00
Josh Triplett
f8061ddb03 Fix warnings in stubbed out set_times 2022-07-30 13:28:17 -07:00
David CARLIER
e39b44a076 Implement fs::get_path for FreeBSD.
Using `F_KINFO` fcntl flag, the kf_structsize field
needs to be set beforehand for that effect.
2022-07-25 23:25:15 +01:00
Josh Triplett
11d9be6359 Stub out set_times to return unsupported on Redox
Redox doesn't appear to support `UTIME_OMIT`, so we can't set file times
individually.
2022-07-22 03:52:50 -07:00
Vladimir Michael Eatwell
439d64a83c Library changes for Apple WatchOS 2022-07-20 08:57:36 +01:00
Josh Triplett
3da17293e7 Don't fall back to futimes on Android; it needs a newer API level than futimens
Just return `io::ErrorKind::Unsupported` instead.
2022-07-15 02:54:06 -07:00
Josh Triplett
e387cff7a3 Also use fallback for futimens on Android
futimens requires Android API level 19, and std still supports older API
levels.
2022-07-15 02:54:06 -07:00
Josh Triplett
61b45c670b Support setting file accessed/modified timestamps
Add `struct FileTimes` to contain the relevant file timestamps, since
most platforms require setting all of them at once. (This also allows
for future platform-specific extensions such as setting creation time.)

Add `File::set_file_time` to set the timestamps for a `File`.

Implement the `sys` backends for UNIX, macOS (which needs to fall back
to `futimes` before macOS 10.13 because it lacks `futimens`), Windows,
and WASI.
2022-07-15 02:54:06 -07:00
Ian Chamberlain
a49d14f089 Update libc::stat field names
See https://github.com/Meziu/rust-horizon/pull/14
2022-06-13 20:44:58 -07:00
Meziu
4e808f87cc Horizon OS STD support
Co-authored-by: Ian Chamberlain <ian.h.chamberlain@gmail.com>
Co-authored-by: Mark Drobnak <mark.drobnak@gmail.com>
2022-06-13 20:44:39 -07:00
Mara Bos
4f212f08cf Use Rust 2021 prelude in std itself. 2022-05-09 11:12:32 +02:00
Josh Stone
fec4818fdb Use statx's 64-bit times on 32-bit linux-gnu 2022-05-06 08:50:53 -07:00
bors
18f314e702 Auto merge of #94609 - esp-rs:esp-idf-stat-type-fixes, r=Mark-Simulacrum
espidf: fix stat

Marking as draft as currently dependant on [a libc fix](https://github.com/rust-lang/libc/pull/2708) and release.
2022-04-24 19:16:20 +00:00
Geoffry Song
cff3f1e8d5 remove_dir_all_recursive: treat ELOOP the same as ENOTDIR 2022-04-20 00:50:03 +00:00
Scott Mabin
3569d43b50 espidf: fix stat
* corect type usage with new type definitions in libc
2022-04-19 17:00:09 +01:00
Marcus Calhoun-Lopez
8c18844324 Fix build on i686-apple-darwin systems
On 32-bit systems, fdopendir is called `_fdopendir$INODE64$UNIX2003`.
On 64-bit systems, fdopendir is called `_fdopendir$INODE64`.
2022-03-28 12:52:14 -07:00
Marcus Calhoun-Lopez
c2d5c64132 Fix build on i686-apple-darwin systems
Replace `target_arch = "x86_64"` with `not(target_arch = "aarch64")` so that i686-apple-darwin systems dynamically choose implementation.
2022-03-28 12:52:14 -07:00
Matthias Krüger
acb7ed141b Rollup merge of #94749 - RalfJung:remove-dir-all-miri, r=cuviper
remove_dir_all: use fallback implementation on Miri

Fixes https://github.com/rust-lang/miri/issues/1966

The new implementation requires `openat`, `unlinkat`, and `fdopendir`. These cannot easily be shimmed in Miri since libstd does not expose APIs corresponding to them. So for now it is probably easiest to just use the fallback code in Miri. Nobody should run Miri as root anyway...
2022-03-20 09:14:58 +01:00
bors
163c207fc2 Auto merge of #94750 - cuviper:dirent64_min, r=joshtriplett
unix: reduce the size of DirEntry

On platforms where we call `readdir` instead of `readdir_r`, we store
the name as an allocated `CString` for variable length. There's no point
carrying around a full `dirent64` with its fixed-length `d_name` too.
2022-03-09 02:17:58 +00:00
Josh Stone
e8b9ba84be unix: reduce the size of DirEntry
On platforms where we call `readdir` instead of `readdir_r`, we store
the name as an allocated `CString` for variable length. There's no point
carrying around a full `dirent64` with its fixed-length `d_name` too.
2022-03-08 13:36:01 -08:00
Ralf Jung
2a2b212ea3 remove_dir_all: use fallback implementation on Miri 2022-03-08 16:26:10 -05:00
Josh Stone
ef3e33bd16 unix: Avoid name conversions in remove_dir_all_recursive
Each recursive call was creating an `OsString` for a `&Path`, only for
it to be turned into a `CString` right away. Instead we can directly
pass `.name_cstr()`, saving two allocations each time.
2022-03-07 18:51:53 -08:00
bors
2631aeef82 Auto merge of #94272 - tavianator:readdir-reclen-for-real, r=cuviper
fs: Don't dereference a pointer to a too-small allocation

ptr::addr_of!((*ptr).field) still requires ptr to point to an
appropriate allocation for its type.  Since the pointer returned by
readdir() can be smaller than sizeof(struct dirent), we need to entirely
avoid dereferencing it as that type.

Link: https://github.com/rust-lang/miri/pull/1981#issuecomment-1048278492
Link: https://github.com/rust-lang/rust/pull/93459#discussion_r795089971
2022-03-07 04:48:23 +00:00
Hans Kratz
735f60c34f Integrate macos x86-64 remove_dir_all() impl. Step 2: readd 2022-03-04 13:47:50 +01:00
Hans Kratz
41b4423cdf Integrate macos x86-64 remove_dir_all() impl. Step 1: remove 2022-03-04 13:47:36 +01:00
Hans Kratz
e427333071 remove_dir_all(): try recursing first instead of trying to unlink()
This only affects the `slow` code path, if there is no `dirent.d_type` or if
the type is `DT_UNKNOWN`.

POSIX specifies that calling `unlink()` or `unlinkat(..., 0)` on a directory can
succeed:
> "The _path_ argument shall not name a directory unless the process has
> appropriate privileges and the implementation supports using _unlink()_ on
> directories."
This however can cause orphaned directories requiring an fsck e.g. on Illumos
UFS, so we have to avoid that in the common case. We now just try to recurse
into it first and unlink() if we can't open it as a directory.
2022-03-04 13:33:35 +01:00
Tavian Barnes
478cf8b3a4 fs: Don't dereference a pointer to a too-small allocation
ptr::addr_of!((*ptr).field) still requires ptr to point to an
appropriate allocation for its type.  Since the pointer returned by
readdir() can be smaller than sizeof(struct dirent), we need to entirely
avoid dereferencing it as that type.

Link: https://github.com/rust-lang/miri/pull/1981#issuecomment-1048278492
Link: https://github.com/rust-lang/rust/pull/93459#discussion_r795089971
2022-02-23 09:51:02 -05:00
Thom Chiovoloni
554918e311 Hide Repr details from io::Error, and rework io::Error::new_const. 2022-02-04 18:47:29 -08:00
Matthias Krüger
cd27f1b56e Rollup merge of #93471 - cuviper:direntry-file_type-stat, r=the8472
unix: Use metadata for `DirEntry::file_type` fallback

When `DirEntry::file_type` fails to match a known `d_type`, we should
fall back to `DirEntry::metadata` instead of a bare `lstat`, because
this is faster and more reliable on targets with `fstatat`.
2022-01-31 07:00:44 +01:00
Josh Stone
d70b9c03ec unix: Use metadata for DirEntry::file_type fallback
When `DirEntry::file_type` fails to match a known `d_type`, we should
fall back to `DirEntry::metadata` instead of a bare `lstat`, because
this is faster and more reliable on targets with `fstatat`.
2022-01-29 16:58:18 -08:00