Remove some redundant checks from BufReader
The implementation of BufReader contains a lot of redundant checks. While any one of these checks is not particularly expensive to execute, especially when taken together they dramatically inhibit LLVM's ability to make subsequent optimizations by confusing data flow increasing the code size of anything that uses BufReader.
In particular, these changes have a ~2x increase on the benchmark that this adds a `black_box` to. I'm adding that `black_box` here just in case LLVM gets clever enough to remove the reads entirely. Right now it can't, but these optimizations are really setting it up to do so.
We get this optimization by factoring all the actual buffer management and bounds-checking logic into a new module inside `bufreader` with a new `Buffer` type. This makes it much easier to ensure that we have correctly encapsulated the management of the region of the buffer that we have read bytes into, and it lets us provide a new faster way to do small reads. `Buffer::consume_with` lets a caller do a read from the buffer with a single bounds check, instead of the double-check that's required to use `buffer` + `consume`.
Unfortunately I'm not aware of a lot of open-source usage of `BufReader` in perf-critical environments. Some time ago I tweaked this code because I saw `BufReader` in a profile at work, and I contributed some benchmarks to the `bincode` crate which exercise `BufReader::buffer`. These changes appear to help those benchmarks at little, but all these sorts of benchmarks are kind of fragile so I'm wary of quoting anything specific.
The implementation of BufReader contains a lot of redundant checks.
While any one of these checks is not particularly expensive to execute,
especially when taken together they dramatically inhibit LLVM's ability
to make subsequent optimizations.
attempt to optimise vectored write
benchmarked:
old:
```
test io::cursor::tests::bench_write_vec ... bench: 68 ns/iter (+/- 2)
test io::cursor::tests::bench_write_vec_vectored ... bench: 913 ns/iter (+/- 31)
```
new:
```
test io::cursor::tests::bench_write_vec ... bench: 64 ns/iter (+/- 0)
test io::cursor::tests::bench_write_vec_vectored ... bench: 747 ns/iter (+/- 27)
```
More unsafe than I wanted (and less gains) in the end, but it still does the job
Fix documentation for `with_capacity` and `reserve` families of methods
Fixes#95614
Documentation for the following methods
- `with_capacity`
- `with_capacity_in`
- `with_capacity_and_hasher`
- `reserve`
- `reserve_exact`
- `try_reserve`
- `try_reserve_exact`
was inconsistent and often not entirely correct where they existed on the following types
- `Vec`
- `VecDeque`
- `String`
- `OsString`
- `PathBuf`
- `BinaryHeap`
- `HashSet`
- `HashMap`
- `BufWriter`
- `LineWriter`
since the allocator is allowed to allocate more than the requested capacity in all such cases, and will frequently "allocate" much more in the case of zero-sized types (I also checked `BufReader`, but there the docs appear to be accurate as it appears to actually allocate the exact capacity).
Some effort was made to make the documentation more consistent between types as well.
Documentation for the following methods
with_capacity
with_capacity_in
with_capacity_and_hasher
reserve
reserve_exact
try_reserve
try_reserve_exact
was inconsistent and often not entirely correct where they existed on the following types
Vec
VecDeque
String
OsString
PathBuf
BinaryHeap
HashSet
HashMap
BufWriter
LineWriter
since the allocator is allowed to allocate more than the requested capacity in all such cases, and will frequently "allocate" much more in the case of zero-sized types (I also checked BufReader, but there the docs appear to be accurate as it appears to actually allocate the exact capacity).
Some effort was made to make the documentation more consistent between types as well.
Fix with_capacity* methods for Vec
Fix *reserve* methods for Vec
Fix docs for *reserve* methods of VecDeque
Fix docs for String::with_capacity
Fix docs for *reserve* methods of String
Fix docs for OsString::with_capacity
Fix docs for *reserve* methods on OsString
Fix docs for with_capacity* methods on HashSet
Fix docs for *reserve methods of HashSet
Fix docs for with_capacity* methods of HashMap
Fix docs for *reserve methods on HashMap
Fix expect messages about OOM in doctests
Fix docs for BinaryHeap::with_capacity
Fix docs for *reserve* methods of BinaryHeap
Fix typos
Fix docs for with_capacity on BufWriter and LineWriter
Fix consistent use of `hasher` between `HashMap` and `HashSet`
Fix warning in doc test
Add test for capacity of vec with ZST
Fix doc test error
std::io: Modify some ReadBuf method signatures to return `&mut Self`
This allows using `ReadBuf` in a builder-like style and to setup a `ReadBuf` and
pass it to `read_buf` in a single expression, e.g.,
```
// With this PR:
reader.read_buf(ReadBuf::uninit(buf).assume_init(init_len))?;
// Previously:
let mut buf = ReadBuf::uninit(buf);
buf.assume_init(init_len);
reader.read_buf(&mut buf)?;
```
r? `@sfackler`
cc https://github.com/rust-lang/rust/issues/78485, https://github.com/rust-lang/rust/issues/94741
impl Read and Write for VecDeque<u8>
Implementing `Read` and `Write` for `VecDeque<u8>` fills in the VecDeque api surface where `Vec<u8>` and `Cursor<Vec<u8>>` already impl Read and Write. Not only for completeness, but VecDeque in particular is a very handy mock interface for a TCP echo service, if only it supported Read/Write.
Since this PR is just an impl trait, I don't think there is a way to limit it behind a feature flag, so it's "insta-stable". Please correct me if I'm wrong here, not trying to rush stability.
* For read and read_buf, only the front slice of a discontiguous
VecDeque is copied. The VecDeque is advanced after reading, making any
back slice available for reading with a second call to Read::read(_buf).
* For write, the VecDeque always appends the entire slice to the end,
growing its allocation when necessary.
This allows using `ReadBuf` in a builder-like style and to setup a `ReadBuf` and
pass it to `read_buf` in a single expression, e.g.,
```
// With this PR:
reader.read_buf(ReadBuf::uninit(buf).assume_init(init_len))?;
// Previously:
let mut buf = ReadBuf::uninit(buf);
buf.assume_init(init_len);
reader.read_buf(&mut buf)?;
```
Signed-off-by: Nick Cameron <nrc@ncameron.org>
Strict Provenance MVP
This patch series examines the question: how bad would it be if we adopted
an extremely strict pointer provenance model that completely banished all
int<->ptr casts.
The key insight to making this approach even *vaguely* pallatable is the
ptr.with_addr(addr) -> ptr
function, which takes a pointer and an address and creates a new pointer
with that address and the provenance of the input pointer. In this way
the "chain of custody" is completely and dynamically restored, making the
model suitable even for dynamic checkers like CHERI and Miri.
This is not a formal model, but lots of the docs discussing the model
have been updated to try to the *concept* of this design in the hopes
that it can be iterated on.
See #95228