Commit Graph

47 Commits

Author SHA1 Message Date
joboet
c14d137bfc std: update internal uses of io::const_error! 2024-11-26 18:38:24 +01:00
Alona Enraght-Moony
c496af64ed Add as_slice/into_slice for IoSlice/IoSliceMut.
Co-authored-by: Mike Pedersen <mike@mikepedersen.dk>
Co-authored-by: Nathan West <Lucretiel@gmail.com>
2024-11-09 18:52:29 +00:00
Benoît du Garreau
4b8a66c908 Add tests 2024-09-23 22:51:27 +02:00
Michael Goulet
c682aa162b Reformat using the new identifier sorting from rustfmt 2024-09-22 19:11:29 -04:00
ranger-ross
24ad26db3b Fixed some typos in the standard library documentation/comments 2024-08-31 14:41:01 +09:00
Matthias Krüger
1a9f91a43e Rollup merge of #109174 - soerenmeier:cursor_fns, r=dtolnay
Replace `io::Cursor::{remaining_slice, is_empty}`

This is a late follow up to the concerns raised in https://github.com/rust-lang/rust/issues/86369.

https://github.com/rust-lang/rust/issues/86369#issuecomment-953096691
> This API seems focussed on the `Read` side of things. When `Seek`ing around and `Write`ing data, `is_empty` becomes confusing and `remaining_slice` is not very useful. When writing, the part of the slice before the cursor is much more interesting. Maybe we should have functions for both? Or a single function that returns both slices? (If we also have a `mut` version, a single function would be useful to allow mutable access to both sides at once.)

New feature name: `cursor_remaining` > `cursor_split`.
Added functions:
```rust
fn split(&self) -> (&[u8], &[u8]);
// fn before(&self) -> &[u8];
// fn after(&self) -> &[u8];
fn split_mut(&mut self) -> (&mut [u8], &mut [u8]);
// fn before_mut(&mut self) -> &mut [u8];
// fn after_mut(&mut self) -> &mut [u8];
```

A question was raised in https://github.com/rust-lang/rust/issues/86369#issuecomment-927124211 about whether to return a lifetime that would reflect the lifetime of the underlying bytes (`impl Cursor<&'a [u8]> { fn after(&self) -> &'a [u8] }`). The downside of doing this would be that it would not be possible to implement these functions generically over `T: AsRef<[u8]>`.

## Update
Based on the review, before* and after* methods where removed.
2024-07-29 07:11:13 +02:00
Nicholas Nethercote
84ac80f192 Reformat use declarations.
The previous commit updated `rustfmt.toml` appropriately. This commit is
the outcome of running `x fmt --all` with the new formatting options.
2024-07-29 08:26:52 +10:00
Sören Meier
10da5553a8 Replace io::Cursor::{remaining_slice, is_empty} with io::Cursor::{split, split_mut} 2024-07-28 21:51:57 +02:00
John Arundel
a19472a93e Fix doc nits
Many tiny changes to stdlib doc comments to make them consistent (for example
"Returns foo", rather than "Return foo", per RFC1574), adding missing periods, paragraph
breaks, backticks for monospace style, and other minor nits.

https://github.com/rust-lang/rfcs/blob/master/text/1574-more-api-documentation-conventions.md#appendix-a-full-conventions-text
2024-07-26 13:26:33 +01:00
Benoît du Garreau
a197ff3259 Address review comments 2024-05-20 17:00:11 +02:00
joboet
91fe6f9343 core: panic on overflow in BorrowedCursor 2024-04-11 18:33:46 +02:00
Ralf Jung
1dd47e0a17 disable OOM test in Miri 2024-03-10 09:24:25 +01:00
Kornel
e49cd1c578 TryReserveError to ErrorKind::OutOfMemory 2024-02-21 16:31:53 +00:00
bors
bea5bebf3d Auto merge of #105917 - a1phyr:read_chain_more_impls, r=workingjubilee
Specialize some methods of `io::Chain`

This PR specializes the implementation of some methods of `io::Chain`, which could bring performance improvements when using it.
2024-02-19 04:43:54 +00:00
Benoît du Garreau
0a42a540c6 Make io::BorrowedCursor::advance safe
This also keeps the old `advance` method under `advance_unchecked` name.

This makes pattern like `std::io::default_read_buf` safe to write.
2024-02-07 16:46:28 +01:00
Conrad Ludgate
4c694db252 add another test to make sure it still works with full reads 2024-02-03 11:46:54 +00:00
Conrad Ludgate
a27e45a71b fix #120603 by adding a check in default_read_buf 2024-02-03 11:30:26 +00:00
bors
e68f935117 Auto merge of #98943 - WilliamVenner:feat/bufread_skip_until, r=dtolnay
Add `BufRead::skip_until`

Alternative version of `BufRead::read_until` that simply discards data, rather than copying it into a buffer.

Useful for situations like skipping irrelevant data in a binary file format that is NUL-terminated.

<details>
<summary>Benchmark</summary>

```
running 2 tests
test bench_read_until ... bench:         123 ns/iter (+/- 6)
test bench_skip_until ... bench:          66 ns/iter (+/- 3)
```

```rs
#![feature(test)]
extern crate test;
use test::Bencher;

use std::io::{ErrorKind, BufRead};

fn skip_until<R: BufRead + ?Sized>(r: &mut R, delim: u8) -> Result<usize, std::io::Error> {
    let mut read = 0;
    loop {
        let (done, used) = {
            let available = match r.fill_buf() {
                Ok(n) => n,
                Err(ref e) if e.kind() == ErrorKind::Interrupted => continue,
                Err(e) => return Err(e),
            };
            match memchr::memchr(delim, available) {
                Some(i) => (true, i + 1),
                None => (false, available.len()),
            }
        };
        r.consume(used);
        read += used;
        if done || used == 0 {
            return Ok(read);
        }
    }
}

const STR: &[u8] = b"Ferris\0Hello, world!\0";

#[bench]
fn bench_skip_until(b: &mut Bencher) {
    b.iter(|| {
        let mut io = std::io::Cursor::new(test::black_box(STR));
        skip_until(&mut io, b'\0').unwrap();
        let mut hello = Vec::with_capacity(b"Hello, world!\0".len());
        let num_bytes = io.read_until(b'\0', &mut hello).unwrap();
        assert_eq!(num_bytes, b"Hello, world!\0".len());
        assert_eq!(hello, b"Hello, world!\0");
    });
}

#[bench]
fn bench_read_until(b: &mut Bencher) {
    b.iter(|| {
        let mut io = std::io::Cursor::new(test::black_box(STR));
        io.read_until(b'\0', &mut Vec::new()).unwrap();
        let mut hello = Vec::with_capacity(b"Hello, world!\0".len());
        let num_bytes = io.read_until(b'\0', &mut hello).unwrap();
        assert_eq!(num_bytes, b"Hello, world!\0".len());
        assert_eq!(hello, b"Hello, world!\0");
    });
}
```
</details>
2023-11-23 22:28:14 +00:00
William Venner
7c1ab71f71 Add assertion to test skip_until return value
The extra `\0` in this commit is needed because the assertion on line 49 will fail otherwise (as `skip_until` stops reading on EOF and therefore does not read a trailing `\0`, returning 6 read bytes rather than the expected 7)
2023-08-03 09:52:57 +01:00
Benoît du Garreau
ebc5970329 Add tests and comments about read_to_string and read_line specializations 2023-07-26 23:31:03 +02:00
William Venner
7c9ad34362 Move BufRead::skip_until test to a more appropriate location 2023-05-18 18:59:36 +01:00
Chris Denton
f74fe8bf4c Limit read size in File::read_to_end loop
This works around performance issues on  Windows by limiting reads the size of reads when the expected size is known.
2023-04-21 20:54:12 +01:00
Matthias Krüger
b9306c231a Rollup merge of #97015 - nrc:read-buf-cursor, r=Mark-Simulacrum
std::io: migrate ReadBuf to BorrowBuf/BorrowCursor

This PR replaces `ReadBuf` (used by the `Read::read_buf` family of methods) with `BorrowBuf` and `BorrowCursor`.

The general idea is to split `ReadBuf` because its API is large and confusing. `BorrowBuf` represents a borrowed buffer which is mostly read-only and (other than for construction) deals only with filled vs unfilled segments. a `BorrowCursor` is a mostly write-only view of the unfilled part of a `BorrowBuf` which distinguishes between initialized and uninitialized segments. For `Read::read_buf`, the caller would create a `BorrowBuf`, then pass a `BorrowCursor` to `read_buf`.

In addition to the major API split, I've made the following smaller changes:

* Removed some methods entirely from the API (mostly the functionality can be replicated with two calls rather than a single one)
* Unified naming, e.g., by replacing initialized with init and assume_init with set_init
* Added an easy way to get the number of bytes written to a cursor (`written` method)

As well as simplifying the API (IMO), this approach has the following advantages:

* Since we pass the cursor by value, we remove the 'unsoundness footgun' where a malicious `read_buf` could swap out the `ReadBuf`.
* Since `read_buf` cannot write into the filled part of the buffer, we prevent the filled part shrinking or changing which could cause underflow for the caller or unexpected behaviour.

## Outline

```rust
pub struct BorrowBuf<'a>

impl Debug for BorrowBuf<'_>

impl<'a> From<&'a mut [u8]> for BorrowBuf<'a>
impl<'a> From<&'a mut [MaybeUninit<u8>]> for BorrowBuf<'a>

impl<'a> BorrowBuf<'a> {
    pub fn capacity(&self) -> usize
    pub fn len(&self) -> usize
    pub fn init_len(&self) -> usize
    pub fn filled(&self) -> &[u8]
    pub fn unfilled<'this>(&'this mut self) -> BorrowCursor<'this, 'a>
    pub fn clear(&mut self) -> &mut Self
    pub unsafe fn set_init(&mut self, n: usize) -> &mut Self
}

pub struct BorrowCursor<'buf, 'data>

impl<'buf, 'data> BorrowCursor<'buf, 'data> {
    pub fn clone<'this>(&'this mut self) -> BorrowCursor<'this, 'data>
    pub fn capacity(&self) -> usize
    pub fn written(&self) -> usize
    pub fn init_ref(&self) -> &[u8]
    pub fn init_mut(&mut self) -> &mut [u8]
    pub fn uninit_mut(&mut self) -> &mut [MaybeUninit<u8>]
    pub unsafe fn as_mut(&mut self) -> &mut [MaybeUninit<u8>]
    pub unsafe fn advance(&mut self, n: usize) -> &mut Self
    pub fn ensure_init(&mut self) -> &mut Self
    pub unsafe fn set_init(&mut self, n: usize) -> &mut Self
    pub fn append(&mut self, buf: &[u8])
}
```

## TODO

* ~~Migrate non-unix libs and tests~~
* ~~Naming~~
  * ~~`BorrowBuf` or `BorrowedBuf` or `SliceBuf`? (We might want an owned equivalent for the async IO traits)~~
  * ~~Should we rename the `readbuf` module? We might keep the name indicate it includes both the buf and cursor variations and someday the owned version too. Or we could change it. It is not publicly exposed, so it is not that important~~.
  * ~~`read_buf` method: we read into the cursor now, so the `_buf` suffix is a bit weird.~~
* ~~Documentation~~
* Tests are incomplete (I adjusted existing tests, but did not add new ones).

cc https://github.com/rust-lang/rust/issues/78485, https://github.com/rust-lang/rust/issues/94741
supersedes: https://github.com/rust-lang/rust/pull/95770, https://github.com/rust-lang/rust/pull/93359
fixes #93305
2022-08-28 09:35:11 +02:00
Ralf Jung
8c8dc125b1 make many std tests work in Miri 2022-08-18 18:07:39 -04:00
Nick Cameron
1a2122fff0 non-linux platforms
Signed-off-by: Nick Cameron <nrc@ncameron.org>
2022-08-05 17:18:51 +01:00
Nick Cameron
c1aae4d279 std::io: migrate ReadBuf to BorrowBuf/BorrowCursor
Signed-off-by: Nick Cameron <nrc@ncameron.org>
2022-08-04 15:29:32 +01:00
Yuki Okushi
0ecbcbb0ac Rollup merge of #95040 - frank-king:fix/94981, r=Mark-Simulacrum
protect `std::io::Take::limit` from overflow in `read`

Resolves #94981
2022-07-25 18:46:47 +09:00
Frank King
64ac04567b protect std::io::Take::limit from overflow in read
fixs #94981
2022-05-29 11:44:30 +08:00
Mara Bos
1890372c9e Update tests. 2022-03-11 17:38:29 +01:00
Thom Chiovoloni
554918e311 Hide Repr details from io::Error, and rework io::Error::new_const. 2022-02-04 18:47:29 -08:00
DrMeepster
98c6200b16 read_buf 2021-11-02 22:47:20 -07:00
John Kugelman
a990c76d84 Optimize File::read_to_end and read_to_string
Reading a file into an empty vector or string buffer can incur
unnecessary `read` syscalls and memory re-allocations as the buffer
"warms up" and grows to its final size. This is perhaps a necessary evil
with generic readers, but files can be read in smarter by checking the
file size and reserving that much capacity.

`std::fs::read` and `read_to_string` already perform this optimization:
they open the file, reads its metadata, and call `with_capacity` with
the file size. This ensures that the buffer does not need to be resized
and an initial string of small `read` syscalls.

However, if a user opens the `File` themselves and calls
`file.read_to_end` or `file.read_to_string` they do not get this
optimization.

```rust
let mut buf = Vec::new();
file.read_to_end(&mut buf)?;
```

I searched through this project's codebase and even here are a *lot* of
examples of this. They're found all over in unit tests, which isn't a
big deal, but there are also several real instances in the compiler and
in Cargo. I've documented the ones I found in a comment here:

https://github.com/rust-lang/rust/issues/89516#issuecomment-934423999

Most telling, the `Read` trait and the `read_to_end` method both show
this exact pattern as examples of how to use readers. What this says to
me is that this shouldn't be solved by simply fixing the instances of it
in this codebase. If it's here it's certain to be prevalent in the wider
Rust ecosystem.

To that end, this commit adds specializations of `read_to_end` and
`read_to_string` directly on `File`. This way it's no longer a minor
footgun to start with an empty buffer when reading a file in.

A nice side effect of this change is that code that accesses a `File` as
a bare `Read` constraint or via a `dyn Read` trait object will benefit.
For example, this code from `compiler/rustc_serialize/src/json.rs`:

```rust
pub fn from_reader(rdr: &mut dyn Read) -> Result<Json, BuilderError> {
    let mut contents = Vec::new();
    match rdr.read_to_end(&mut contents) {
```

Related changes:

- I also added specializations to `BufReader` to delegate to
  `self.inner`'s methods. That way it can call `File`'s optimized
  implementations if the inner reader is a file.

- The private `std::io::append_to_string` function is now marked
  `unsafe`.

- `File::read_to_string` being more efficient means that the performance
  note for `io::read_to_string` can be softened. I've added @camelid's
  suggested wording from:

  https://github.com/rust-lang/rust/issues/80218#issuecomment-936806502
2021-10-07 18:42:02 -04:00
John Kugelman
9b9c24ec7f Fix read_to_end to not grow an exact size buffer
If you know how much data to expect and use `Vec::with_capacity` to
pre-allocate a buffer of that capacity, `Read::read_to_end` will still
double its capacity. It needs some space to perform a read, even though
that read ends up returning `0`.

It's a bummer to carefully pre-allocate 1GB to read a 1GB file into
memory and end up using 2GB.

This fixes that behavior by special casing a full buffer and reading
into a small "probe" buffer instead. If that read returns `0` then it's
confirmed that the buffer was the perfect size. If it doesn't, the probe
buffer is appended to the normal buffer and the read loop continues.

Fixing this allows several workarounds in the standard library to be
removed:

- `Take` no longer needs to override `Read::read_to_end`.
- The `reservation_size` callback that allowed `Take` to inhibit the
  previous over-allocation behavior isn't needed.
- `fs::read` doesn't need to reserve an extra byte in
  `initial_buffer_size`.

Curiously, there was a unit test that specifically checked that
`Read::read_to_end` *does* over-allocate. I removed that test, too.
2021-09-22 00:54:27 -04:00
Aris Merchant
6d34a2e007 Stabilize Seek::rewind 2021-07-01 15:08:20 -07:00
bors
ce1d5611a2 Auto merge of #85815 - YuhanLiin:buf-read-data-left, r=m-ou-se
Add has_data_left() to BufRead

This is a continuation of #40747 and also addresses #40745. The problem with the previous PR was that it had "eof" in its method name. This PR uses a more descriptive method name, but I'm open to changing it.
2021-06-18 20:11:51 +00:00
Mara Bos
b7dd942e15 Rollup merge of #86202 - a1phyr:spec_io_bytes_size_hint, r=m-ou-se
Specialize `io::Bytes::size_hint` for more types

Improve the result of `<io::Bytes as Iterator>::size_hint` for some readers. I did not manage to specialize `SizeHint` for `io::Cursor`

Side question: would it be interesting for `io::Read` to have an optional `size_hint` method ?
2021-06-17 23:40:58 +02:00
Benoît du Garreau
2cbd5d1df5 Specialize io::Bytes::size_hint for more types 2021-06-10 19:16:55 +02:00
Thomas de Zeeuw
fd14c52075 Rename IoSlice(Mut)::advance_slice to advance_slices 2021-06-05 13:06:10 +02:00
YuhanLiin
e76929ff98 Add has_data_left() to BufRead 2021-05-29 17:47:51 -04:00
Thomas de Zeeuw
3803c090f8 Rename IoSlice(Mut)::advance to advance_slice
To make way for a new IoSlice(Mut)::advance function that advances a
single slice.

Also changes the signature to accept a `&mut &mut [IoSlice]`, not
returning anything. This will better match the future IoSlice::advance
function.
2021-05-29 10:08:00 +02:00
Mara Bos
7b71719faf Use io::Error::new_const everywhere to avoid allocations. 2021-03-21 20:22:38 +01:00
Xavientois
389e638c05 Add tests for SizeHint implementations 2021-01-31 08:34:42 -05:00
Xavientois
c8e0f8aaa3 Use fully qualified syntax to avoid dyn 2021-01-31 08:31:35 -05:00
The8472
18bfe2a66b move copy specialization tests to their own module 2020-11-13 22:38:27 +01:00
The8472
ad9b07c7e5 add benchmarks 2020-11-13 19:46:37 +01:00
The8472
67a6059aa5 move tests module into separate file 2020-11-13 19:45:38 +01:00
Lzu Tao
a4e926daee std: move "mod tests/benches" to separate files
Also doing fmt inplace as requested.
2020-08-31 02:56:59 +00:00