Commit Graph

28 Commits

Author SHA1 Message Date
Nicole Mazzuca
7f25580512 [stdio][windows] Use MBTWC and WCTMB 2023-02-25 11:15:23 -08:00
Lukas Markeffsky
76e216f29b Use associated items of char instead of freestanding items in core::char 2023-01-14 11:58:41 +01:00
Thom Chiovoloni
1b8b2dc2ff Avoid MaybeUninit::uninit_array() 2022-08-30 06:10:55 -07:00
Thom Chiovoloni
2f9bd1a236 Avoid zeroing large stack buffers in stdio on Windows 2022-08-30 03:06:22 -07:00
Thom Chiovoloni
554918e311 Hide Repr details from io::Error, and rework io::Error::new_const. 2022-02-04 18:47:29 -08:00
Matthias Krüger
50a66b75dc Rollup merge of #91754 - Patrick-Poitras:rm-4byte-minimum-stdio-windows, r=Mark-Simulacrum
Modifications to `std::io::Stdin` on Windows so that there is no longer a 4-byte buffer minimum in read().

This is an attempted fix of issue #91722, where a too-small buffer was passed to the read function of stdio on Windows. This caused an error to be returned when `read_to_end` or `read_to_string` were called. Both delegate to `std::io::default_read_to_end`, which creates a buffer that is of length >0, and forwards it to `std::io::Stdin::read()`. The latter method returns an error if the length of the buffer is less than 4, as there might not be enough space to allocate a UTF-16 character. This creates a problem when the buffer length is in `0 < N < 4`, causing the bug.

The current modification creates an internal buffer, much like the one used for the write functions

I'd also like to acknowledge the help of ``@agausmann`` and ``@hkratz`` in detecting and isolating the bug, and for suggestions that made the fix possible.

Couple disclaimers:

- Firstly, I didn't know where to put code to replicate the bug found in the issue. It would probably be wise to add that case to the testing suite, but I'm afraid that I don't know _where_ that test should be added.
- Secondly, the code is fairly fundamental to IO operations, so my fears are that this may cause some undesired side effects ~or performance loss in benchmarks.~ The testing suite runs on my computer, and it does fix the issue noted in #91722.
- Thirdly, I left the "surrogate" field in the Stdin struct, but from a cursory glance, it seems to be serving the same purpose for other functions. Perhaps merging the two would be appropriate.

Finally, this is my first pull request to the rust language, and as such some things may be weird/unidiomatic/plain out bad. If there are any obvious improvements I could do to the code, or any other suggestions, I would appreciate them.

Edit: Closes #91722
2022-01-04 16:34:14 +01:00
PFPoitras
d49d1d4499 Modifications to buffer UTF-16 internally so that there is no longer a 4-byte buffer minimum. Include suggestions from @agausmann and @Mark-Simulacrum. 2021-12-15 18:35:29 -04:00
Frank Steffahn
a957cefda6 Fix a bunch of typos 2021-12-14 16:40:43 +01:00
Arlo Siemsen
273e522af6 Fix ctrl-c causing reads of stdin to return empty on Windows.
Fixes #89177
2021-10-01 08:53:13 -07:00
bors
cc9bb1522e Auto merge of #83342 - Count-Count:win-console-incomplete-utf8, r=m-ou-se
Allow writing of incomplete UTF-8 sequences to the Windows console via stdout/stderr

# Problem
Writes of just an incomplete UTF-8 byte sequence (e.g. `b"\xC3"` or `b"\xF0\x9F"`)  to stdout/stderr with a Windows console attached error with `io::ErrorKind::InvalidData, "Windows stdio in console mode does not support writing non-UTF-8 byte sequences"` even though further writes could complete the codepoint. This is currently a rare occurence since the [linewritershim](2c56ea38b0/library/std/src/io/buffered/linewritershim.rs) implementation flushes complete lines immediately and buffers up to 1024 bytes for incomplete lines. It can still happen as described in #83258.

The problem will become more pronounced once the developer can switch stdout/stderr from line-buffered to block-buffered or immediate when the changes in the "Switchable buffering for Stdout" pull request (#78515) get merged.

# Patch description
If there is at least one valid UTF-8 codepoint all valid UTF-8 is passed through to the extracted `write_valid_utf8_to_console()` fn. The new code only comes into play if `write()` is being passed a short byte slice comprising an incomplete UTF-8 codepoint. In this case up to three bytes are buffered in the `IncompleteUtf8` struct associated with `Stdout` / `Stderr`. The bytes are accepted one at a time. As soon as an error can be detected `io::ErrorKind::InvalidData, "Windows stdio in console mode does not support writing non-UTF-8 byte sequences"` is returned. Once a complete UTF-8 codepoint is received it is passed to the `write_valid_utf8_to_console()` and the buffer length is set to zero.

Calling `flush()` will neither error nor write anything if an incomplete codepoint is present in the buffer.

# Tests
Currently there are no Windows-specific tests for console writing code at all. Writing (regression) tests for this problem is a bit challenging since unit tests and UI tests don't run in a console and suddenly popping up another console window might be surprising to developers running the testsuite and it might not work at all in CI builds. To just test the new functionality in unit tests the code would need to be refactored. Some guidance on how to proceed would be appreciated.

# Public API changes
* `std::str::verifications::utf8_char_width()` would be exposed as `std::str::utf8_char_width()` behind the "str_internals" feature gate.

# Related issues
* Fixes #83258.
* PR #78515 will exacerbate the problem.

# Open questions
* Add tests?
* Squash into one commit with better commit message?
2021-09-02 03:31:17 +00:00
Frank Steffahn
2f9ddf3bc7 Fix typos “an”→“a” and a few different ones that appeared in the same search 2021-08-22 18:15:49 +02:00
Dan Gohman
d15418586c I/O safety.
Introduce `OwnedFd` and `BorrowedFd`, and the `AsFd` trait, and
implementations of `AsFd`, `From<OwnedFd>` and `From<T> for OwnedFd`
for relevant types, along with Windows counterparts for handles and
sockets.

Tracking issue:
 - <https://github.com/rust-lang/rust/issues/87074>

RFC:
 - <https://github.com/rust-lang/rfcs/blob/master/text/3128-io-safety.md>
2021-08-19 12:02:39 -07:00
Ali Malik
e43254aad1 Fix may not to appropriate might not or must not 2021-07-29 01:15:20 -04:00
Count Count
dd3b79e9ff comment pos 2021-03-24 18:23:01 +01:00
Count Count
7cfbe54294 assert!() instead of panic!() for expected invariant 2021-03-24 18:22:25 +01:00
Count Count
3103f5f550 rename fn write_valid_utf8() to write_valid_utf8_to_console() 2021-03-24 10:24:05 +01:00
Count Count
34cfe383e5 correct comment 2021-03-24 10:06:31 +01:00
Count Count
fb1fa97fdc use io::Error::new_const() everywhere 2021-03-24 07:17:24 +01:00
Count Count
52713a4781 fix 2021-03-24 07:11:55 +01:00
Count Count
d114694042 Reject byte if it cannot start a valid UTF-8 sequence. 2021-03-24 07:11:54 +01:00
Count Count
0202273d40 fix c&p error 2021-03-24 07:11:54 +01:00
Count Count
60b149f182 Export utf8_char_width() publicly in core::std behind the "str_internals" feature gate and use it in sys::windows::stdio instead of reimplementing it there. 2021-03-24 07:11:54 +01:00
Count Count
a941e68e08 fix fmt 2021-03-24 07:11:51 +01:00
Count Count
27393d5ca6 fix incomplete UTF-8 writes in Windows console stdio 2021-03-24 07:10:22 +01:00
Mara Bos
7b71719faf Use io::Error::new_const everywhere to avoid allocations. 2021-03-21 20:22:38 +01:00
Tomasz Miąsko
4a00421ba4 Make raw standard stream constructors const 2020-08-21 13:17:20 +02:00
Tomasz Miąsko
479c23bb49 Remove result type from raw standard streams constructors
Raw standard streams constructors are infallible. Remove unnecessary
result type.
2020-08-21 13:17:20 +02:00
mark
2c31b45ae8 mv std libs to library/ 2020-07-27 19:51:13 -05:00