Commit Graph

467 Commits

Author SHA1 Message Date
Corey Farwell
6e7533f3ae Rollup merge of #40722 - stjepang:doc-consistency-fixes, r=steveklabnik
Various fixes to wording consistency in the docs

A bunch of random fixes, added punctuation, plurals, backticks, and so on...

r? @steveklabnik
2017-03-22 19:30:32 -04:00
Simonas Kazlauskas
99e4c0ad8b Tracking issue numbers 2017-03-22 18:43:01 +02:00
Simonas Kazlauskas
2f0dd63bbe Checked (and unchecked) slicing for strings?
What is this magic‽
2017-03-22 18:43:01 +02:00
Stjepan Glavina
d6da1d9b46 Various fixes to wording consistency in the docs 2017-03-22 17:19:52 +01:00
Sam Whited
49db656b06 str: Make docs consistently punctuated 2017-03-21 16:09:31 -04:00
bors
6738cd4d47 Auto merge of #40281 - jimmycuadra:try-from-from-str, r=aturon
Rename TryFrom's associated type and implement str::parse using TryFrom.

Per discussion on the tracking issue, naming `TryFrom`'s associated type `Error` is generally more consistent with similar traits in the Rust ecosystem, and what people seem to assume it should be called. It also helps disambiguate from `Result::Err`, the most common "Err".

See https://github.com/rust-lang/rust/issues/33417#issuecomment-269108968.

`TryFrom<&str>` and `FromStr` are equivalent, so have the latter provide the former to ensure that. Using `TryFrom` in the implementation of `str::parse` means types that implement either trait can use it. When we're ready to stabilize `TryFrom`, we should update `FromStr` to
suggest implementing `TryFrom<&str>` instead for new code.

See https://github.com/rust-lang/rust/issues/33417#issuecomment-277175994
and https://github.com/rust-lang/rust/issues/33417#issuecomment-277253827.

Refs #33417.
2017-03-20 05:36:36 +00:00
Corey Farwell
69717170a4 Rollup merge of #40456 - frewsxcv:frewsxcv-docs-function-parens, r=GuillaumeGomez
Remove function invokation parens from documentation links.

This was never established as a convention we should follow in the 'More
API Documentation Conventions' RFC:

https://github.com/rust-lang/rfcs/blob/master/text/1574-more-api-documentation-conventions.md
2017-03-17 08:48:51 -04:00
Jimmy Cuadra
2561dcddf9 Rename TryFrom's associated type and implement str::parse using TryFrom.
Per discussion on the tracking issue, naming `TryFrom`'s associated type
`Error` is generally more consistent with similar traits in the Rust
ecosystem, and what people seem to assume it should be called. It
also helps disambiguate from `Result::Err`, the most common "Err".

See
https://github.com/rust-lang/rust/issues/33417#issuecomment-269108968.

TryFrom<&str> and FromStr are equivalent, so have the latter provide the
former to ensure that. Using TryFrom in the implementation of
`str::parse` means types that implement either trait can use it.
When we're ready to stabilize `TryFrom`, we should update `FromStr` to
suggest implementing `TryFrom<&str>` instead for new code.

See
https://github.com/rust-lang/rust/issues/33417#issuecomment-277175994
and
https://github.com/rust-lang/rust/issues/33417#issuecomment-277253827.

Refs #33417.
2017-03-15 07:51:54 -07:00
Simon Sapin
73370c543e Add tracking issue number for Utf8Error::error_len 2017-03-14 10:03:08 +01:00
Simon Sapin
b5f16a10e9 Replace Utf8Error::resume_from with Utf8Error::error_len
Their relationship is:

* `resume_from = error_len.map(|l| l + valid_up_to)`
* error_len is always one of None, Some(1), Some(2), or Some(3).

When I started using resume_from I almost always ended up subtracting
valid_up_to to obtain error_len.
Therefore the latter is what should be provided in the first place.
2017-03-14 10:02:55 +01:00
Simon Sapin
182044248c Add Utf8Error::resume_from, to help incremental and/or lossy decoding.
Without this, code outside of the standard library needs to reimplement
most of the logic `from_utf8` to interpret the bytes after `valid_up_to()`.
2017-03-14 10:02:45 +01:00
Corey Farwell
e7b0f2badf Remove function invokation parens from documentation links.
This was never established as a convention we should follow in the 'More
API Documentation Conventions' RFC:

https://github.com/rust-lang/rfcs/blob/master/text/1574-more-api-documentation-conventions.md
2017-03-13 21:43:18 -04:00
Simon Sapin
031f9b15df Only keep one copy of the UTF8_CHAR_WIDTH table.
… instead of one of each of libcore and libstd_unicode.

Move the `utf8_char_width` function to `core::str`
under the `str_internals` unstable feature.
2017-03-01 23:25:27 +01:00
Matt Brubeck
b2ac1c9c6b Additional docs for Vec, String, and slice trait impls 2017-02-16 12:12:17 -08:00
bors
408c2f7827 Auto merge of #37926 - bluss:from-utf8-small-simplification, r=sfackler
UTF-8 validation: Compute block end upfront

Simplify the conditional used for ensuring that the whole word loop is
only used if there are at least two whole words left to read.

This makes the function slightly smaller and simpler, a 0-5% reduction
in runtime for various test cases.
2017-01-12 05:14:50 +00:00
bors
468227129d Auto merge of #38066 - bluss:string-slice-error, r=sfackler
Use more specific panic message for &str slicing errors

Separate out of bounds errors from character boundary errors, and print
more details for character boundary errors.

It reports the first error it finds in:

1. begin out of bounds
2. end out of bounds
3. begin <= end violated
3. begin not char boundary
5. end not char boundary.

Example:

    &"abcαβγ"[..4]

    thread 'str::test_slice_fail_boundary_1' panicked at 'byte index 4 is not
    a char boundary; it is inside 'α' (bytes 3..5) of `abcαβγ`'

Fixes #38052
2017-01-03 23:51:42 +00:00
Ulrik Sverdrup
dd3e63aea5 core: Forward ExactSizeIterator::is_empty for Bytes 2016-12-04 15:46:36 +01:00
Ulrik Sverdrup
d83fff3b3b Use more specific panic message for &str slicing errors
Separate out of bounds errors from character boundary errors, and print
more details for character boundary errors.

Example:

    &"abcαβγ"[..4]

    thread 'str::test_slice_fail_boundary_1' panicked at 'byte index 4 is not
    a char boundary; it is inside `α` (bytes 3..5) of `abcαβγ`'
2016-11-30 18:59:58 +01:00
Ulrik Sverdrup
0dffc1e193 utf8 validation: Cleanup code by renaming index variable 2016-11-22 13:47:45 +01:00
Ulrik Sverdrup
4a8b04eda0 utf8 validation: Cleanup code in the ascii fast path 2016-11-22 13:47:45 +01:00
Ulrik Sverdrup
20bd7f000f utf8 validation: Compute block end upfront
Simplify the conditional used for ensuring that the whole word loop is
only used if there are at least two whole words left to read.

This makes the function slightly smaller and simpler, a 0-5% reduction
in runtime for various test cases.
2016-11-21 23:26:31 +01:00
bors
fc2373c5a2 Auto merge of #37888 - bluss:chars-count, r=alexcrichton
Improve .chars().count()

Use a simpler loop to count the `char` of a string: count the
number of non-continuation bytes. Use `count += <conditional>` which the
compiler understands well and can apply loop optimizations to.

benchmark descriptions and results for two configurations:

- ascii: ascii text
- cy: cyrillic text
- jp: japanese text
- words ascii: counting each split_whitespace item from the ascii text
- words jp: counting each split_whitespace item from the jp text

```
x86-64 rustc -Copt-level=3
 name               orig_ ns/iter      cmov_ ns/iter      diff ns/iter   diff %
 count_ascii        1,453 (1755 MB/s)  1,398 (1824 MB/s)           -55   -3.79%
 count_cy           5,990 (856 MB/s)   2,545 (2016 MB/s)        -3,445  -57.51%
 count_jp           3,075 (1169 MB/s)  1,772 (2029 MB/s)        -1,303  -42.37%
 count_words_ascii  4,157 (521 MB/s)   1,797 (1205 MB/s)        -2,360  -56.77%
 count_words_jp     3,337 (1071 MB/s)  1,772 (2018 MB/s)        -1,565  -46.90%

x86-64 rustc -Ctarget-feature=+avx -Copt-level=3
 name               orig_ ns/iter      cmov_ ns/iter      diff ns/iter   diff %
 count_ascii        1,444 (1766 MB/s)  763 (3343 MB/s)            -681  -47.16%
 count_cy           5,871 (874 MB/s)   1,527 (3360 MB/s)        -4,344  -73.99%
 count_jp           2,874 (1251 MB/s)  1,073 (3351 MB/s)        -1,801  -62.67%
 count_words_ascii  4,131 (524 MB/s)   1,871 (1157 MB/s)        -2,260  -54.71%
 count_words_jp     3,253 (1099 MB/s)  1,331 (2686 MB/s)        -1,922  -59.08%
```

I briefly explored a more involved blocked algorithm (looking at 8 or more bytes at a time),
but the code in this PR was always winning `count_words_ascii` in particular (counting
many small strings); this solution is an improvement without tradeoffs.
2016-11-20 17:06:53 -06:00
Oliver Middleton
9e86e18092 Optimise CharIndices::last()
The default implementation of last() goes through the entire iterator
but that's not needed here.
2016-11-20 00:37:48 +00:00
Ulrik Sverdrup
5a3aa2f73c str: Improve .chars().count()
Use a simpler loop to count the `char` of a string: count the
number of non-continuation bytes. Use `count += <conditional>` which the
compiler understands well and can apply loop optimizations to.
2016-11-19 23:46:39 +01:00
Oliver Middleton
de2f61740d Optimise Chars::last()
The default implementation of last() goes through the entire iterator
but that's not needed here.
2016-11-19 18:43:41 +00:00
David Henningsson
971e0be77c str: Fix documentation typo
from_utf8 returns a Result, not an Option.

Signed-off-by: David Henningsson <diwic@ubuntu.com>
2016-09-30 06:13:14 +02:00
Jonathan Turner
b60fc5d16a Rollup merge of #36423 - GuillaumeGomez:eq_impl, r=pnkfelix
Add missing Eq implementations

Part of #36301.
2016-09-22 11:25:01 -07:00
Guillaume Gomez
b4c739dbdd Add missing Eq implementations 2016-09-18 14:26:49 +02:00
athulappadan
49e77dbf25 Documentation of what does for each type 2016-09-11 17:00:09 +05:30
Jeffrey Seyfried
e2ad3be178 Use #[prelude_import] in libcore. 2016-08-24 22:12:23 +00:00
Guillaume Gomez
bc940193c8 Rollup merge of #35910 - tbu-:pr_weird_linebreak, r=alexcrichton
Change a weird line break in `core::str`
2016-08-23 22:48:02 +02:00
Tobias Bucher
0f9cb1b97c Change a weird line break in core::str 2016-08-23 02:05:53 +02:00
Steven Allen
de91872a33 Add a FusedIterator trait.
This trait can be used to avoid the overhead of a fuse wrapper when an iterator
is already well-behaved.

Conforming to: RFC 1581
Closes: #35602
2016-08-18 12:16:29 -04:00
Corey Farwell
f98c55d933 Add documentation example for str::Chars::as_str. 2016-07-28 08:54:48 -04:00
bors
4b89debc7b Auto merge of #34425 - tbu-:pr_len_instead_of_size_hint, r=alexcrichton
Use `len` instead of `size_hint` where appropiate

This makes it clearer that we're not just looking for a lower bound but
rather know that the iterator is an `ExactSizeIterator`.
2016-06-24 09:03:54 -07:00
Alex Crichton
c3e8c178ab std: Fix up stabilization discrepancies
* Remove the deprecated `CharRange` type which was forgotten to be removed
  awhile back.
* Stabilize the `os::$platform::raw::pthread_t` type which was intended to be
  stabilized as part of #32804
2016-06-23 14:08:11 -07:00
Tobias Bucher
8ff5c4394c Use len instead of size_hint where appropiate
This makes it clearer that we're not just looking for a lower bound but
rather know that the iterator is an `ExactSizeIterator`.
2016-06-23 12:26:15 +02:00
bors
728eea7dc1 Auto merge of #33853 - alexcrichton:remove-deprecated, r=aturon
std: Clean out old unstable + deprecated APIs

These should all have been deprecated for at least one cycle, so this commit
cleans them all out.
2016-06-01 15:11:38 -07:00
Alex Crichton
b64c9d5670 std: Clean out old unstable + deprecated APIs
These should all have been deprecated for at least one cycle, so this commit
cleans them all out.
2016-05-30 20:46:32 -07:00
M Farkas-Dyck
db84fc1403 make core::str::next_code_point work on arbitrary iterator 2016-05-27 08:54:52 -08:00
bors
054a4b4019 Auto merge of #32909 - sanxiyn:unused-trait-import-2, r=alexcrichton
Remove unused trait imports
2016-04-16 18:31:11 -07:00
Seo Sanghyeon
01fb27f648 Remove unused trait imports 2016-04-12 22:58:55 +09:00
bors
bed32d83fc Auto merge of #32804 - alexcrichton:stabilize-1.9, r=brson
std: Stabilize APIs for the 1.9 release

This commit applies all stabilizations, renamings, and deprecations that the
library team has decided on for the upcoming 1.9 release. All tracking issues
have gone through a cycle-long "final comment period" and the specific APIs
stabilized/deprecated are:

Stable

* `std::panic`
* `std::panic::catch_unwind` (renamed from `recover`)
* `std::panic::resume_unwind` (renamed from `propagate`)
* `std::panic::AssertUnwindSafe` (renamed from `AssertRecoverSafe`)
* `std::panic::UnwindSafe` (renamed from `RecoverSafe`)
* `str::is_char_boundary`
* `<*const T>::as_ref`
* `<*mut T>::as_ref`
* `<*mut T>::as_mut`
* `AsciiExt::make_ascii_uppercase`
* `AsciiExt::make_ascii_lowercase`
* `char::decode_utf16`
* `char::DecodeUtf16`
* `char::DecodeUtf16Error`
* `char::DecodeUtf16Error::unpaired_surrogate`
* `BTreeSet::take`
* `BTreeSet::replace`
* `BTreeSet::get`
* `HashSet::take`
* `HashSet::replace`
* `HashSet::get`
* `OsString::with_capacity`
* `OsString::clear`
* `OsString::capacity`
* `OsString::reserve`
* `OsString::reserve_exact`
* `OsStr::is_empty`
* `OsStr::len`
* `std::os::unix::thread`
* `RawPthread`
* `JoinHandleExt`
* `JoinHandleExt::as_pthread_t`
* `JoinHandleExt::into_pthread_t`
* `HashSet::hasher`
* `HashMap::hasher`
* `CommandExt::exec`
* `File::try_clone`
* `SocketAddr::set_ip`
* `SocketAddr::set_port`
* `SocketAddrV4::set_ip`
* `SocketAddrV4::set_port`
* `SocketAddrV6::set_ip`
* `SocketAddrV6::set_port`
* `SocketAddrV6::set_flowinfo`
* `SocketAddrV6::set_scope_id`
* `<[T]>::copy_from_slice`
* `ptr::read_volatile`
* `ptr::write_volatile`
* The `#[deprecated]` attribute
* `OpenOptions::create_new`

Deprecated

* `std::raw::Slice` - use raw parts of `slice` module instead
* `std::raw::Repr` - use raw parts of `slice` module instead
* `str::char_range_at` - use slicing plus `chars()` plus `len_utf8`
* `str::char_range_at_reverse` - use slicing plus `chars().rev()` plus `len_utf8`
* `str::char_at` - use slicing plus `chars()`
* `str::char_at_reverse` - use slicing plus `chars().rev()`
* `str::slice_shift_char` - use `chars()` plus `Chars::as_str`
* `CommandExt::session_leader` - use `before_exec` instead.

Closes #27719
cc #27751 (deprecating the `Slice` bits)
Closes #27754
Closes #27780
Closes #27809
Closes #27811
Closes #27830
Closes #28050
Closes #29453
Closes #29791
Closes #29935
Closes #30014
Closes #30752
Closes #31262
cc #31398 (still need to deal with `before_exec`)
Closes #31405
Closes #31572
Closes #31755
Closes #31756
2016-04-12 04:17:36 -07:00
Alex Crichton
552eda70d3 std: Stabilize APIs for the 1.9 release
This commit applies all stabilizations, renamings, and deprecations that the
library team has decided on for the upcoming 1.9 release. All tracking issues
have gone through a cycle-long "final comment period" and the specific APIs
stabilized/deprecated are:

Stable

* `std::panic`
* `std::panic::catch_unwind` (renamed from `recover`)
* `std::panic::resume_unwind` (renamed from `propagate`)
* `std::panic::AssertUnwindSafe` (renamed from `AssertRecoverSafe`)
* `std::panic::UnwindSafe` (renamed from `RecoverSafe`)
* `str::is_char_boundary`
* `<*const T>::as_ref`
* `<*mut T>::as_ref`
* `<*mut T>::as_mut`
* `AsciiExt::make_ascii_uppercase`
* `AsciiExt::make_ascii_lowercase`
* `char::decode_utf16`
* `char::DecodeUtf16`
* `char::DecodeUtf16Error`
* `char::DecodeUtf16Error::unpaired_surrogate`
* `BTreeSet::take`
* `BTreeSet::replace`
* `BTreeSet::get`
* `HashSet::take`
* `HashSet::replace`
* `HashSet::get`
* `OsString::with_capacity`
* `OsString::clear`
* `OsString::capacity`
* `OsString::reserve`
* `OsString::reserve_exact`
* `OsStr::is_empty`
* `OsStr::len`
* `std::os::unix::thread`
* `RawPthread`
* `JoinHandleExt`
* `JoinHandleExt::as_pthread_t`
* `JoinHandleExt::into_pthread_t`
* `HashSet::hasher`
* `HashMap::hasher`
* `CommandExt::exec`
* `File::try_clone`
* `SocketAddr::set_ip`
* `SocketAddr::set_port`
* `SocketAddrV4::set_ip`
* `SocketAddrV4::set_port`
* `SocketAddrV6::set_ip`
* `SocketAddrV6::set_port`
* `SocketAddrV6::set_flowinfo`
* `SocketAddrV6::set_scope_id`
* `<[T]>::copy_from_slice`
* `ptr::read_volatile`
* `ptr::write_volatile`
* The `#[deprecated]` attribute
* `OpenOptions::create_new`

Deprecated

* `std::raw::Slice` - use raw parts of `slice` module instead
* `std::raw::Repr` - use raw parts of `slice` module instead
* `str::char_range_at` - use slicing plus `chars()` plus `len_utf8`
* `str::char_range_at_reverse` - use slicing plus `chars().rev()` plus `len_utf8`
* `str::char_at` - use slicing plus `chars()`
* `str::char_at_reverse` - use slicing plus `chars().rev()`
* `str::slice_shift_char` - use `chars()` plus `Chars::as_str`
* `CommandExt::session_leader` - use `before_exec` instead.

Closes #27719
cc #27751 (deprecating the `Slice` bits)
Closes #27754
Closes #27780
Closes #27809
Closes #27811
Closes #27830
Closes #28050
Closes #29453
Closes #29791
Closes #29935
Closes #30014
Closes #30752
Closes #31262
cc #31398 (still need to deal with `before_exec`)
Closes #31405
Closes #31572
Closes #31755
Closes #31756
2016-04-11 08:57:53 -07:00
Raph Levien
b2db97347b Bit-magic for faster is_char_boundary
The asm generated for b < 128 || b >= 192 is not ideal, as it computes
both sub-inequalities. This patch replaces it with bit magic.

Fixes #32471
2016-04-09 17:16:54 -07:00
Ulrik Sverdrup
5d56e1daed Specialize equality for [T] and comparison for [u8]
Where T is a type that can be compared for equality bytewise, we can use
memcmp. We can also use memcmp for PartialOrd, Ord for [u8] and by
extension &str.

This is an improvement for example for the comparison [u8] == [u8] that
used to emit a loop that compared the slices byte by byte.

One worry here could be that this introduces function calls to memcmp
in contexts where it should really inline the comparison or even
optimize it out, but llvm takes care of recognizing memcmp specifically.
2016-04-05 14:06:20 +02:00
Manish Goregaokar
671027817c Rollup merge of #32456 - bluss:str-zero, r=alexcrichton
Hardcode accepting 0 as a valid str char boundary

If we check explicitly for index == 0, that removes the need to read the
byte at index 0, so it avoids a trip to the string's memory, and it
optimizes out the slicing index' bounds check whenever it is (a constant) zero.
2016-03-26 13:42:04 +05:30
Ulrik Sverdrup
f621193e5e Accept 0 as a valid str char boundary
Index 0 must be a valid char boundary (invariant of str that it contains
valid UTF-8 data).

If we check explicitly for index == 0, that removes the need to read the
byte at index 0, so it avoids a trip to the string's memory, and it
optimizes out the slicing index' bounds check whenever it is zero.

With this change, the following examples all change from having a read of
the byte at 0 and a branch to possibly panicing, to having the bounds
checking optimized away.

```rust
pub fn split(s: &str) -> (&str, &str) {
    s.split_at(0)
}

pub fn both(s: &str) -> &str {
    &s[0..s.len()]
}

pub fn first(s: &str) -> &str {
    &s[..0]
}

pub fn last(s: &str) -> &str {
    &s[0..]
}
```
2016-03-24 01:25:59 +01:00
Ulrik Sverdrup
80e7a1be35 Mark str::split_at inline 2016-03-23 22:03:06 +01:00
Jorge Aparicio
0f02309e4b try! -> ?
Automated conversion using the untry tool [1] and the following command:

```
$ find -name '*.rs' -type f | xargs untry
```

at the root of the Rust repo.

[1]: https://github.com/japaric/untry
2016-03-22 22:01:37 -05:00