This commit transitions the compiler to using the new exception handling
instructions in LLVM for implementing unwinding for MSVC. This affects both 32
and 64-bit MSVC as they're both now using SEH-based strategies. In terms of
standard library support, lots more details about how SEH unwinding is
implemented can be found in the commits.
In terms of trans, this change necessitated a few modifications:
* Branches were added to detect when the old landingpad instruction is used or
the new cleanuppad instruction is used to `trans::cleanup`.
* The return value from `cleanuppad` is not stored in an `alloca` (because it
cannot be).
* Each block in trans now has an `Option<LandingPad>` instead of `is_lpad: bool`
for indicating whether it's in a landing pad or not. The new exception
handling intrinsics require that on MSVC each `call` inside of a landing pad
is annotated with which landing pad that it's in. This change to the basic
block means that whenever a `call` or `invoke` instruction is generated we
know whether to annotate it as part of a cleanuppad or not.
* Lots of modifications were made to the instruction builders to construct the
new instructions as well as pass the tagging information for the call/invoke
instructions.
* The translation of the `try` intrinsics for MSVC has been overhauled to use
the new `catchpad` instruction. The filter function is now also a
rustc-generated function instead of a purely libstd-defined function. The
libstd definition still exists, it just has a stable ABI across architectures
and leaves some of the really weird implementation details to the compiler
(e.g. the `localescape` and `localrecover` intrinsics).
This commit removes the `-D warnings` flag being passed through the makefiles to
all crates to instead be a crate attribute. We want these attributes always
applied for all our standard builds, and this is more amenable to Cargo-based
builds as well.
Note that all `deny(warnings)` attributes are gated with a `cfg(stage0)`
attribute currently to match the same semantics we have today
This commit implements the stabilization of the custom hasher support intended
for 1.7 but left out due to some last-minute questions that needed some
decisions. A summary of the actions done in this PR are:
Stable
* `std:#️⃣:BuildHasher`
* `BuildHasher::Hasher`
* `BuildHasher::build_hasher`
* `std:#️⃣:BuildHasherDefault`
* `HashMap::with_hasher`
* `HashMap::with_capacity_and_hasher`
* `HashSet::with_hasher`
* `HashSet::with_capacity_and_hasher`
* `std::collections::hash_map::RandomState`
* `RandomState::new`
Deprecated
* `std::collections::hash_state`
* `std::collections::hash_state::HashState` - this trait was also moved into
`std::hash` with a reexport here to ensure that we can have a blanket impl to
prevent immediate breakage on nightly. Note that this is unstable in both
location.
* `HashMap::with_hash_state` - renamed
* `HashMap::with_capacity_and_hash_state` - renamed
* `HashSet::with_hash_state` - renamed
* `HashSet::with_capacity_and_hash_state` - renamed
Closes#27713
This commit implements the stabilization of the custom hasher support intended
for 1.7 but left out due to some last-minute questions that needed some
decisions. A summary of the actions done in this PR are:
Stable
* `std:#️⃣:BuildHasher`
* `BuildHasher::Hasher`
* `BuildHasher::build_hasher`
* `std:#️⃣:BuildHasherDefault`
* `HashMap::with_hasher`
* `HashMap::with_capacity_and_hasher`
* `HashSet::with_hasher`
* `HashSet::with_capacity_and_hasher`
* `std::collections::hash_map::RandomState`
* `RandomState::new`
Deprecated
* `std::collections::hash_state`
* `std::collections::hash_state::HashState` - this trait was also moved into
`std::hash` with a reexport here to ensure that we can have a blanket impl to
prevent immediate breakage on nightly. Note that this is unstable in both
location.
* `HashMap::with_hash_state` - renamed
* `HashMap::with_capacity_and_hash_state` - renamed
* `HashSet::with_hash_state` - renamed
* `HashSet::with_capacity_and_hash_state` - renamed
Closes#27713
The previous example did not do what its description said. In it panicked on integer overflow in debug mode, and went into an infinite loop in release mode (wrapping back to 0 after printing 254).
This commit removes the `-D warnings` flag being passed through the makefiles to
all crates to instead be a crate attribute. We want these attributes always
applied for all our standard builds, and this is more amenable to Cargo-based
builds as well.
Note that all `deny(warnings)` attributes are gated with a `cfg(stage0)`
attribute currently to match the same semantics we have today
Use cold functions for panic formatting Option::expect, Result::unwrap, expect
These methods are marked inline, but insert a big chunk of formatting
code, as well as other error path related code, such as
deallocating a std::io::Error if you have one.
We can explicitly separate out that code path into a function that is
never inline, since the panicking case should always be rare.
Option::expect, Result::unwrap, unwrap_err, expect
These methods are marked inline, but insert a big chunk of formatting
code, as well as other error path related code, such as deallocating
a std::io::Error if you have one.
We can explicitly separate out that code path into a function that is
never inline, since the panicking case should always be rare.
Use raw pointers to avoid aliasing in str::split_at_mut
Introduce private function from_raw_parts_mut for str to factor out the logic.
We want to use raw pointers here instead of duplicating a &mut str, to
be on safer ground w.r.t rust aliasing rules.
This has already been fixed for slices in PR #27358, issue #27357
Introduce private function from_raw_parts_mut for str to factor out the logic.
We want to use raw pointers here instead of duplicating a &mut str, to
be on safer ground w.r.t rust aliasing rules.
Fix spacing style of `T: Bound` in docs
The space between `T` and `Bound` is the typical style used in code and
produced by rustdoc's rendering. Fixed first in Reflect's docs and then
I fixed all occurrences in docs I could find.
For good codegen here, we need a lock step iteration where the loop
bound is only checked once per iteration; .zip() unfortunately does not
optimize this way.
If we use a counted loop, and make sure that llvm sees that the bounds
check condition is the same as the loop bound condition, the bounds
checks are optimized out. For this reason we need to slice `from`
(apparently) redundantly.
This commit restores the old formulation of clone_from_slice. In this
shape, clone_from_slice will again optimize into calling memcpy where possible
(for example for &[u8] or &[i32]).
The space between `T` and `Bound` is the typical style used in code and
produced by rustdoc's rendering. Fixed first in Reflect's docs and then
I fixed all occurrences in docs I could find.
This commit stabilizes and deprecates the FCP (final comment period) APIs for
the upcoming 1.7 beta release. The specific APIs which changed were:
Stabilized
* `Path::strip_prefix` (renamed from `relative_from`)
* `path::StripPrefixError` (new error type returned from `strip_prefix`)
* `Ipv4Addr::is_loopback`
* `Ipv4Addr::is_private`
* `Ipv4Addr::is_link_local`
* `Ipv4Addr::is_multicast`
* `Ipv4Addr::is_broadcast`
* `Ipv4Addr::is_documentation`
* `Ipv6Addr::is_unspecified`
* `Ipv6Addr::is_loopback`
* `Ipv6Addr::is_unique_local`
* `Ipv6Addr::is_multicast`
* `Vec::as_slice`
* `Vec::as_mut_slice`
* `String::as_str`
* `String::as_mut_str`
* `<[T]>::clone_from_slice` - the `usize` return value is removed
* `<[T]>::sort_by_key`
* `i32::checked_rem` (and other signed types)
* `i32::checked_neg` (and other signed types)
* `i32::checked_shl` (and other signed types)
* `i32::checked_shr` (and other signed types)
* `i32::saturating_mul` (and other signed types)
* `i32::overflowing_add` (and other signed types)
* `i32::overflowing_sub` (and other signed types)
* `i32::overflowing_mul` (and other signed types)
* `i32::overflowing_div` (and other signed types)
* `i32::overflowing_rem` (and other signed types)
* `i32::overflowing_neg` (and other signed types)
* `i32::overflowing_shl` (and other signed types)
* `i32::overflowing_shr` (and other signed types)
* `u32::checked_rem` (and other unsigned types)
* `u32::checked_shl` (and other unsigned types)
* `u32::saturating_mul` (and other unsigned types)
* `u32::overflowing_add` (and other unsigned types)
* `u32::overflowing_sub` (and other unsigned types)
* `u32::overflowing_mul` (and other unsigned types)
* `u32::overflowing_div` (and other unsigned types)
* `u32::overflowing_rem` (and other unsigned types)
* `u32::overflowing_neg` (and other unsigned types)
* `u32::overflowing_shl` (and other unsigned types)
* `u32::overflowing_shr` (and other unsigned types)
* `ffi::IntoStringError`
* `CString::into_string`
* `CString::into_bytes`
* `CString::into_bytes_with_nul`
* `From<CString> for Vec<u8>`
* `From<CString> for Vec<u8>`
* `IntoStringError::into_cstring`
* `IntoStringError::utf8_error`
* `Error for IntoStringError`
Deprecated
* `Path::relative_from` - renamed to `strip_prefix`
* `Path::prefix` - use `components().next()` instead
* `os::unix::fs` constants - moved to the `libc` crate
* `fmt::{radix, Radix, RadixFmt}` - not used enough to stabilize
* `IntoCow` - conflicts with `Into` and may come back later
* `i32::{BITS, BYTES}` (and other integers) - not pulling their weight
* `DebugTuple::formatter` - will be removed
* `sync::Semaphore` - not used enough and confused with system semaphores
Closes#23284
cc #27709 (still lots more methods though)
Closes#27712Closes#27722Closes#27728Closes#27735Closes#27729Closes#27755Closes#27782Closes#27798
This is a bit weird since unsized types can't be used in trait objects,
but Any is *also* used as pure marker trait since Reflect isn't stable.
There are many cases (e.g. TypeMap) where all you need is a TypeId.
r? @aturon
This commit stabilizes and deprecates the FCP (final comment period) APIs for
the upcoming 1.7 beta release. The specific APIs which changed were:
Stabilized
* `Path::strip_prefix` (renamed from `relative_from`)
* `path::StripPrefixError` (new error type returned from `strip_prefix`)
* `Ipv4Addr::is_loopback`
* `Ipv4Addr::is_private`
* `Ipv4Addr::is_link_local`
* `Ipv4Addr::is_multicast`
* `Ipv4Addr::is_broadcast`
* `Ipv4Addr::is_documentation`
* `Ipv6Addr::is_unspecified`
* `Ipv6Addr::is_loopback`
* `Ipv6Addr::is_unique_local`
* `Ipv6Addr::is_multicast`
* `Vec::as_slice`
* `Vec::as_mut_slice`
* `String::as_str`
* `String::as_mut_str`
* `<[T]>::clone_from_slice` - the `usize` return value is removed
* `<[T]>::sort_by_key`
* `i32::checked_rem` (and other signed types)
* `i32::checked_neg` (and other signed types)
* `i32::checked_shl` (and other signed types)
* `i32::checked_shr` (and other signed types)
* `i32::saturating_mul` (and other signed types)
* `i32::overflowing_add` (and other signed types)
* `i32::overflowing_sub` (and other signed types)
* `i32::overflowing_mul` (and other signed types)
* `i32::overflowing_div` (and other signed types)
* `i32::overflowing_rem` (and other signed types)
* `i32::overflowing_neg` (and other signed types)
* `i32::overflowing_shl` (and other signed types)
* `i32::overflowing_shr` (and other signed types)
* `u32::checked_rem` (and other unsigned types)
* `u32::checked_neg` (and other unsigned types)
* `u32::checked_shl` (and other unsigned types)
* `u32::saturating_mul` (and other unsigned types)
* `u32::overflowing_add` (and other unsigned types)
* `u32::overflowing_sub` (and other unsigned types)
* `u32::overflowing_mul` (and other unsigned types)
* `u32::overflowing_div` (and other unsigned types)
* `u32::overflowing_rem` (and other unsigned types)
* `u32::overflowing_neg` (and other unsigned types)
* `u32::overflowing_shl` (and other unsigned types)
* `u32::overflowing_shr` (and other unsigned types)
* `ffi::IntoStringError`
* `CString::into_string`
* `CString::into_bytes`
* `CString::into_bytes_with_nul`
* `From<CString> for Vec<u8>`
* `From<CString> for Vec<u8>`
* `IntoStringError::into_cstring`
* `IntoStringError::utf8_error`
* `Error for IntoStringError`
Deprecated
* `Path::relative_from` - renamed to `strip_prefix`
* `Path::prefix` - use `components().next()` instead
* `os::unix::fs` constants - moved to the `libc` crate
* `fmt::{radix, Radix, RadixFmt}` - not used enough to stabilize
* `IntoCow` - conflicts with `Into` and may come back later
* `i32::{BITS, BYTES}` (and other integers) - not pulling their weight
* `DebugTuple::formatter` - will be removed
* `sync::Semaphore` - not used enough and confused with system semaphores
Closes#23284
cc #27709 (still lots more methods though)
Closes#27712Closes#27722Closes#27728Closes#27735Closes#27729Closes#27755Closes#27782Closes#27798
Add fast path for ASCII in UTF-8 validation
This speeds up the ASCII case (and long stretches of ASCII in otherwise
mixed UTF-8 data) when checking UTF-8 validity.
Benchmark results suggest that on purely ASCII input, we can improve
throughput (megabytes verified / second) by a factor of 13 to 14 (smallish input).
On XML and mostly English language input (en.wikipedia XML dump),
throughput improves by a factor 7 (large input).
On mostly non-ASCII input, performance increases slightly or is the
same.
The UTF-8 validation is rewritten to use indexed access; since all
access is preceded by a (mandatory for validation) length check, bounds
checks are statically elided by LLVM and this formulation is in fact the best
for performance. A previous version had losses due to slice to iterator
conversions.
A large credit to Björn Steinbrink who improved this patch immensely,
writing this second version.
Benchmark results on x86-64 (Sandy Bridge) compiled with -C opt-level=3.
Old code is `regular`, this PR is called `fast`.
Datasets:
- `ascii` is just ASCII (2.5 kB)
- `cyr` is cyrillic script with ascii spaces (5 kB)
- `dewik10` is 10MB of a de.wikipedia XML dump
- `enwik8` is 100MB of an en.wikipedia XML dump
- `jawik10` is 10MB of a ja.wikipedia XML dump
```
test from_utf8_ascii_fast ... bench: 140 ns/iter (+/- 4) = 18221 MB/s
test from_utf8_ascii_regular ... bench: 1,932 ns/iter (+/- 19) = 1320 MB/s
test from_utf8_cyr_fast ... bench: 10,025 ns/iter (+/- 245) = 511 MB/s
test from_utf8_cyr_regular ... bench: 10,944 ns/iter (+/- 795) = 468 MB/s
test from_utf8_dewik10_fast ... bench: 6,017,909 ns/iter (+/- 105,755) = 1740 MB/s
test from_utf8_dewik10_regular ... bench: 11,669,493 ns/iter (+/- 264,045) = 891 MB/s
test from_utf8_enwik8_fast ... bench: 14,085,692 ns/iter (+/- 1,643,316) = 7000 MB/s
test from_utf8_enwik8_regular ... bench: 93,657,410 ns/iter (+/- 5,353,353) = 1000 MB/s
test from_utf8_jawik10_fast ... bench: 29,154,073 ns/iter (+/- 4,659,534) = 340 MB/s
test from_utf8_jawik10_regular ... bench: 29,112,917 ns/iter (+/- 2,475,123) = 340 MB/s
```
Co-authored-by: Björn Steinbrink <bsteinbr@gmail.com>
This is a bit weird since unsized types can't be used in trait objects,
but Any is *also* used as pure marker trait since Reflect isn't stable.
There are many cases (e.g. TypeMap) where all you need is a TypeId.
This commit migrates all of the methods on `num::wrapping::OverflowingOps` onto
inherent methods of the integer types. This also fills out some missing gaps in
the saturating and checked departments such as:
* `saturating_mul`
* `checked_{neg,rem,shl,shr}`
This is done in preparation for stabilization,
cc #27755
Add tables of small powers of ten used in the fast path. The tables are redundant: We could also use the big, more accurate table and round the value to the correct type (in fact we did just that before this commit). However, the rounding is extra work and slows down the fast path.
Because only very small exponents enter the fast path, the table and thus the space overhead is negligible. Speed-wise, this is a clear win on a [benchmark] comparing the fast path to a naive, hand-optimized, inaccurate algorithm. Specifically, this change narrows the gap from a roughly 5x difference to a roughly 3.4x difference.
[benchmark]: https://gist.github.com/Veedrac/dbb0c07994bc7882098e
Add tables of small powers of ten used in the fast path. The tables are redundant: We could also use the big, more accurate table and round the value to the correct type (in fact we did just that before this commit). However, the rounding is extra work and slows down the fast path.
Because only very small exponents enter the fast path, the table and thus the space overhead is negligible. Speed-wise, this is a clear win on a [benchmark] comparing the fast path to a naive, hand-optimized, inaccurate algorithm. Specifically, this change narrows the gap from a roughly 5x difference to a roughly 3.4x difference.
[benchmark]: https://gist.github.com/Veedrac/dbb0c07994bc7882098e
This speeds up the ascii case (and long stretches of ascii in otherwise
mixed UTF-8 data) when checking UTF-8 validity.
Benchmark results suggest that on purely ASCII input, we can improve
throughput (megabytes verified / second) by a factor of 13 to 14!
On xml and mostly english language input (en.wikipedia xml dump),
throughput increases by a factor 7.
On mostly non-ASCII input, performance increases slightly or is the
same.
The UTF-8 validation is rewritten to use indexed access; since all
access is preceded by a (mandatory for validation) length check, they
are statically elided by llvm and this formulation is in fact the best
for performance. A previous version had losses due to slice to iterator
conversions.
A large credit to Björn Steinbrink who improved this patch immensely,
writing this second version.
Benchmark results on x86-64 (Sandy Bridge) compiled with -C opt-level=3.
Old code is `regular`, this PR is called `fast`.
Datasets:
- `ascii` is just ascii (2.5 kB)
- `cyr` is cyrillic script with ascii spaces (5 kB)
- `dewik10` is 10MB of a de.wikipedia xml dump
- `enwik10` is 100MB of an en.wikipedia xml dump
- `jawik10` is 10MB of a ja.wikipedia xml dump
```
test from_utf8_ascii_fast ... bench: 140 ns/iter (+/- 4) = 18221 MB/s
test from_utf8_ascii_regular ... bench: 1,932 ns/iter (+/- 19) = 1320 MB/s
test from_utf8_cyr_fast ... bench: 10,025 ns/iter (+/- 245) = 511 MB/s
test from_utf8_cyr_regular ... bench: 12,250 ns/iter (+/- 437) = 418 MB/s
test from_utf8_dewik10_fast ... bench: 6,017,909 ns/iter (+/- 105,755) = 1740 MB/s
test from_utf8_dewik10_regular ... bench: 11,669,493 ns/iter (+/- 264,045) = 891 MB/s
test from_utf8_enwik8_fast ... bench: 14,085,692 ns/iter (+/- 1,643,316) = 7000 MB/s
test from_utf8_enwik8_regular ... bench: 93,657,410 ns/iter (+/- 5,353,353) = 1000 MB/s
test from_utf8_jawik10_fast ... bench: 29,154,073 ns/iter (+/- 4,659,534) = 340 MB/s
test from_utf8_jawik10_regular ... bench: 29,112,917 ns/iter (+/- 2,475,123) = 340 MB/s
```
Co-authored-by: Björn Steinbrink <bsteinbr@gmail.com>
This commit migrates all of the methods on `num::wrapping::OverflowingOps` onto
inherent methods of the integer types. This also fills out some missing gaps in
the saturating and checked departments such as:
* `saturating_mul`
* `checked_{neg,rem,shl,shr}`
This is done in preparation for stabilization,
cc #27755