Rework the std::iter::Step trait
Previous attempts: #43127#62886#68807
Tracking issue: #42168
This PR reworks the `Step` trait to be phrased in terms of the *successor* and *predecessor* operations. With this, `Step` hopefully has a consistent identity that can have a path towards stabilization. The proposed trait:
```rust
/// Objects that have a notion of *successor* and *predecessor* operations.
///
/// The *successor* operation moves towards values that compare greater.
/// The *predecessor* operation moves towards values that compare lesser.
///
/// # Safety
///
/// This trait is `unsafe` because its implementation must be correct for
/// the safety of `unsafe trait TrustedLen` implementations, and the results
/// of using this trait can otherwise be trusted by `unsafe` code to be correct
/// and fulful the listed obligations.
pub unsafe trait Step: Clone + PartialOrd + Sized {
/// Returns the number of *successor* steps required to get from `start` to `end`.
///
/// Returns `None` if the number of steps would overflow `usize`
/// (or is infinite, or if `end` would never be reached).
///
/// # Invariants
///
/// For any `a`, `b`, and `n`:
///
/// * `steps_between(&a, &b) == Some(n)` if and only if `Step::forward(&a, n) == Some(b)`
/// * `steps_between(&a, &b) == Some(n)` if and only if `Step::backward(&a, n) == Some(a)`
/// * `steps_between(&a, &b) == Some(n)` only if `a <= b`
/// * Corollary: `steps_between(&a, &b) == Some(0)` if and only if `a == b`
/// * Note that `a <= b` does _not_ imply `steps_between(&a, &b) != None`;
/// this is the case wheen it would require more than `usize::MAX` steps to get to `b`
/// * `steps_between(&a, &b) == None` if `a > b`
fn steps_between(start: &Self, end: &Self) -> Option<usize>;
/// Returns the value that would be obtained by taking the *successor*
/// of `self` `count` times.
///
/// If this would overflow the range of values supported by `Self`, returns `None`.
///
/// # Invariants
///
/// For any `a`, `n`, and `m`:
///
/// * `Step::forward_checked(a, n).and_then(|x| Step::forward_checked(x, m)) == Step::forward_checked(a, m).and_then(|x| Step::forward_checked(x, n))`
///
/// For any `a`, `n`, and `m` where `n + m` does not overflow:
///
/// * `Step::forward_checked(a, n).and_then(|x| Step::forward_checked(x, m)) == Step::forward_checked(a, n + m)`
///
/// For any `a` and `n`:
///
/// * `Step::forward_checked(a, n) == (0..n).try_fold(a, |x, _| Step::forward_checked(&x, 1))`
/// * Corollary: `Step::forward_checked(&a, 0) == Some(a)`
fn forward_checked(start: Self, count: usize) -> Option<Self>;
/// Returns the value that would be obtained by taking the *successor*
/// of `self` `count` times.
///
/// If this would overflow the range of values supported by `Self`,
/// this function is allowed to panic, wrap, or saturate.
/// The suggested behavior is to panic when debug assertions are enabled,
/// and to wrap or saturate otherwise.
///
/// Unsafe code should not rely on the correctness of behavior after overflow.
///
/// # Invariants
///
/// For any `a`, `n`, and `m`, where no overflow occurs:
///
/// * `Step::forward(Step::forward(a, n), m) == Step::forward(a, n + m)`
///
/// For any `a` and `n`, where no overflow occurs:
///
/// * `Step::forward_checked(a, n) == Some(Step::forward(a, n))`
/// * `Step::forward(a, n) == (0..n).fold(a, |x, _| Step::forward(x, 1))`
/// * Corollary: `Step::forward(a, 0) == a`
/// * `Step::forward(a, n) >= a`
/// * `Step::backward(Step::forward(a, n), n) == a`
fn forward(start: Self, count: usize) -> Self {
Step::forward_checked(start, count).expect("overflow in `Step::forward`")
}
/// Returns the value that would be obtained by taking the *successor*
/// of `self` `count` times.
///
/// # Safety
///
/// It is undefined behavior for this operation to overflow the
/// range of values supported by `Self`. If you cannot guarantee that this
/// will not overflow, use `forward` or `forward_checked` instead.
///
/// # Invariants
///
/// For any `a`:
///
/// * if there exists `b` such that `b > a`, it is safe to call `Step::forward_unchecked(a, 1)`
/// * if there exists `b`, `n` such that `steps_between(&a, &b) == Some(n)`,
/// it is safe to call `Step::forward_unchecked(a, m)` for any `m <= n`.
///
/// For any `a` and `n`, where no overflow occurs:
///
/// * `Step::forward_unchecked(a, n)` is equivalent to `Step::forward(a, n)`
#[unstable(feature = "unchecked_math", reason = "niche optimization path", issue = "none")]
unsafe fn forward_unchecked(start: Self, count: usize) -> Self {
Step::forward(start, count)
}
/// Returns the value that would be obtained by taking the *successor*
/// of `self` `count` times.
///
/// If this would overflow the range of values supported by `Self`, returns `None`.
///
/// # Invariants
///
/// For any `a`, `n`, and `m`:
///
/// * `Step::backward_checked(a, n).and_then(|x| Step::backward_checked(x, m)) == n.checked_add(m).and_then(|x| Step::backward_checked(a, x))`
/// * `Step::backward_checked(a, n).and_then(|x| Step::backward_checked(x, m)) == try { Step::backward_checked(a, n.checked_add(m)?) }`
///
/// For any `a` and `n`:
///
/// * `Step::backward_checked(a, n) == (0..n).try_fold(a, |x, _| Step::backward_checked(&x, 1))`
/// * Corollary: `Step::backward_checked(&a, 0) == Some(a)`
fn backward_checked(start: Self, count: usize) -> Option<Self>;
/// Returns the value that would be obtained by taking the *predecessor*
/// of `self` `count` times.
///
/// If this would overflow the range of values supported by `Self`,
/// this function is allowed to panic, wrap, or saturate.
/// The suggested behavior is to panic when debug assertions are enabled,
/// and to wrap or saturate otherwise.
///
/// Unsafe code should not rely on the correctness of behavior after overflow.
///
/// # Invariants
///
/// For any `a`, `n`, and `m`, where no overflow occurs:
///
/// * `Step::backward(Step::backward(a, n), m) == Step::backward(a, n + m)`
///
/// For any `a` and `n`, where no overflow occurs:
///
/// * `Step::backward_checked(a, n) == Some(Step::backward(a, n))`
/// * `Step::backward(a, n) == (0..n).fold(a, |x, _| Step::backward(x, 1))`
/// * Corollary: `Step::backward(a, 0) == a`
/// * `Step::backward(a, n) <= a`
/// * `Step::forward(Step::backward(a, n), n) == a`
fn backward(start: Self, count: usize) -> Self {
Step::backward_checked(start, count).expect("overflow in `Step::backward`")
}
/// Returns the value that would be obtained by taking the *predecessor*
/// of `self` `count` times.
///
/// # Safety
///
/// It is undefined behavior for this operation to overflow the
/// range of values supported by `Self`. If you cannot guarantee that this
/// will not overflow, use `backward` or `backward_checked` instead.
///
/// # Invariants
///
/// For any `a`:
///
/// * if there exists `b` such that `b < a`, it is safe to call `Step::backward_unchecked(a, 1)`
/// * if there exists `b`, `n` such that `steps_between(&b, &a) == Some(n)`,
/// it is safe to call `Step::backward_unchecked(a, m)` for any `m <= n`.
///
/// For any `a` and `n`, where no overflow occurs:
///
/// * `Step::backward_unchecked(a, n)` is equivalent to `Step::backward(a, n)`
#[unstable(feature = "unchecked_math", reason = "niche optimization path", issue = "none")]
unsafe fn backward_unchecked(start: Self, count: usize) -> Self {
Step::backward(start, count)
}
}
```
Note that all of these are associated functions and not callable via method syntax; the calling syntax is always `Step::forward(start, n)`. This version of the trait additionally changes the stepping functions to talk their arguments by value.
As opposed to previous attempts which provided a "step by one" method directly, this version of the trait only exposes "step by n". There are a few reasons for this:
- `Range*`, the primary consumer of `Step`, assumes that the "step by n" operation is cheap. If a single step function is provided, it will be a lot more enticing to implement "step by n" as n repeated calls to "step by one". While this is not strictly incorrect, this behavior would be surprising for anyone used to using `Range<{primitive integer}>`.
- With a trivial default impl, this can be easily added backwards-compatibly later.
- The debug-wrapping "step by n" needs to exist for `RangeFrom` to be consistent between "step by n" and "step by one" operation. (Note: the behavior is not changed by this PR, but making the behavior consistent is made tenable by this PR.)
Three "kinds" of step are provided: `_checked`, which returns an `Option` indicating attempted overflow; (unsuffixed), which provides "safe overflow" behavior (is allowed to panic, wrap, or saturate, depending on what is most convenient for a given type); and `_unchecked`, which is a version which assumes overflow does not happen.
Review is appreciated to check that:
- The invariants as described on the `Step` functions are enough to specify the "common sense" consistency for successor/predecessor.
- Implementation of `Step` functions is correct in the face of overflow and the edges of representable integers.
- Added tests of `Step` functions are asserting the correct behavior (and not just the implemented behavior).
We can use `usize::try_from` to convert steps from any size of integer.
This enables a meaningful `size_hint()` for larger ranges, rather than
always just `(0, None)`. Now they return the true `(len, Some(len))`
when it fits, otherwise `(usize::MAX, None)` for overflow.
RangeInclusive internal iteration performance improvement.
Specialize `Iterator::try_fold` and `DoubleEndedIterator::try_rfold` to improve code generation in all internal iteration scenarios.
This changes brings the performance of internal iteration with `RangeInclusive` on par with the performance of iteration with `Range`:
- Single conditional jump in hot loop,
- Unrolling and vectorization,
- And even Closed Form substitution.
Unfortunately, it only applies to internal iteration. Despite various attempts at stream-lining the implementation of `next` and `next_back`, LLVM has stubbornly refused to optimize external iteration appropriately, leaving me with a choice between:
- The current implementation, for which Closed Form substitution is performed, but which uses 2 conditional jumps in the hot loop when optimization fail.
- An implementation using a `is_done` boolean, which uses 1 conditional jump in the hot loop when optimization fail, allowing unrolling and vectorization, but for which Closed Form substitution fails.
In the absence of any conclusive evidence as to which usecase matters most, and with no assurance that the lack of Closed Form substitution is not indicative of other optimizations being foiled, there is no way
to pick one implementation over the other, and thus I defer to the statu quo as far as `next` and `next_back` are concerned.
Specialize Iterator::try_fold and DoubleEndedIterator::try_rfold to
improve code generation in all internal iteration scenarios.
This changes brings the performance of internal iteration with
RangeInclusive on par with the performance of iteration with Range:
- Single conditional jump in hot loop,
- Unrolling and vectorization,
- And even Closed Form substitution.
Unfortunately, it only applies to internal iteration. Despite various
attempts at stream-lining the implementation of next and next_back,
LLVM has stubbornly refused to optimize external iteration
appropriately, leaving me with a choice between:
- The current implementation, for which Closed Form substitution is
performed, but which uses 2 conditional jumps in the hot loop when
optimization fail.
- An implementation using a "is_done" boolean, which uses 1
conditional jump in the hot loop when optimization fail, allowing
unrolling and vectorization, but for which Closed Form substitution
fails.
In the absence of any conclusive evidence as to which usecase matters
most, and with no assurance that the lack of Closed Form substitution
is not indicative of other optimizations being foiled, there is no way
to pick one implementation over the other, and thus I defer to the
statu quo as far as next and next_back are concerned.
Correct a few stability attributes
* `const_indexing` language feature was stabilized in 1.26.0 by #46882
* `Display` impls for `PanicInfo` and `Location` were stabilized in 1.26.0 by #47687
* `TrustedLen` is still unstable so its impls should be as well even though `RangeInclusive` was stabilized by #47813
* `!Send` and `!Sync` for `Args` and `ArgsOs` were stabilized in 1.26.0 by #48005
* `EscapeDefault` has been stable since 1.0.0 so should continue to show that even though it was moved to core in #48735
This could be backported to beta like #49612
FusedIterator is a marker trait that promises that the implementing
iterator continues to return `None` from `.next()` once it has returned
`None` once (and/or `.next_back()`, if implemented).
The effects of FusedIterator are already widely available through
`.fuse()`, but with stable `FusedIterator`, stable Rust users can
implement this trait for their iterators when appropriate.
`match`ing on an `Option<Ordering>` seems cause some confusion for LLVM; switching to just using comparison operators removes a few jumps from the simple `for` loops I was trying.
Implement TrustedLen for Take<Repeat> and Take<RangeFrom>
This will allow optimization of simple `repeat(x).take(n).collect()` iterators, which are currently not vectorized and have capacity checks.
This will only support a few aggregates on `Repeat` and `RangeFrom`, which might be enough for simple cases, but doesn't optimize more complex ones. Namely, Cycle, StepBy, Filter, FilterMap, Peekable, SkipWhile, Skip, FlatMap, Fuse and Inspect are not marked `TrustedLen` when the inner iterator is infinite.
Previous discussion can be found in #47082
r? @alexcrichton
Because the last item needs special handling, it seems that LLVM has trouble canonicalizing the loops in external iteration. With the override, it becomes obvious that the start==end case exits the loop (as opposed to the one *after* that exiting the loop in external iteration).
Adds `impl<T, U> TryFrom<T> for U where U: From<T>`.
Removes `impl<'a, T> TryFrom<&'a str> for T where T: FromStr` due to
overlapping impls caused by the new blanket impl. This removal is to
be discussed further on the tracking issue for TryFrom.
Refs #33417.
This helps with vectorization in some cases, such as (0..u16::MAX).collect::<Vec<u16>>(),
as LLVM is able to change the loop condition to use equality instead of less than