Commit Graph

76 Commits

Author SHA1 Message Date
CAD97
f34322d71f Adjust Step::forward_checked docs for large types
Co-Authored-By: Nadrieril Feneanar <nadrieril@users.noreply.github.com>
2020-04-08 18:56:39 -04:00
CAD97
2fcfd233f7 Redesign the Step trait 2020-04-08 02:24:16 -04:00
CAD97
3e115b6c9d Remove problematic specialization from RangeInclusive 2020-02-08 18:47:41 -05:00
Matthew Jasper
a81c59f9b8 Remove some unsound specializations 2020-02-01 09:11:41 +00:00
Mark Rousskov
a06baa56b9 Format the world 2019-12-22 17:42:47 -05:00
Phlosioneer
983cae77dd Clarify Step Documentation
While the redesign is in progress (#62886), clarify the purpose of replace_zero and replace_one.
2019-11-20 14:40:54 -05:00
Adrian Friedli
8590074a01 implement nth_back for RangeInclusive 2019-06-09 22:45:11 +02:00
Adrian Friedli
26d4c8f01c implement nth_back for Range 2019-06-08 22:30:45 +02:00
Tim Vermeulen
f1d0829e20 Add Step::sub_usize 2019-05-25 02:53:08 +02:00
Taiki Endo
360432f1e8 libcore => 2018 2019-04-18 14:47:35 +09:00
Josh Stone
a548d835ce impl TrustedLen for 128-bit ranges too 2019-03-26 14:45:54 -07:00
Josh Stone
01162d86c5 Implement useful steps_between for all integers
We can use `usize::try_from` to convert steps from any size of integer.
This enables a meaningful `size_hint()` for larger ranges, rather than
always just `(0, None)`. Now they return the true `(len, Some(len))`
when it fits, otherwise `(usize::MAX, None)` for overflow.
2019-03-26 10:13:48 -07:00
Mazdak Farrokhzad
f19bec89d7 Rollup merge of #58122 - matthieu-m:range_incl_perf, r=dtolnay
RangeInclusive internal iteration performance improvement.

Specialize `Iterator::try_fold` and `DoubleEndedIterator::try_rfold` to improve code generation in all internal iteration scenarios.

This changes brings the performance of internal iteration with `RangeInclusive` on par with the performance of iteration with `Range`:

 - Single conditional jump in hot loop,
 - Unrolling and vectorization,
 - And even Closed Form substitution.

Unfortunately, it only applies to internal iteration. Despite various attempts at stream-lining the implementation of `next` and `next_back`, LLVM has stubbornly refused to optimize external iteration appropriately, leaving me with a choice between:

 - The current implementation, for which Closed Form substitution is performed, but which uses 2 conditional jumps in the hot loop when optimization fail.
 - An implementation using a `is_done` boolean, which uses 1 conditional jump in the hot loop when optimization fail, allowing unrolling and vectorization, but for which Closed Form substitution fails.

In the absence of any conclusive evidence as to which usecase matters most, and with no assurance that the lack of Closed Form substitution is not indicative of other optimizations being foiled, there is no way
to pick one implementation over the other, and thus I defer to the statu quo as far as `next` and `next_back` are concerned.
2019-02-23 09:25:12 +01:00
Alexander Regueiro
99ed06eb88 libs: doc comments 2019-02-10 23:57:25 +00:00
Matthieu M
4fed67f942 Fix exhaustion of inclusive range try_fold and try_rfold 2019-02-09 18:42:34 +01:00
Matthieu M
eb5b096886 RangeInclusive internal iteration performance improvement.
Specialize Iterator::try_fold and DoubleEndedIterator::try_rfold to
improve code generation in all internal iteration scenarios.

This changes brings the performance of internal iteration with
RangeInclusive on par with the performance of iteration with Range:

 - Single conditional jump in hot loop,
 - Unrolling and vectorization,
 - And even Closed Form substitution.

Unfortunately, it only applies to internal iteration. Despite various
attempts at stream-lining the implementation of next and next_back,
LLVM has stubbornly refused to optimize external iteration
appropriately, leaving me with a choice between:

 - The current implementation, for which Closed Form substitution is
   performed, but which uses 2 conditional jumps in the hot loop when
   optimization fail.
 - An implementation using a "is_done" boolean, which uses 1
   conditional jump in the hot loop when optimization fail, allowing
   unrolling and vectorization, but for which Closed Form substitution
   fails.

In the absence of any conclusive evidence as to which usecase matters
most, and with no assurance that the lack of Closed Form substitution
is not indicative of other optimizations being foiled, there is no way
to pick one implementation over the other, and thus I defer to the
statu quo as far as next and next_back are concerned.
2019-02-03 16:58:29 +01:00
Mark Rousskov
2a663555dd Remove licenses 2018-12-25 21:08:33 -07:00
Daniel Alley
999c2e2433 Fix #[cfg] for step impl on ranges 2018-11-09 23:00:44 -05:00
Andre Bogus
9dab56c4a2 fix u32 steps_between for 16-bit systems 2018-08-30 12:35:00 +02:00
kennytm
6093128ef3 Changed implementation of the third field to make LLVM optimize it better. 2018-07-13 13:26:07 +08:00
kennytm
0d7e9933d3 Change RangeInclusive to a three-field struct.
Fix #45222.
2018-07-13 09:53:36 +08:00
Simon Sapin
e7c122c5b5 Revert "Remove TryFrom impls that might become conditionally-infallible with a portability lint"
This reverts commit 837d6c7023.

Fixes https://github.com/rust-lang/rust/issues/49415
2018-06-06 13:52:22 +02:00
bors
8ae79efce3 Auto merge of #49673 - ollie27:stab, r=sfackler
Correct a few stability attributes

* `const_indexing` language feature was stabilized in 1.26.0 by #46882
* `Display` impls for `PanicInfo` and `Location` were stabilized in 1.26.0 by #47687
* `TrustedLen` is still unstable so its impls should be as well even though `RangeInclusive` was stabilized by #47813
* `!Send` and `!Sync` for `Args` and `ArgsOs` were stabilized in 1.26.0 by #48005
* `EscapeDefault` has been stable since 1.0.0 so should continue to show that even though it was moved to core in #48735

This could be backported to beta like #49612
2018-04-09 03:32:32 +00:00
Oliver Middleton
521e41e77d Correct a few stability attributes 2018-04-05 15:39:29 +01:00
Vadzim Dambrouski
f5c42655b5 Fix warning when compilin libcore on 16bit targets.
Fixes #49617
2018-04-03 15:33:32 +03:00
Simon Sapin
837d6c7023 Remove TryFrom impls that might become conditionally-infallible with a portability lint
https://github.com/rust-lang/rust/pull/49305#issuecomment-376293243
2018-03-27 09:48:42 +02:00
kennytm
b5913f2e76 Stabilize inclusive_range library feature.
Stabilize std::ops::RangeInclusive and std::ops::RangeInclusiveTo.
2018-03-15 16:58:01 +08:00
Ulrik Sverdrup
c7c23fe948 core: Update stability attributes for FusedIterator 2018-03-03 14:23:05 +01:00
Ulrik Sverdrup
bc651cac8d core: Stabilize FusedIterator
FusedIterator is a marker trait that promises that the implementing
iterator continues to return `None` from `.next()` once it has returned
`None` once (and/or `.next_back()`, if implemented).

The effects of FusedIterator are already widely available through
`.fuse()`, but with stable `FusedIterator`, stable Rust users can
implement this trait for their iterators when appropriate.
2018-03-03 14:14:03 +01:00
bors
932c736479 Auto merge of #48057 - scottmcm:less-match-more-compare, r=dtolnay
Simplify RangeInclusive::next[_back]

`match`ing on an `Option<Ordering>` seems cause some confusion for LLVM; switching to just using comparison operators removes a few jumps from the simple `for` loops I was trying.

cc https://github.com/rust-lang/rust/issues/45222 https://github.com/rust-lang/rust/issues/28237#issuecomment-363706510

Example:
```rust
#[no_mangle]
pub fn coresum(x: std::ops::RangeInclusive<u64>) -> u64 {
    let mut sum = 0;
    for i in x {
        sum += i ^ (i-1);
    }
    sum
}
```
Today:
```asm
coresum:
    xor r8d, r8d
    mov r9, -1
    xor eax, eax
    jmp .LBB0_1
.LBB0_4:
    lea rcx, [rdi - 1]
    xor rcx, rdi
    add rax, rcx
    mov rsi, rdx
    mov rdi, r10
.LBB0_1:
    cmp rdi, rsi
    mov ecx, 1
    cmovb   rcx, r9
    cmove   rcx, r8
    test    rcx, rcx
    mov edx, 0
    mov r10d, 1
    je  .LBB0_4         // 1
    cmp rcx, -1
    jne .LBB0_5         // 2
    lea r10, [rdi + 1]
    mov rdx, rsi
    jmp .LBB0_4         // 3
.LBB0_5:
    ret
```
With this PR:
```asm
coresum:
	cmp	rcx, rdx
	jbe	.LBB0_2
	xor	eax, eax
	ret
.LBB0_2:
	xor	r8d, r8d
	mov	r9d, 1
	xor	eax, eax
	.p2align	4, 0x90
.LBB0_3:
	lea	r10, [rcx + 1]
	cmp	rcx, rdx
	cmovae	rdx, r8
	cmovae	r10, r9
	lea	r11, [rcx - 1]
	xor	r11, rcx
	add	rax, r11
	mov	rcx, r10
	cmp	r10, rdx
	jbe	.LBB0_3         // Just this
	ret
```

<details><summary>Though using internal iteration (`.map(|i| i ^ (i-1)).sum()`) is still shorter to type, and lets the compiler unroll it</summary>

```asm
coresum_inner:
.Lcfi0:
.seh_proc coresum_inner
	sub	rsp, 168
.Lcfi1:
	.seh_stackalloc 168
	vmovdqa	xmmword ptr [rsp + 144], xmm15
.Lcfi2:
	.seh_savexmm 15, 144
	vmovdqa	xmmword ptr [rsp + 128], xmm14
.Lcfi3:
	.seh_savexmm 14, 128
	vmovdqa	xmmword ptr [rsp + 112], xmm13
.Lcfi4:
	.seh_savexmm 13, 112
	vmovdqa	xmmword ptr [rsp + 96], xmm12
.Lcfi5:
	.seh_savexmm 12, 96
	vmovdqa	xmmword ptr [rsp + 80], xmm11
.Lcfi6:
	.seh_savexmm 11, 80
	vmovdqa	xmmword ptr [rsp + 64], xmm10
.Lcfi7:
	.seh_savexmm 10, 64
	vmovdqa	xmmword ptr [rsp + 48], xmm9
.Lcfi8:
	.seh_savexmm 9, 48
	vmovdqa	xmmword ptr [rsp + 32], xmm8
.Lcfi9:
	.seh_savexmm 8, 32
	vmovdqa	xmmword ptr [rsp + 16], xmm7
.Lcfi10:
	.seh_savexmm 7, 16
	vmovdqa	xmmword ptr [rsp], xmm6
.Lcfi11:
	.seh_savexmm 6, 0
.Lcfi12:
	.seh_endprologue
	cmp	rdx, rcx
	jae	.LBB1_2
	xor	eax, eax
	jmp	.LBB1_13
.LBB1_2:
	mov	r8, rdx
	sub	r8, rcx
	jbe	.LBB1_3
	cmp	r8, 7
	jbe	.LBB1_5
	mov	rax, r8
	and	rax, -8
	mov	r9, r8
	and	r9, -8
	je	.LBB1_5
	add	rax, rcx
	vmovq	xmm0, rcx
	vpshufd	xmm0, xmm0, 68
	mov	ecx, 1
	vmovq	xmm1, rcx
	vpslldq	xmm1, xmm1, 8
	vpaddq	xmm1, xmm0, xmm1
	vpxor	xmm0, xmm0, xmm0
	vpcmpeqd	xmm11, xmm11, xmm11
	vmovdqa	xmm12, xmmword ptr [rip + __xmm@00000000000000010000000000000001]
	vmovdqa	xmm13, xmmword ptr [rip + __xmm@00000000000000030000000000000003]
	vmovdqa	xmm14, xmmword ptr [rip + __xmm@00000000000000050000000000000005]
	vmovdqa	xmm15, xmmword ptr [rip + __xmm@00000000000000080000000000000008]
	mov	rcx, r9
	vpxor	xmm4, xmm4, xmm4
	vpxor	xmm5, xmm5, xmm5
	vpxor	xmm6, xmm6, xmm6
	.p2align	4, 0x90
.LBB1_9:
	vpaddq	xmm7, xmm1, xmmword ptr [rip + __xmm@00000000000000020000000000000002]
	vpaddq	xmm9, xmm1, xmmword ptr [rip + __xmm@00000000000000040000000000000004]
	vpaddq	xmm10, xmm1, xmmword ptr [rip + __xmm@00000000000000060000000000000006]
	vpaddq	xmm8, xmm1, xmm12
	vpxor	xmm7, xmm8, xmm7
	vpaddq	xmm2, xmm1, xmm13
	vpxor	xmm8, xmm2, xmm9
	vpaddq	xmm3, xmm1, xmm14
	vpxor	xmm3, xmm3, xmm10
	vpaddq	xmm2, xmm1, xmm11
	vpxor	xmm2, xmm2, xmm1
	vpaddq	xmm0, xmm2, xmm0
	vpaddq	xmm4, xmm7, xmm4
	vpaddq	xmm5, xmm8, xmm5
	vpaddq	xmm6, xmm3, xmm6
	vpaddq	xmm1, xmm1, xmm15
	add	rcx, -8
	jne	.LBB1_9
	vpaddq	xmm0, xmm4, xmm0
	vpaddq	xmm0, xmm5, xmm0
	vpaddq	xmm0, xmm6, xmm0
	vpshufd	xmm1, xmm0, 78
	vpaddq	xmm0, xmm0, xmm1
	vmovq	r10, xmm0
	cmp	r8, r9
	jne	.LBB1_6
	jmp	.LBB1_11
.LBB1_3:
	xor	r10d, r10d
	jmp	.LBB1_12
.LBB1_5:
	xor	r10d, r10d
	mov	rax, rcx
	.p2align	4, 0x90
.LBB1_6:
	lea	rcx, [rax - 1]
	xor	rcx, rax
	inc	rax
	add	r10, rcx
	cmp	rdx, rax
	jne	.LBB1_6
.LBB1_11:
	mov	rcx, rdx
.LBB1_12:
	lea	rax, [rcx - 1]
	xor	rax, rcx
	add	rax, r10
.LBB1_13:
	vmovaps	xmm6, xmmword ptr [rsp]
	vmovaps	xmm7, xmmword ptr [rsp + 16]
	vmovaps	xmm8, xmmword ptr [rsp + 32]
	vmovaps	xmm9, xmmword ptr [rsp + 48]
	vmovaps	xmm10, xmmword ptr [rsp + 64]
	vmovaps	xmm11, xmmword ptr [rsp + 80]
	vmovaps	xmm12, xmmword ptr [rsp + 96]
	vmovaps	xmm13, xmmword ptr [rsp + 112]
	vmovaps	xmm14, xmmword ptr [rsp + 128]
	vmovaps	xmm15, xmmword ptr [rsp + 144]
	add	rsp, 168
	ret
	.seh_handlerdata
	.section	.text,"xr",one_only,coresum_inner
.Lcfi13:
	.seh_endproc
```

</details>
2018-02-08 06:38:30 +00:00
Scott McMurray
27d4d51670 Simplify RangeInclusive::next[_back]
`match`ing on an `Option<Ordering>` seems cause some confusion for LLVM; switching to just using comparison operators removes a few jumps from the simple `for` loops I was trying.
2018-02-07 11:11:54 -08:00
Manish Goregaokar
da6dcbc21e Rollup merge of #47944 - oberien:unboundediterator-trustedlen, r=bluss
Implement TrustedLen for Take<Repeat> and Take<RangeFrom>

This will allow optimization of simple `repeat(x).take(n).collect()` iterators, which are currently not vectorized and have capacity checks.

This will only support a few aggregates on `Repeat` and `RangeFrom`, which might be enough for simple cases, but doesn't optimize more complex ones. Namely, Cycle, StepBy, Filter, FilterMap, Peekable, SkipWhile, Skip, FlatMap, Fuse and Inspect are not marked `TrustedLen` when the inner iterator is infinite.

Previous discussion can be found in #47082

r? @alexcrichton
2018-02-07 08:30:53 -08:00
Scott McMurray
1b1e887f4d Override try_[r]fold for RangeInclusive
Because the last item needs special handling, it seems that LLVM has trouble canonicalizing the loops in external iteration.  With the override, it becomes obvious that the start==end case exits the loop (as opposed to the one *after* that exiting the loop in external iteration).
2018-02-04 23:48:40 -08:00
oberien
a1809d5784 Implement TrustedLen for Take<Repeat> and Take<RangeFrom> 2018-02-04 16:09:32 +01:00
varkor
919d643b79 Add min and last specialisations for Range 2018-01-09 19:37:44 +00:00
varkor
2d8334358a Use next and next_back 2018-01-06 22:14:02 +00:00
varkor
c23d4500fd Fix behaviour after iterator exhaustion 2018-01-05 18:57:10 +00:00
varkor
439beab41f Remove min from RangeFrom 2018-01-04 15:03:50 +00:00
varkor
087bffa78c Remove RangeInclusive::sum 2018-01-04 12:36:43 +00:00
varkor
29e6b1034b Add max and sum specialisations for Range 2018-01-04 01:51:18 +00:00
varkor
3d9c36fbf5 Add min specialisation for RangeFrom and last for RangeInclusive 2018-01-04 00:58:41 +00:00
varkor
680ebf7b16 Add min and max specialisations for RangeInclusive 2018-01-04 00:17:36 +00:00
Jimmy Cuadra
80e3f8941d Add blanket TryFrom impl when From is implemented.
Adds `impl<T, U> TryFrom<T> for U where U: From<T>`.

Removes `impl<'a, T> TryFrom<&'a str> for T where T: FromStr` due to
overlapping impls caused by the new blanket impl. This removal is to
be discussed further on the tracking issue for TryFrom.

Refs #33417.
2017-08-29 22:13:21 -07:00
oyvindln
4bb9a8b4ac Add an overflow check in the Iter::next() impl for Range<_>
This helps with vectorization in some cases, such as (0..u16::MAX).collect::<Vec<u16>>(),
 as LLVM is able to change the loop condition to use equality instead of less than
2017-08-01 19:31:50 +02:00
Simon Sapin
de4afc6797 Implement O(1)-time Iterator::nth for Range* 2017-07-08 08:55:55 +02:00
Simon Sapin
8e8fd02419 Factorize some macros in iter/range.rs 2017-07-08 08:55:55 +02:00
Simon Sapin
d1ec6c22d1 Remove Step::steps_between, rename steps_between_by_one to steps_between 2017-07-08 08:55:55 +02:00
Simon Sapin
4b2f40dfdf Remove unused Step methods 2017-07-08 08:55:55 +02:00
Simon Sapin
dbed18ca20 Remove unused Add bounds in iterator for ranges impls. 2017-07-08 08:55:28 +02:00
Scott McMurray
dcd332ed94 Delete deprecated & unstable range-specific step_by
Replacement: 41439
Deprecation: 42310 for 1.19
Fixes 41477
2017-07-01 19:18:02 -07:00