I noticed in docs, specifically http://doc.rust-lang.org/std/primitive.u8.html#method.is_power_of_two, that it was like this, and it was apparently in multiple places too.
Didn't change any occurrences through the cross-depo things. There's a lot in /src/llvm/ for instance, but I'm not confident on how to go about sending fixes for those, so this is just what's in the base rust depo.
r? @steveklabnik
Exploiting the fact that getting the length of the slices is known, we
can use a counted loop instead of iterators, which means that we only
need a single counter, instead of having to increment and check one
pointer for each iterator.
Benchmarks comparing vectors with 100,000 elements:
Before:
```
running 8 tests
test eq1_u8 ... bench: 66,757 ns/iter (+/- 113)
test eq2_u16 ... bench: 111,267 ns/iter (+/- 149)
test eq3_u32 ... bench: 126,282 ns/iter (+/- 111)
test eq4_u64 ... bench: 126,418 ns/iter (+/- 155)
test ne1_u8 ... bench: 88,990 ns/iter (+/- 161)
test ne2_u16 ... bench: 89,126 ns/iter (+/- 265)
test ne3_u32 ... bench: 96,901 ns/iter (+/- 92)
test ne4_u64 ... bench: 96,750 ns/iter (+/- 137)
```
After:
```
running 8 tests
test eq1_u8 ... bench: 46,413 ns/iter (+/- 521)
test eq2_u16 ... bench: 46,500 ns/iter (+/- 74)
test eq3_u32 ... bench: 50,059 ns/iter (+/- 92)
test eq4_u64 ... bench: 54,001 ns/iter (+/- 92)
test ne1_u8 ... bench: 47,595 ns/iter (+/- 53)
test ne2_u16 ... bench: 47,521 ns/iter (+/- 59)
test ne3_u32 ... bench: 44,889 ns/iter (+/- 74)
test ne4_u64 ... bench: 47,775 ns/iter (+/- 68)
```
Exploiting the fact that getting the length of the slices is known, we
can use a counted loop instead of iterators, which means that we only
need a single counter, instead of having to increment and check one
pointer for each iterator.
Benchmarks comparing vectors with 100,000 elements:
Before:
```
running 8 tests
test eq1_u8 ... bench: 66,757 ns/iter (+/- 113)
test eq2_u16 ... bench: 111,267 ns/iter (+/- 149)
test eq3_u32 ... bench: 126,282 ns/iter (+/- 111)
test eq4_u64 ... bench: 126,418 ns/iter (+/- 155)
test ne1_u8 ... bench: 88,990 ns/iter (+/- 161)
test ne2_u16 ... bench: 89,126 ns/iter (+/- 265)
test ne3_u32 ... bench: 96,901 ns/iter (+/- 92)
test ne4_u64 ... bench: 96,750 ns/iter (+/- 137)
```
After:
```
running 8 tests
test eq1_u8 ... bench: 46,413 ns/iter (+/- 521)
test eq2_u16 ... bench: 46,500 ns/iter (+/- 74)
test eq3_u32 ... bench: 50,059 ns/iter (+/- 92)
test eq4_u64 ... bench: 54,001 ns/iter (+/- 92)
test ne1_u8 ... bench: 47,595 ns/iter (+/- 53)
test ne2_u16 ... bench: 47,521 ns/iter (+/- 59)
test ne3_u32 ... bench: 44,889 ns/iter (+/- 74)
test ne4_u64 ... bench: 47,775 ns/iter (+/- 68)
```
Each formatting trait now has an example of implementation, as well as a
fuller description of what it's supposed to output.
It also contains a link to the module-level documentation which
Fixes#25765
Fixes#24249
I've tagged all items that were missing docs to allow them to compile for now, the ones in core/num should probably be documented at least.
This is also a breaking change for any crates using `#[deny(missing_docs)]` that have undocumented constants, not sure there is any way to avoid this without making it a separate lint?
fmt: Update docs and mention :#? pretty-printing
Expose `:#?` well in the docs for fmt and Debug itself. Also update some out of date information and fix formatting in `std::fmt` docs.
Update substring search to use the Two Way algorithm
To improve our substring search performance, revive the two way searcher
and adapt it to the Pattern API.
Fixes#25483, a performance bug: that particular case now completes faster
in optimized rust than in ruby (but they share the same order of magnitude).
Many thanks to @gereeter who helped me understand the reverse case
better and wrote the comment explaining `next_back` in the code.
I had quickcheck to fuzz test forward and reverse searching thoroughly.
The two way searcher implements both forward and reverse search,
but not double ended search. The forward and reverse parts of the two
way searcher are completely independent.
The two way searcher algorithm has very small, constant space overhead,
requiring no dynamic allocation. Our implementation is relatively fast,
especially due to the `byteset` addition to the algorithm, which speeds
up many no-match cases.
A bad case for the two way algorithm is:
```
let haystack = (0..10_000).map(|_| "dac").collect::<String>();
let needle = (0..100).map(|_| "bac").collect::<String>());
```
For this particular case, two way is not much faster than the naive
implementation it replaces.
Namely:
* Change parameter `old` to read `current` so it is clearer what the argument refers to (originally
suggested `expected`, but shot down by Steve);
* Add some formatting and fix some mistakes like referring to the method as `swap` rather than
`compare_and_swap`.
It turns out that the 32-bit toolchain for MSVC has many of these functions as
`static inline` functions in header files so there's not actually a symbol for
Rust to call. All of the implementations just cast floats to their 64-bit
variants and then cast back to 32-bit at the end, so the standard library now
takes this strategy.
This removes a footgun, since it is a reasonable assumption to make that
pointers to `T` will be aligned to `align_of::<T>()`. This also matches
the behaviour of C/C++. `min_align_of` is now deprecated.
Closes#21611.
This is needed to not drop performance, after the trait-based changes.
Force separate versions of the next method to be generated for the short
and long period cases.
Use a trait to be able to implement both the fast search that skips to
each match, and the slower search that emits `Reject` intervals
regularly. The latter is important for uses of `next_reject`.
To improve our substring search performance, revive the two way searcher
and adapt it to the Pattern API.
Fixes#25483, a performance bug: that particular case now completes faster
in optimized rust than in ruby (but they share the same order of magnitude).
Much thanks to @gereeter who helped me understand the reverse case
better and wrote the comment explaining `next_back` in the code.
I had quickcheck to fuzz test forward and reverse searching thoroughly.
The two way searcher implements both forward and reverse search,
but not double ended search. The forward and reverse parts of the two
way searcher are completely independent.
The two way searcher algorithm has very small, constant space overhead,
requiring no dynamic allocation. Our implementation is relatively fast,
especially due to the `byteset` addition to the algorithm, which speeds
up many no-match cases.
A bad case for the two way algorithm is:
```
let haystack = (0..10_000).map(|_| "dac").collect::<String>();
let needle = (0..100).map(|_| "bac").collect::<String>());
```
For this particular case, two way is not much faster than the naive
implementation it replaces.