Rollup merge of #48639 - varkor:sort_by_key-cached, r=bluss
Add slice::sort_by_cached_key as a memoised sort_by_key At present, `slice::sort_by_key` calls its key function twice for each comparison that is made. When the key function is expensive (which can often be the case when `sort_by_key` is chosen over `sort_by`), this can lead to very suboptimal behaviour. To address this, I've introduced a new slice method, `sort_by_cached_key`, which has identical semantic behaviour to `sort_by_key`, except that it guarantees the key function will only be called once per element. Where there are `n` elements and the key function is `O(m)`: - `slice::sort_by_cached_key` time complexity is `O(m n log m n)`, compared to `slice::sort_by_key`'s `O(m n + n log n)`. - `slice::sort_by_cached_key` space complexity remains at `O(n + m)`. (Technically, it now reserves a slice of size `n`, whereas before it reserved a slice of size `n/2`.) `slice::sort_unstable_by_key` has not been given an analogue, as it is important that unstable sorts are in-place, which is not a property that is guaranteed here. However, this also means that `slice::sort_unstable_by_key` is likely to be slower than `slice::sort_by_cached_key` when the key function does not have negligible complexity. We might want to explore this trade-off further in the future. Benchmarks (for a vector of 100 `i32`s): ``` # Lexicographic: `|x| x.to_string()` test bench_sort_by_key ... bench: 112,638 ns/iter (+/- 19,563) test bench_sort_by_cached_key ... bench: 15,038 ns/iter (+/- 4,814) # Identity: `|x| *x` test bench_sort_by_key ... bench: 1,346 ns/iter (+/- 238) test bench_sort_by_cached_key ... bench: 1,839 ns/iter (+/- 765) # Power: `|x| x.pow(31)` test bench_sort_by_key ... bench: 3,624 ns/iter (+/- 738) test bench_sort_by_cached_key ... bench: 1,997 ns/iter (+/- 311) # Abs: `|x| x.abs()` test bench_sort_by_key ... bench: 1,546 ns/iter (+/- 174) test bench_sort_by_cached_key ... bench: 1,668 ns/iter (+/- 790) ``` (So it seems functions that are single operations do perform slightly worse with this method, but for pretty much any more complex key, you're better off with this optimisation.) I've definitely found myself using expensive keys in the past and wishing this optimisation was made (e.g. for https://github.com/rust-lang/rust/pull/47415). This feels like both desirable and expected behaviour, at the small cost of slightly more stack allocation and minute degradation in performance for extremely trivial keys. Resolves #34447.
This commit is contained in:
@@ -23,6 +23,7 @@
|
||||
#![feature(pattern)]
|
||||
#![feature(placement_in_syntax)]
|
||||
#![feature(rand)]
|
||||
#![feature(slice_sort_by_cached_key)]
|
||||
#![feature(splice)]
|
||||
#![feature(str_escape)]
|
||||
#![feature(string_retain)]
|
||||
|
||||
@@ -425,6 +425,14 @@ fn test_sort() {
|
||||
v.sort_by(|a, b| b.cmp(a));
|
||||
assert!(v.windows(2).all(|w| w[0] >= w[1]));
|
||||
|
||||
// Sort in lexicographic order.
|
||||
let mut v1 = orig.clone();
|
||||
let mut v2 = orig.clone();
|
||||
v1.sort_by_key(|x| x.to_string());
|
||||
v2.sort_by_cached_key(|x| x.to_string());
|
||||
assert!(v1.windows(2).all(|w| w[0].to_string() <= w[1].to_string()));
|
||||
assert!(v1 == v2);
|
||||
|
||||
// Sort with many pre-sorted runs.
|
||||
let mut v = orig.clone();
|
||||
v.sort();
|
||||
@@ -477,7 +485,7 @@ fn test_sort_stability() {
|
||||
// the second item represents which occurrence of that
|
||||
// number this element is, i.e. the second elements
|
||||
// will occur in sorted order.
|
||||
let mut v: Vec<_> = (0..len)
|
||||
let mut orig: Vec<_> = (0..len)
|
||||
.map(|_| {
|
||||
let n = thread_rng().gen::<usize>() % 10;
|
||||
counts[n] += 1;
|
||||
@@ -485,16 +493,21 @@ fn test_sort_stability() {
|
||||
})
|
||||
.collect();
|
||||
|
||||
// only sort on the first element, so an unstable sort
|
||||
let mut v = orig.clone();
|
||||
// Only sort on the first element, so an unstable sort
|
||||
// may mix up the counts.
|
||||
v.sort_by(|&(a, _), &(b, _)| a.cmp(&b));
|
||||
|
||||
// this comparison includes the count (the second item
|
||||
// This comparison includes the count (the second item
|
||||
// of the tuple), so elements with equal first items
|
||||
// will need to be ordered with increasing
|
||||
// counts... i.e. exactly asserting that this sort is
|
||||
// stable.
|
||||
assert!(v.windows(2).all(|w| w[0] <= w[1]));
|
||||
|
||||
let mut v = orig.clone();
|
||||
v.sort_by_cached_key(|&(x, _)| x);
|
||||
assert!(v.windows(2).all(|w| w[0] <= w[1]));
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
Reference in New Issue
Block a user