Implement LazyBTreeMap and use it in a few places.
This is a thin wrapper around BTreeMap that avoids allocating upon creation.
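The wrapper idea can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: it assumes the wrapper holds an `Option<BTreeMap<K, V>>` and only builds the inner map on the first insertion. The method set and the `is_instantiated` helper are hypothetical names chosen for the example.

```rust
use std::collections::BTreeMap;

/// Illustrative sketch only: a map that defers creating the inner
/// `BTreeMap` until the first insertion.
pub struct LazyBTreeMap<K, V>(Option<BTreeMap<K, V>>);

impl<K: Ord, V> LazyBTreeMap<K, V> {
    /// Creates an empty wrapper without constructing the inner map.
    pub fn new() -> Self {
        LazyBTreeMap(None)
    }

    /// Returns the inner map, creating it on first use.
    fn instantiate(&mut self) -> &mut BTreeMap<K, V> {
        self.0.get_or_insert_with(BTreeMap::new)
    }

    /// Inserts a key-value pair, creating the inner map if needed.
    pub fn insert(&mut self, key: K, value: V) -> Option<V> {
        self.instantiate().insert(key, value)
    }

    /// Looks up a key; an uninstantiated map behaves like an empty one.
    pub fn get(&self, key: &K) -> Option<&V> {
        self.0.as_ref().and_then(|map| map.get(key))
    }

    /// Number of entries; zero when the inner map does not exist yet.
    pub fn len(&self) -> usize {
        self.0.as_ref().map_or(0, |map| map.len())
    }

    /// Iterates in key order; yields nothing when the inner map is absent.
    pub fn iter(&self) -> impl Iterator<Item = (&K, &V)> + '_ {
        self.0.iter().flatten()
    }

    /// Hypothetical helper for the example: has the inner map been created?
    pub fn is_instantiated(&self) -> bool {
        self.0.is_some()
    }
}
```

Read-only operations never create the inner map, so a map that is created but never written to costs only the `Option`.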
I would prefer to make BTreeMap itself lazy (as I did with HashSet in #36734), and I initially attempted that by making BTreeMap::root an Option<>. But that also required changing Iter and Range to handle trees with no root. Those types have stability markers on them, and I wasn't sure whether changing them was acceptable. BTreeMap also has a lot of complex code; changing all of it was challenging, and I didn't have high confidence in my general approach.
So I prototyped this wrapper instead and used it in the hottest locations to measure the effect. The measurements are pretty good!
- Doing a debug build of serde, it reduces the total number of heap allocations from 17,728,709 to 13,359,384, a 25% reduction. The number of bytes allocated drops from 7,474,672,966 to 5,482,308,388, a 27% reduction.
- It gives speedups of up to 3.6% on some rustc-perf benchmark jobs. crates.io, futures, and serde benefit most.
```
futures-check
avg: -1.9% min: -3.6% max: -0.5%
serde-check
avg: -2.1% min: -3.5% max: -0.7%
crates.io-check
avg: -1.7% min: -3.5% max: -0.3%
serde
avg: -2.0% min: -3.0% max: -0.9%
serde-opt
avg: -1.8% min: -2.9% max: -0.3%
futures
avg: -1.5% min: -2.8% max: -0.4%
tokio-webpush-simple-check
avg: -1.1% min: -2.2% max: -0.1%
futures-opt
avg: -1.2% min: -2.1% max: -0.4%
piston-image-check
avg: -0.8% min: -1.1% max: -0.3%
crates.io
avg: -0.6% min: -1.0% max: -0.3%
```
@Gankro, how do you think I should proceed here? Is leaving this as a wrapper reasonable? Or should I try to make BTreeMap itself lazy? If so, can I change the representation of Iter and Range?
Thanks!