Break out bucket 0 (containing `idx < 4096`) as an early return, which simplifies the remainder of the function, and allows optimizing the `checked_ilog2` since it can no longer return `None`. This reduces the runtime of `vec_cache::tests::slot_index_exhaustive` (which calls `SlotIndex::from_index` for every `u32`, twice) from ~15.5s to ~13.3s.