Introduce ChunkedBitSet and use it for some dataflow analyses.

This reduces peak memory usage significantly for some programs with very
large functions, such as:
- `keccak`, `unicode_normalization`, and `match-stress-enum`, from
  the `rustc-perf` benchmark suite;
- `http-0.2.6` from crates.io.

The new type is used in the analyses where the bitsets can get huge
(e.g. tens of thousands of bits): `MaybeInitializedPlaces`,
`MaybeUninitializedPlaces`, and `EverInitializedPlaces`.
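
The message above does not spell out how the new type stores its bits.
As a minimal sketch of the general idea only, assume the domain is split
into fixed-size chunks and that a chunk which is entirely zeros or
entirely ones is stored without a word array. The names `ChunkedBits`,
`Chunk`, `heap_words`, and the 2048-bit chunk size are illustrative
assumptions here, not a copy of the real `rustc_index` code:

```rust
// Illustrative chunk size (an assumption for this sketch).
const CHUNK_BITS: usize = 2048;
const CHUNK_WORDS: usize = CHUNK_BITS / 64;

/// One chunk of the domain. Uniform chunks carry no word storage.
enum Chunk {
    Zeros,
    Ones,
    Mixed(Box<[u64; CHUNK_WORDS]>),
}

/// Hypothetical chunked bitset over the indices `0..domain_size`.
pub struct ChunkedBits {
    domain_size: usize,
    chunks: Vec<Chunk>,
}

fn num_chunks(domain_size: usize) -> usize {
    (domain_size + CHUNK_BITS - 1) / CHUNK_BITS
}

impl ChunkedBits {
    pub fn new_empty(domain_size: usize) -> Self {
        let chunks = (0..num_chunks(domain_size)).map(|_| Chunk::Zeros).collect();
        ChunkedBits { domain_size, chunks }
    }

    pub fn new_filled(domain_size: usize) -> Self {
        let chunks = (0..num_chunks(domain_size)).map(|_| Chunk::Ones).collect();
        ChunkedBits { domain_size, chunks }
    }

    pub fn contains(&self, elem: usize) -> bool {
        assert!(elem < self.domain_size);
        let i = elem % CHUNK_BITS;
        match &self.chunks[elem / CHUNK_BITS] {
            Chunk::Zeros => false,
            Chunk::Ones => true,
            Chunk::Mixed(words) => (words[i / 64] & (1u64 << (i % 64))) != 0,
        }
    }

    /// Inserts `elem`; returns true if the set changed.
    pub fn insert(&mut self, elem: usize) -> bool {
        assert!(elem < self.domain_size);
        let i = elem % CHUNK_BITS;
        let mask = 1u64 << (i % 64);
        let chunk = &mut self.chunks[elem / CHUNK_BITS];
        match chunk {
            Chunk::Ones => false,
            Chunk::Zeros => {
                // First bit set in this chunk: allocate its words now.
                let mut words = Box::new([0u64; CHUNK_WORDS]);
                words[i / 64] |= mask;
                *chunk = Chunk::Mixed(words);
                true
            }
            Chunk::Mixed(words) => {
                let changed = (words[i / 64] & mask) == 0;
                words[i / 64] |= mask;
                changed
            }
        }
    }

    /// Number of 64-bit words currently allocated for mixed chunks.
    pub fn heap_words(&self) -> usize {
        let mixed = self.chunks.iter().filter(|c| matches!(c, Chunk::Mixed(_))).count();
        mixed * CHUNK_WORDS
    }
}
```

In this sketch an empty or full set over 50,000 elements allocates no
word storage at all, and inserting a single element allocates one
chunk's worth of words. A dense bitset over the same domain would
always hold 782 words.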

Some refactoring was required in `rustc_mir_dataflow`. All existing
analysis domains are either `BitSet` or a trivial wrapper around
`BitSet`, and access in a few places is done via `Borrow<BitSet>` or
`BorrowMut<BitSet>`. Now that some of these domains are `ChunkedBitSet`,
that no longer works. So this commit replaces the `Borrow`/`BorrowMut`
usage with a new trait `BitSetExt` containing the needed bitset
operations. The impls just forward these to the underlying bitset type.
This required fiddling with trait bounds in a few places.
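
The shape of that refactoring can be sketched outside the compiler. The
example below is a simplified stand-in, not the `rustc_mir_dataflow`
code: `SparseSet`, `WordBits`, `BoolBits`, and the free function `apply`
are hypothetical names, and plain `usize` indices replace the `Idx`
type parameter. It shows generic code depending only on the trait, so
that two unrelated representations can both serve as the state:

```rust
/// Hypothetical sparse set of indices, standing in for `HybridBitSet`.
pub struct SparseSet {
    pub elems: Vec<usize>,
}

/// Operations the generic code needs from any domain.
pub trait BitSetExt {
    fn domain_size(&self) -> usize;
    fn contains(&self, elem: usize) -> bool;
    fn union(&mut self, other: &SparseSet);
    fn subtract(&mut self, other: &SparseSet);
}

/// Hypothetical dense representation: bits packed into 64-bit words.
pub struct WordBits {
    domain_size: usize,
    words: Vec<u64>,
}

impl WordBits {
    pub fn new_empty(domain_size: usize) -> Self {
        WordBits { domain_size, words: vec![0; (domain_size + 63) / 64] }
    }
}

impl BitSetExt for WordBits {
    fn domain_size(&self) -> usize {
        self.domain_size
    }
    fn contains(&self, elem: usize) -> bool {
        (self.words[elem / 64] & (1u64 << (elem % 64))) != 0
    }
    fn union(&mut self, other: &SparseSet) {
        for &e in &other.elems {
            self.words[e / 64] |= 1u64 << (e % 64);
        }
    }
    fn subtract(&mut self, other: &SparseSet) {
        for &e in &other.elems {
            self.words[e / 64] &= !(1u64 << (e % 64));
        }
    }
}

/// A second, unrelated representation: one `bool` per element.
pub struct BoolBits {
    bits: Vec<bool>,
}

impl BoolBits {
    pub fn new_empty(domain_size: usize) -> Self {
        BoolBits { bits: vec![false; domain_size] }
    }
}

impl BitSetExt for BoolBits {
    fn domain_size(&self) -> usize {
        self.bits.len()
    }
    fn contains(&self, elem: usize) -> bool {
        self.bits[elem]
    }
    fn union(&mut self, other: &SparseSet) {
        for &e in &other.elems {
            self.bits[e] = true;
        }
    }
    fn subtract(&mut self, other: &SparseSet) {
        for &e in &other.elems {
            self.bits[e] = false;
        }
    }
}

/// Generic gen/kill application: union the gen set, then subtract the kill set.
pub fn apply(state: &mut impl BitSetExt, gen_set: &SparseSet, kill_set: &SparseSet) {
    state.union(gen_set);
    state.subtract(kill_set);
}
```

A `Borrow<BitSet>` bound could only be satisfied by types that contain a
`BitSet`; a trait of operations can be implemented by any
representation, which is what lets a second bitset type be used as a
domain.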

The commit also:
- Moves `static_assert_size` from `rustc_data_structures` to
  `rustc_index` so it can be used in the latter; the former now
  re-exports it so existing users are unaffected.
- Factors out some common "clear excess bits in the final word"
  functionality in `bit_set.rs`.
- Uses `fill` in a few places instead of loops.
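
The last two items can be illustrated together. This is a minimal
sketch assuming 64-bit words stored in a `Vec<u64>`; the helper names
`num_words`, `clear_excess_bits_in_final_word`, and `new_filled` are
chosen for this example and their signatures are not taken from
`bit_set.rs`:

```rust
const WORD_BITS: usize = 64;

/// Number of words needed to hold `domain_size` bits.
fn num_words(domain_size: usize) -> usize {
    (domain_size + WORD_BITS - 1) / WORD_BITS
}

/// Zero any bits in the final word at positions >= `domain_size`.
fn clear_excess_bits_in_final_word(domain_size: usize, words: &mut [u64]) {
    let used = domain_size % WORD_BITS;
    // When `used` is 0 the final word is fully in use (or there are no words).
    if used > 0 {
        if let Some(last) = words.last_mut() {
            *last &= (1u64 << used) - 1;
        }
    }
}

/// Build the words of a set containing every element in `0..domain_size`.
fn new_filled(domain_size: usize) -> Vec<u64> {
    let mut words = vec![0u64; num_words(domain_size)];
    // `slice::fill` replaces an explicit `for w in words.iter_mut()` loop.
    words.fill(!0);
    clear_excess_bits_in_final_word(domain_size, &mut words);
    words
}
```

Without the clearing step, a set over 70 elements would carry 58 stray
bits in its second word, which would corrupt operations such as
counting or comparing sets.
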
This commit is contained in:
Nicholas Nethercote
2022-02-10 00:47:48 +11:00
parent 523a1b1d38
commit 36b495f3cf
14 changed files with 806 additions and 75 deletions

@@ -30,10 +30,9 @@
 //!
 //! [gen-kill]: https://en.wikipedia.org/wiki/Data-flow_analysis#Bit_vector_problems
 
-use std::borrow::BorrowMut;
 use std::cmp::Ordering;
 
-use rustc_index::bit_set::{BitSet, HybridBitSet};
+use rustc_index::bit_set::{BitSet, ChunkedBitSet, HybridBitSet};
 use rustc_index::vec::Idx;
 use rustc_middle::mir::{self, BasicBlock, Location};
 use rustc_middle::ty::TyCtxt;
@@ -52,6 +51,51 @@ pub use self::engine::{Engine, Results};
 pub use self::lattice::{JoinSemiLattice, MeetSemiLattice};
 pub use self::visitor::{visit_results, ResultsVisitable, ResultsVisitor};
 
+/// Analysis domains are all bitsets of various kinds. This trait holds
+/// operations needed by all of them.
+pub trait BitSetExt<T> {
+    fn domain_size(&self) -> usize;
+    fn contains(&self, elem: T) -> bool;
+    fn union(&mut self, other: &HybridBitSet<T>);
+    fn subtract(&mut self, other: &HybridBitSet<T>);
+}
+
+impl<T: Idx> BitSetExt<T> for BitSet<T> {
+    fn domain_size(&self) -> usize {
+        self.domain_size()
+    }
+
+    fn contains(&self, elem: T) -> bool {
+        self.contains(elem)
+    }
+
+    fn union(&mut self, other: &HybridBitSet<T>) {
+        self.union(other);
+    }
+
+    fn subtract(&mut self, other: &HybridBitSet<T>) {
+        self.subtract(other);
+    }
+}
+
+impl<T: Idx> BitSetExt<T> for ChunkedBitSet<T> {
+    fn domain_size(&self) -> usize {
+        self.domain_size()
+    }
+
+    fn contains(&self, elem: T) -> bool {
+        self.contains(elem)
+    }
+
+    fn union(&mut self, other: &HybridBitSet<T>) {
+        self.union(other);
+    }
+
+    fn subtract(&mut self, other: &HybridBitSet<T>) {
+        self.subtract(other);
+    }
+}
+
 /// Define the domain of a dataflow problem.
 ///
 /// This trait specifies the lattice on which this analysis operates (the domain) as well as its
@@ -303,7 +347,7 @@ pub trait GenKillAnalysis<'tcx>: Analysis<'tcx> {
 impl<'tcx, A> Analysis<'tcx> for A
 where
     A: GenKillAnalysis<'tcx>,
-    A::Domain: GenKill<A::Idx> + BorrowMut<BitSet<A::Idx>>,
+    A::Domain: GenKill<A::Idx> + BitSetExt<A::Idx>,
 {
     fn apply_statement_effect(
         &self,
@@ -435,7 +479,7 @@ impl<T: Idx> GenKillSet<T> {
         }
     }
 
-    pub fn apply(&self, state: &mut BitSet<T>) {
+    pub fn apply(&self, state: &mut impl BitSetExt<T>) {
         state.union(&self.gen);
         state.subtract(&self.kill);
     }
@@ -463,6 +507,16 @@ impl<T: Idx> GenKill<T> for BitSet<T> {
     }
 }
 
+impl<T: Idx> GenKill<T> for ChunkedBitSet<T> {
+    fn gen(&mut self, elem: T) {
+        self.insert(elem);
+    }
+
+    fn kill(&mut self, elem: T) {
+        self.remove(elem);
+    }
+}
+
 impl<T: Idx> GenKill<T> for lattice::Dual<BitSet<T>> {
     fn gen(&mut self, elem: T) {
         self.0.insert(elem);