2020-06-29 17:12:20 -07:00
//! Finding the dominators in a control-flow graph.
//!
Implement the simple Lengauer-Tarjan algorithm
This replaces the previous implementation with the simple variant of
Lengauer-Tarjan, which performs better in the general case. Performance on the
keccak benchmark is about equivalent between the two, but we don't see
regressions (and indeed see improvements) on other benchmarks, even on a
partially optimized implementation.
The implementation here follows that of the pseudocode in "Linear-Time
Algorithms for Dominators and Related Problems" thesis by Loukas Georgiadis. The
next few commits will optimize the implementation as suggested in the thesis.
Several related works are cited in the comments within the implementation, as
well.
Implement the simple Lengauer-Tarjan algorithm
This replaces the previous implementation (from #34169), which has not been
optimized since, with the simple variant of Lengauer-Tarjan which performs
better in the general case. A previous attempt -- not kept in commit history --
attempted a replacement with a bitset-based implementation, but this led to
regressions on perf.rust-lang.org benchmarks and equivalent wins for the keccak
benchmark, so was rejected.
The implementation here follows that of the pseudocode in "Linear-Time
Algorithms for Dominators and Related Problems" thesis by Loukas Georgiadis. The
next few commits will optimize the implementation as suggested in the thesis.
Several related works are cited in the comments within the implementation, as
well.
On the keccak benchmark, we were previously spending 15% of our cycles computing
the NCA / intersect function; this function is quite expensive, especially on
modern CPUs, as it chases pointers on every iteration in a tight loop. With this
commit, we spend ~0.05% of our time in dominator computation.
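The old code is not shown here, so the following is only a sketch of the kind of "NCA / intersect" walk the message describes. It assumes the usual iterative formulation (as in Cooper, Harvey, and Kennedy's dominance algorithm), in which two nodes are walked up a partially built dominator tree until they meet. The names `intersect`, `idom`, and `rank` are hypothetical and are not taken from the replaced implementation.

```rust
/// Walks `a` and `b` up a partially built dominator tree until they meet.
///
/// Assumptions (hypothetical layout): `idom[n]` is the current
/// immediate-dominator guess for node `n`, and `rank[n]` is the postorder
/// number of `n`, so the entry node has the highest rank.
fn intersect(idom: &[usize], rank: &[usize], mut a: usize, mut b: usize) -> usize {
    while a != b {
        // Each step is a dependent load: the next index is only known once
        // the previous `idom[..]` read has completed.
        while rank[a] < rank[b] {
            a = idom[a];
        }
        while rank[b] < rank[a] {
            b = idom[b];
        }
    }
    a
}
```

For a diamond-shaped graph (0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3) with `idom = [0, 0, 0, 0]` and `rank = [3, 1, 2, 0]`, `intersect(&idom, &rank, 1, 2)` walks both nodes up to the entry node 0.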
2021-05-06 17:24:09 -04:00
//! Algorithm based on Loukas Georgiadis,
//! "Linear-Time Algorithms for Dominators and Related Problems",
2021-05-10 15:50:50 -04:00
//! <ftp://ftp.cs.princeton.edu/techreports/2005/737.pdf>
//!
//! Additionally useful is the original Lengauer-Tarjan paper on this subject,
//! "A Fast Algorithm for Finding Dominators in a Flowgraph"
//! Thomas Lengauer and Robert Endre Tarjan.
//! <https://www.cs.princeton.edu/courses/archive/spr03/cs423/download/dominators.pdf>
2016-06-09 15:49:07 -07:00

2018-07-01 16:54:01 -04:00
use super::ControlFlowGraph;
2023-04-19 10:57:17 +00:00
use rustc_index::{Idx, IndexSlice, IndexVec};

Updates to experimental coverage counter injection
This is a combination of 18 commits.
Commit #2:
Additional examples and some small improvements.
Commit #3:
fixed mir-opt non-mir extensions and spanview title elements
Corrected a fairly recent assumption in runtest.rs that all MIR dump
files end in .mir. (It was appending .mir to the graphviz .dot and
spanview .html file names when generating blessed output files. That
also left outdated files in the baseline alongside the files with the
incorrect names, which I've now removed.)
Updated spanview HTML title elements to match their content, replacing a
hardcoded and incorrect name that was left in accidentally when
originally submitted.
Commit #4:
added more test examples
also improved Makefiles with support for non-zero exit status and to
force validation of tests unless a specific test overrides it with a
specific comment.
Commit #5:
Fixed rare issues after testing on real-world crate
Commit #6:
Addressed PR feedback, and removed temporary -Zexperimental-coverage
-Zinstrument-coverage once again supports the latest capabilities of
LLVM instrprof coverage instrumentation.
Also fixed a bug in spanview.
Commit #7:
Fix closure handling, add tests for closures and inner items
And cleaned up other tests for consistency, and to make it more clear
where spans start/end by breaking up lines.
Commit #8:
renamed "typical" test results "expected"
Now that the `llvm-cov show` tests are improved to normally expect
matching actuals, and to allow individual tests to override that
expectation.
Commit #9:
test coverage of inline generic struct function
Commit #10:
Addressed review feedback
* Removed unnecessary Unreachable filter.
* Replaced a match wildcard with remaining variants.
* Added more comments to help clarify the role of successors() in the
CFG traversal
Commit #11:
refactoring based on feedback
* refactored `fn coverage_spans()`.
* changed the way I expand an empty coverage span to improve performance
* fixed a typo that I had accidentally left in, in visit.rs
Commit #12:
Optimized use of SourceMap and SourceFile
Commit #13:
Fixed a regression, and synched with upstream
Some generated test file names changed due to some new change upstream.
Commit #14:
Stripping out crate disambiguators from demangled names
These can vary depending on the test platform.
Commit #15:
Ignore llvm-cov show diff on test with generics, expand IO error message
Tests with generics produce llvm-cov show results with demangled names
that can include an unstable "crate disambiguator" (hex value). The
value changes when run in the Rust CI Windows environment. I added a sed
filter to strip them out (in a prior commit), but sed also appears to
fail in the same environment. Until I can figure out a workaround, I'm
just going to ignore this specific test result. I added a FIXME to
follow up later, but it's not that critical.
I also saw an error with Windows GNU, but the IO error did not
specify a path for the directory or file that triggered the error. I
updated the error messages to provide more info for next time, but also
noticed some other tests with similar steps did not fail. Looks
spurious.
Commit #16:
Modify rust-demangler to strip disambiguators by default
Commit #17:
Remove std::process::exit from coverage tests
Due to Issue #77553, programs that call std::process::exit() do not
generate coverage results on Windows MSVC.
Commit #18:
fix: test file paths exceeding Windows max path len
2020-09-01 16:15:17 -07:00
use std::cmp::Ordering;
2016-06-09 15:49:07 -07:00

#[cfg(test)]
2019-08-01 23:57:23 +03:00
mod tests;
2016-06-09 15:49:07 -07:00

2021-05-09 19:10:17 -04:00
struct PreOrderFrame<Iter> {
    pre_order_idx: PreorderIndex,
2021-05-06 17:24:09 -04:00
    iter: Iter,
}

2021-05-09 19:10:17 -04:00
rustc_index::newtype_index! {
2022-12-18 21:47:28 +01:00
    struct PreorderIndex {}
2021-05-09 19:10:17 -04:00
}

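For orientation, the relation that `dominator_tree` computes can be stated by brute force. The sketch below is not the Lengauer-Tarjan algorithm used in this file. It applies the definition directly: v dominates w if w is reachable from the start node but becomes unreachable once v is removed. The names `reachable` and `naive_idoms` are hypothetical, and the graph is assumed to be a plain adjacency list rather than a `ControlFlowGraph`.

```rust
/// Marks every node reachable from `start` when the node `removed`
/// (if any) is treated as deleted from the graph.
fn reachable(succ: &[Vec<usize>], start: usize, removed: Option<usize>) -> Vec<bool> {
    let mut seen = vec![false; succ.len()];
    if removed == Some(start) {
        return seen;
    }
    let mut stack = vec![start];
    seen[start] = true;
    while let Some(n) = stack.pop() {
        for &s in &succ[n] {
            if Some(s) != removed && !seen[s] {
                seen[s] = true;
                stack.push(s);
            }
        }
    }
    seen
}

/// Brute-force immediate dominators, straight from the definition.
/// Runs one traversal per node, so it is for illustration only.
fn naive_idoms(succ: &[Vec<usize>], start: usize) -> Vec<Option<usize>> {
    let n = succ.len();
    let base = reachable(succ, start, None);
    // doms[w] collects the strict dominators of w.
    let mut doms: Vec<Vec<usize>> = vec![Vec::new(); n];
    for v in 0..n {
        if !base[v] {
            continue; // unreachable nodes dominate nothing
        }
        let without_v = reachable(succ, start, Some(v));
        for w in 0..n {
            if w != v && base[w] && !without_v[w] {
                doms[w].push(v);
            }
        }
    }
    // The strict dominators of w form a chain, so the immediate dominator
    // is the one that itself has the most strict dominators.
    (0..n)
        .map(|w| doms[w].iter().copied().max_by_key(|&d| doms[d].len()))
        .collect()
}
```

The start node and unreachable nodes get `None`. In the graph 0 -> 1, 1 -> {2, 3}, 2 -> 4, 3 -> 4, with an unreachable node 5 -> 4, node 4 has immediate dominator 1, and the unreachable predecessor 5 plays no part.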
2023-01-21 00:00:00 +00:00
pub fn dominator_tree<G: ControlFlowGraph>(graph: G) -> DominatorTree<G::Node> {
2016-06-09 15:49:07 -07:00
    // compute the post order index (rank) for each node
2021-01-24 13:32:18 +01:00
    let mut post_order_rank = IndexVec::from_elem_n(0, graph.num_nodes());
2021-05-10 10:50:08 -04:00

    // We allocate capacity for the full set of nodes, because most of the time
    // most of the nodes *are* reachable.
    let mut parent: IndexVec<PreorderIndex, PreorderIndex> =
        IndexVec::with_capacity(graph.num_nodes());
2021-05-06 17:24:09 -04:00

2021-05-09 19:10:17 -04:00
    let mut stack = vec![PreOrderFrame {
        pre_order_idx: PreorderIndex::new(0),
        iter: graph.successors(graph.start_node()),
    }];
    let mut pre_order_to_real: IndexVec<PreorderIndex, G::Node> =
        IndexVec::with_capacity(graph.num_nodes());
    let mut real_to_pre_order: IndexVec<G::Node, Option<PreorderIndex>> =
2021-05-08 19:13:11 -04:00
        IndexVec::from_elem_n(None, graph.num_nodes());
    pre_order_to_real.push(graph.start_node());
2021-05-10 10:50:08 -04:00
    parent.push(PreorderIndex::new(0)); // the parent of the root node is the root for now.
2021-05-09 19:10:17 -04:00
    real_to_pre_order[graph.start_node()] = Some(PreorderIndex::new(0));
2021-05-08 19:50:50 -04:00
    let mut post_order_idx = 0;
2021-05-06 17:24:09 -04:00

2021-05-10 15:50:50 -04:00
    // Traverse the graph, collecting a number of things:
    //
    // * Preorder mapping (from each node to its preorder index, and back)
    // * Postorder mapping (used exclusively for rank_partial_cmp on the final product)
    // * Parents for each vertex in the preorder tree
    //
    // These are all done here rather than through one of the 'standard'
    // graph traversals to help make this fast.
2021-05-06 17:24:09 -04:00
    'recurse: while let Some(frame) = stack.last_mut() {
        while let Some(successor) = frame.iter.next() {
2021-05-09 12:56:58 -04:00
            if real_to_pre_order[successor].is_none() {
2021-05-09 19:10:17 -04:00
                let pre_order_idx = pre_order_to_real.push(successor);
2021-05-09 18:59:52 -04:00
                real_to_pre_order[successor] = Some(pre_order_idx);
2021-05-10 10:50:08 -04:00
                parent.push(frame.pre_order_idx);
2021-05-09 19:10:17 -04:00
                stack.push(PreOrderFrame { pre_order_idx, iter: graph.successors(successor) });
2016-06-09 15:49:07 -07:00

2021-05-06 17:24:09 -04:00
                continue 'recurse;
2016-06-09 15:49:07 -07:00
            }
        }
2021-05-09 19:10:17 -04:00
        post_order_rank[pre_order_to_real[frame.pre_order_idx]] = post_order_idx;
2021-05-08 19:50:50 -04:00
        post_order_idx += 1;

2021-05-06 17:24:09 -04:00
        stack.pop();
    }

2021-05-09 18:59:52 -04:00
    let reachable_vertices = pre_order_to_real.len();

2021-05-09 19:10:17 -04:00
    let mut idom = IndexVec::from_elem_n(PreorderIndex::new(0), reachable_vertices);
2021-05-09 18:59:52 -04:00
    let mut semi = IndexVec::from_fn_n(std::convert::identity, reachable_vertices);
2021-05-06 17:24:09 -04:00
    let mut label = semi.clone();
2021-05-09 18:59:52 -04:00
    let mut bucket = IndexVec::from_elem_n(vec![], reachable_vertices);
2021-05-09 14:02:24 -04:00
    let mut lastlinked = None;
2021-05-06 17:24:09 -04:00

2021-05-10 15:50:50 -04:00
    // We loop over vertices in reverse preorder. This implements the pseudocode
    // of the simple Lengauer-Tarjan algorithm. A few key facts are noted here
    // which are helpful for understanding the code (full proofs and such are
    // found in various papers, including one cited at the top of this file).
    //
    // For each vertex w (which is not the root),
    //  * semi[w] is a proper ancestor of the vertex w (i.e., semi[w] != w)
    //  * idom[w] is an ancestor of semi[w] (i.e., idom[w] may equal semi[w])
    //
    // The immediate dominator of w (idom[w]) is the vertex v != w such that v
    // dominates w and every other dominator of w (apart from w itself)
    // dominates v. (Every vertex except the root has a unique immediate
    // dominator.)
    //
    // A semidominator for a given vertex w (semi[w]) is the vertex v with minimum
    // preorder number such that there exists a path from v to w in which all
    // vertices other than v and w have preorder numbers greater than w (i.e.,
    // no intermediate vertex lies on the tree path from the root to w).
2021-05-09 19:10:17 -04:00
    for w in (PreorderIndex::new(1)..PreorderIndex::new(reachable_vertices)).rev() {
2021-05-08 19:13:11 -04:00
        // Optimization: process buckets just once, at the start of the
        // iteration. Do not explicitly empty the bucket (even though it will
        // not be used again), to save some instructions.
2021-05-10 15:50:50 -04:00
        //
        // The bucket here contains the vertices processed so far whose
        // semidominator is the vertex z = parent[w]: any vertex semidominated
        // by z has a preorder number exceeding z, so the reverse preorder loop
        // reaches it (and places it in the bucket) before it reaches z.
        //
        // We compute a partial set of immediate dominators here.
2021-05-10 10:50:08 -04:00
        let z = parent[w];
2021-05-08 19:13:11 -04:00
        for &v in bucket[z].iter() {
2021-05-10 15:50:50 -04:00
            // This uses the result of Lemma 5 of section 2 of the original
            // 1979 paper to compute either the immediate or relative dominator
            // for a given vertex v.
            //
            // eval returns a vertex y, for which semi[y] is minimum among
            // vertices semi[v] +> y *> v. Note that semi[v] = z as we're in the
            // z bucket.
            //
            // Given such a vertex y, semi[y] <= semi[v]. If semi[y] < semi[v],
            // then idom[v] = idom[y]; if semi[y] = semi[v], then
            // idom[v] = semi[v].
            //
            // Using this, we can either set idom[v] to be:
            //  * semi[v] (i.e. z), if semi[y] is z
            //  * idom[y], otherwise
            //
            // We don't set idom[v] to idom[y] directly, though, as idom[y] is
            // not necessarily known yet; we store y itself instead. The second
            // preorder traversal will clean up by updating the idom for any
            // that were missed in this pass.
2021-05-08 19:13:11 -04:00
            let y = eval(&mut parent, lastlinked, &semi, &mut label, v);
            idom[v] = if semi[y] < z { y } else { z };
2021-05-09 14:05:32 -04:00
        }

2021-05-10 15:50:50 -04:00
        // This loop computes the semi[w] for w.
2021-05-06 17:24:09 -04:00
        semi[w] = w;
2021-05-08 19:13:11 -04:00
        for v in graph.predecessors(pre_order_to_real[w]) {
2023-01-06 16:26:56 +00:00
            // TL;DR: Reachable vertices may have unreachable predecessors, so ignore any of them.
            //
            // Ignore blocks which are not connected to the entry block.
            //
            // The algorithm that was used to traverse the graph and build the
            // `pre_order_to_real` and `real_to_pre_order` vectors does so by
            // starting from the entry block and following the successors.
            // Therefore, any blocks not reachable from the entry block will be
            // left as `None` in the `real_to_pre_order` vector.
            //
            // For example, in this graph, A and B should be skipped:
            //
            //            ┌─────┐
            //            │     │
            //            └──┬──┘
            //               │
            //            ┌──▼──┐              ┌─────┐
            //            │     │              │  A  │
            //            └──┬──┘              └──┬──┘
            //               │                    │
            //       ┌───────┴───────┐            │
            //       │               │            │
            //    ┌──▼──┐         ┌──▼──┐      ┌──▼──┐
            //    │     │         │     │      │  B  │
            //    └──┬──┘         └──┬──┘      └──┬──┘
            //       │               └──────┬─────┘
            //    ┌──▼──┐                   │
            //    │     │                   │
            //    └──┬──┘                ┌──▼──┐
            //       │                   │     │
            //       │                   └─────┘
            //    ┌──▼──┐
            //    │     │
            //    └──┬──┘
            //       │
            //    ┌──▼──┐
            //    │     │
            //    └─────┘
            //
            // ...this may be the case if a MirPass modifies the CFG to remove
            // or rearrange certain blocks/edges.
2023-01-08 18:23:13 -08:00
            let Some(v) = real_to_pre_order[v] else {
                continue
            };
2021-05-10 15:50:50 -04:00

            // eval returns a vertex x for which semi[x] is minimum among
            // vertices semi[v] +> x *> v.
            //
            // From Lemma 4 of section 2, we know that the semidominator of a
            // vertex w is the minimum (by preorder number) vertex of the
            // following:
            //
            //  * direct predecessors of w with preorder number less than w
            //  * semidominators of u such that u > w and there exists an edge
            //    (v, w) such that u *> v
            //
            // This loop therefore identifies such a minimum. Note that any
            // semidominator path to w passes, after its first vertex, only
            // through vertices numbered greater than w, so the reverse preorder
            // traversal we are using guarantees that all of the information we
            // might need is available at this point.
            //
            // The eval call gives us x, and semi[x] is either:
            //
            //  * v itself, if v has not yet been processed
            //  * A possible 'best' semidominator for w.
2021-05-08 19:13:11 -04:00
            let x = eval(&mut parent, lastlinked, &semi, &mut label, v);
            semi[w] = std::cmp::min(semi[w], semi[x]);
2021-05-06 17:24:09 -04:00
|
|
|
}
|
2021-05-10 15:50:50 -04:00
|
|
|
// semi[w] is now semidominator(w) and won't change any more.
|
2021-05-06 17:24:09 -04:00
|
|
|
|
2021-05-09 14:06:05 -04:00
|
|
|
// Optimization: Do not insert into buckets if parent[w] = semi[w], as
|
|
|
|
|
// we then immediately know the idom.
|
2021-05-10 15:50:50 -04:00
|
|
|
//
|
|
|
|
|
// If we don't yet know the idom directly, then push this vertex into
|
|
|
|
|
// our semidominator's bucket, where it will get processed at a later
|
|
|
|
|
// stage to compute its immediate dominator.
|
2021-05-10 10:50:08 -04:00
|
|
|
if parent[w] != semi[w] {
|
2021-05-09 14:06:05 -04:00
|
|
|
bucket[semi[w]].push(w);
|
|
|
|
|
} else {
|
2021-05-10 10:50:08 -04:00
|
|
|
idom[w] = parent[w];
|
2021-05-09 14:06:05 -04:00
|
|
|
}
|
2021-05-06 17:24:09 -04:00
|
|
|
|
2021-05-09 14:02:24 -04:00
|
|
|
// Optimization: We share the parent array between processed and not
|
|
|
|
|
// processed elements; lastlinked represents the divider.
|
|
|
|
|
lastlinked = Some(w);
|
2021-05-06 17:24:09 -04:00
|
|
|
}
|
2021-05-10 15:50:50 -04:00
|
|
|
|
|
|
|
|
// Finalize the idoms for any that were not fully settable during initial
|
|
|
|
|
// traversal.
|
|
|
|
|
//
|
|
|
|
|
// If idom[w] != semi[w] then we know that we've stored vertex y from above
|
|
|
|
|
// into idom[w]. It is known to be our 'relative dominator', which means
|
|
|
|
|
// that it's one of w's ancestors and has the same immediate dominator as w,
|
|
|
|
|
// so use that idom.
|
2021-05-09 19:10:17 -04:00
|
|
|
for w in PreorderIndex::new(1)..PreorderIndex::new(reachable_vertices) {
|
2021-05-06 17:24:09 -04:00
|
|
|
if idom[w] != semi[w] {
|
|
|
|
|
idom[w] = idom[idom[w]];
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
let mut immediate_dominators = IndexVec::from_elem_n(None, graph.num_nodes());
|
2021-05-09 19:10:17 -04:00
|
|
|
for (idx, node) in pre_order_to_real.iter_enumerated() {
|
2021-05-08 19:13:11 -04:00
|
|
|
immediate_dominators[*node] = Some(pre_order_to_real[idom[idx]]);
|
2016-06-09 15:49:07 -07:00
|
|
|
}
|
|
|
|
|
|
2023-05-14 00:00:00 +00:00
|
|
|
let start_node = graph.start_node();
|
|
|
|
|
immediate_dominators[start_node] = None;
|
2023-01-21 00:00:00 +00:00
|
|
|
DominatorTree { start_node, post_order_rank, immediate_dominators }
|
2016-06-09 15:49:07 -07:00
|
|
|
}
|
|
|
|
|
|
2021-05-10 15:50:50 -04:00
|
|
|
/// Evaluate the link-eval virtual forest, providing the current minimum semi
|
|
|
|
|
/// value for the passed `node` (which may be itself).
|
|
|
|
|
///
|
|
|
|
|
/// This maintains that for every vertex v, `label[v]` is such that:
|
|
|
|
|
///
|
|
|
|
|
/// ```text
|
|
|
|
|
/// semi[eval(v)] = min { semi[label[u]] | root_in_forest(v) +> u *> v }
|
|
|
|
|
/// ```
|
|
|
|
|
///
|
|
|
|
|
/// where `+>` is a proper ancestor and `*>` is just an ancestor.
|
2021-05-09 19:10:17 -04:00
|
|
|
#[inline]
|
|
|
|
|
fn eval(
|
2023-03-31 00:32:44 -07:00
|
|
|
ancestor: &mut IndexSlice<PreorderIndex, PreorderIndex>,
|
2021-05-09 19:10:17 -04:00
|
|
|
lastlinked: Option<PreorderIndex>,
|
2023-03-31 00:32:44 -07:00
|
|
|
semi: &IndexSlice<PreorderIndex, PreorderIndex>,
|
|
|
|
|
label: &mut IndexSlice<PreorderIndex, PreorderIndex>,
|
2021-05-09 19:10:17 -04:00
|
|
|
node: PreorderIndex,
|
|
|
|
|
) -> PreorderIndex {
|
2021-05-08 19:13:11 -04:00
|
|
|
if is_processed(node, lastlinked) {
|
|
|
|
|
compress(ancestor, lastlinked, semi, label, node);
|
2021-05-06 17:24:09 -04:00
|
|
|
label[node]
|
|
|
|
|
} else {
|
|
|
|
|
node
|
|
|
|
|
}
|
|
|
|
|
}
|
2016-06-09 15:49:07 -07:00
|
|
|
|
2021-05-09 19:10:17 -04:00
|
|
|
#[inline]
|
|
|
|
|
fn is_processed(v: PreorderIndex, lastlinked: Option<PreorderIndex>) -> bool {
|
2021-05-08 19:13:11 -04:00
|
|
|
if let Some(ll) = lastlinked { v >= ll } else { false }
|
2021-05-09 14:02:24 -04:00
|
|
|
}
|
|
|
|
|
|
2021-05-09 19:10:17 -04:00
|
|
|
#[inline]
|
|
|
|
|
fn compress(
|
2023-03-31 00:32:44 -07:00
|
|
|
ancestor: &mut IndexSlice<PreorderIndex, PreorderIndex>,
|
2021-05-09 19:10:17 -04:00
|
|
|
lastlinked: Option<PreorderIndex>,
|
2023-03-31 00:32:44 -07:00
|
|
|
semi: &IndexSlice<PreorderIndex, PreorderIndex>,
|
|
|
|
|
label: &mut IndexSlice<PreorderIndex, PreorderIndex>,
|
2021-05-09 19:10:17 -04:00
|
|
|
v: PreorderIndex,
|
2021-05-06 17:24:09 -04:00
|
|
|
) {
|
2021-05-08 19:13:11 -04:00
|
|
|
assert!(is_processed(v, lastlinked));
|
2022-02-23 10:31:26 -05:00
|
|
|
// Compute the processed list of ancestors
|
|
|
|
|
//
|
|
|
|
|
// We use an explicit stack here instead of recursion, so that deep ancestor
// chains cannot exhaust the call stack.
|
|
|
|
|
let mut stack: smallvec::SmallVec<[_; 8]> = smallvec::smallvec![v];
|
|
|
|
|
let mut u = ancestor[v];
|
|
|
|
|
while is_processed(u, lastlinked) {
|
|
|
|
|
stack.push(u);
|
|
|
|
|
u = ancestor[u];
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// Then in reverse order, popping the stack
|
|
|
|
|
for &[v, u] in stack.array_windows().rev() {
|
2021-05-08 19:13:11 -04:00
|
|
|
if semi[label[u]] < semi[label[v]] {
|
2021-05-06 17:24:09 -04:00
|
|
|
label[v] = label[u];
|
2016-06-09 15:49:07 -07:00
|
|
|
}
|
2021-05-06 17:24:09 -04:00
|
|
|
ancestor[v] = ancestor[u];
|
2016-06-09 15:49:07 -07:00
|
|
|
}
|
2021-05-06 17:24:09 -04:00
|
|
|
}
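// The effect of one `compress` call can be seen on a hand-built chain. The
// sketch below is an illustration only: `compress_path` is a hypothetical
// name, indices are plain `usize`, and `last` replaces `lastlinked` (vertices
// with index >= `last` count as processed).

```rust
/// Path compression over the processed part of an ancestor chain.
/// After the call, `v` and every processed ancestor on its path point
/// directly at the first unprocessed ancestor, and `label[v]` names the
/// vertex with the smallest `semi` value seen along the path.
fn compress_path(ancestor: &mut [usize], label: &mut [usize], semi: &[usize], last: usize, v: usize) {
    assert!(v >= last, "only processed vertices may be compressed");
    // Walk upwards, remembering every processed vertex on the way.
    let mut stack = vec![v];
    let mut u = ancestor[v];
    while u >= last {
        stack.push(u);
        u = ancestor[u];
    }
    // Revisit the pairs from the top of the chain down to `v`.
    for pair in stack.windows(2).rev() {
        let (lo, hi) = (pair[0], pair[1]);
        if semi[label[hi]] < semi[label[lo]] {
            label[lo] = label[hi];
        }
        ancestor[lo] = ancestor[hi];
    }
}

/// Runs `compress_path` on the chain 0 <- 1 <- 2 <- 3 <- 4 with vertices 2..=4 processed.
fn compress_demo() -> (Vec<usize>, Vec<usize>) {
    let mut ancestor = vec![0, 0, 1, 2, 3];
    let mut label = vec![0, 1, 2, 3, 4];
    let semi = vec![0, 0, 1, 0, 3];
    compress_path(&mut ancestor, &mut label, &semi, 2, 4);
    (ancestor, label)
}
```

// `compress_demo()` yields `ancestor == [0, 0, 1, 1, 1]` and
// `label == [0, 1, 2, 3, 3]`: vertices 3 and 4 now skip straight to the
// unprocessed vertex 1, and vertex 4 inherits the label of vertex 3, whose
// `semi` value is smaller.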
|
2018-08-09 17:00:14 +02:00
|
|
|
|
2023-01-06 22:04:25 +00:00
|
|
|
/// Tracks the list of dominators for each node.
|
2016-06-09 15:49:07 -07:00
|
|
|
#[derive(Clone, Debug)]
|
2023-01-21 00:00:00 +00:00
|
|
|
pub struct DominatorTree<N: Idx> {
|
2023-05-14 00:00:00 +00:00
|
|
|
start_node: N,
|
2016-06-09 15:49:07 -07:00
|
|
|
post_order_rank: IndexVec<N, usize>,
|
2023-01-06 22:04:25 +00:00
|
|
|
// Even though we track only the immediate dominator of each node, it's
|
|
|
|
|
// possible to get its full list of dominators by looking up the dominator
|
|
|
|
|
// of each dominator. (See the `impl Iterator for Iter` definition).
|
2016-06-09 15:49:07 -07:00
|
|
|
immediate_dominators: IndexVec<N, Option<N>>,
|
|
|
|
|
}
|
|
|
|
|
|
2023-01-21 00:00:00 +00:00
|
|
|
impl<Node: Idx> DominatorTree<Node> {
|
2023-05-14 00:00:00 +00:00
|
|
|
/// Returns true if node is reachable from the start node.
|
2016-06-09 15:49:07 -07:00
|
|
|
pub fn is_reachable(&self, node: Node) -> bool {
|
2023-05-14 00:00:00 +00:00
|
|
|
node == self.start_node || self.immediate_dominators[node].is_some()
|
2016-06-09 15:49:07 -07:00
|
|
|
}
|
|
|
|
|
|
2023-05-14 00:00:00 +00:00
|
|
|
/// Returns the immediate dominator of node, if any.
|
|
|
|
|
pub fn immediate_dominator(&self, node: Node) -> Option<Node> {
|
|
|
|
|
self.immediate_dominators[node]
|
2016-06-09 15:49:07 -07:00
|
|
|
}
|
|
|
|
|
|
2023-01-06 22:04:25 +00:00
|
|
|
/// Provides an iterator over each dominator up the CFG, for the given Node.
|
|
|
|
|
/// See the `impl Iterator for Iter` definition to understand how this works.
|
2019-02-09 01:36:22 +09:00
|
|
|
pub fn dominators(&self, node: Node) -> Iter<'_, Node> {
|
2022-12-19 10:31:55 +01:00
|
|
|
assert!(self.is_reachable(node), "node {node:?} is not reachable");
|
2023-01-21 00:00:00 +00:00
|
|
|
Iter { dom_tree: self, node: Some(node) }
|
2016-06-09 15:49:07 -07:00
|
|
|
}
|
Updates to experimental coverage counter injection
This is a combination of 18 commits.
Commit #2:
Additional examples and some small improvements.
Commit #3:
fixed mir-opt non-mir extensions and spanview title elements
Corrected a fairly recent assumption in runtest.rs that all MIR dump
files end in .mir. (It was appending .mir to the graphviz .dot and
spanview .html file names when generating blessed output files. That
also left outdated files in the baseline alongside the files with the
incorrect names, which I've now removed.)
Updated spanview HTML title elements to match their content, replacing a
hardcoded and incorrect name that was left in accidentally when
originally submitted.
Commit #4:
added more test examples
also improved Makefiles with support for non-zero exit status and to
force validation of tests unless a specific test overrides it with a
specific comment.
Commit #5:
Fixed rare issues after testing on real-world crate
Commit #6:
Addressed PR feedback, and removed temporary -Zexperimental-coverage
-Zinstrument-coverage once again supports the latest capabilities of
LLVM instrprof coverage instrumentation.
Also fixed a bug in spanview.
Commit #7:
Fix closure handling, add tests for closures and inner items
And cleaned up other tests for consistency, and to make it more clear
where spans start/end by breaking up lines.
Commit #8:
renamed "typical" test results "expected"
Now that the `llvm-cov show` tests are improved to normally expect
matching actuals, and to allow individual tests to override that
expectation.
Commit #9:
test coverage of inline generic struct function
Commit #10:
Addressed review feedback
* Removed unnecessary Unreachable filter.
* Replaced a match wildcard with remining variants.
* Added more comments to help clarify the role of successors() in the
CFG traversal
Commit #11:
refactoring based on feedback
* refactored `fn coverage_spans()`.
* changed the way I expand an empty coverage span to improve performance
* fixed a typo that I had accidentally left in, in visit.rs
Commit #12:
Optimized use of SourceMap and SourceFile
Commit #13:
Fixed a regression, and synched with upstream
Some generated test file names changed due to some new change upstream.
Commit #14:
Stripping out crate disambiguators from demangled names
These can vary depending on the test platform.
Commit #15:
Ignore llvm-cov show diff on test with generics, expand IO error message
Tests with generics produce llvm-cov show results with demangled names
that can include an unstable "crate disambiguator" (hex value). The
value changes when run in the Rust CI Windows environment. I added a sed
filter to strip them out (in a prior commit), but sed also appears to
fail in the same environment. Until I can figure out a workaround, I'm
just going to ignore this specific test result. I added a FIXME to
follow up later, but it's not that critical.
I also saw an error with Windows GNU, but the IO error did not
specify a path for the directory or file that triggered the error. I
updated the error messages to provide more info for next time, but also
noticed some other tests with similar steps did not fail. Looks
spurious.
Commit #16:
Modify rust-demangler to strip disambiguators by default
Commit #17:
Remove std::process::exit from coverage tests
Due to Issue #77553, programs that call std::process::exit() do not
generate coverage results on Windows MSVC.
Commit #18:
fix: test file paths exceeding Windows max path len
2020-09-01 16:15:17 -07:00
|
|
|
|
|
|
|
|
/// Provide deterministic ordering of nodes such that, if any two nodes have a dominator
|
|
|
|
|
/// relationship, the dominator will always precede the dominated. (The relative ordering
|
|
|
|
|
/// of two unrelated nodes will also be consistent, but otherwise the order has no
|
|
|
|
|
/// meaning.) This method cannot be used to determine if either Node dominates the other.
|
|
|
|
|
pub fn rank_partial_cmp(&self, lhs: Node, rhs: Node) -> Option<Ordering> {
|
2023-01-18 00:00:00 +00:00
|
|
|
self.post_order_rank[rhs].partial_cmp(&self.post_order_rank[lhs])
|
Updates to experimental coverage counter injection
This is a combination of 18 commits.
Commit #2:
Additional examples and some small improvements.
Commit #3:
fixed mir-opt non-mir extensions and spanview title elements
Corrected a fairly recent assumption in runtest.rs that all MIR dump
files end in .mir. (It was appending .mir to the graphviz .dot and
spanview .html file names when generating blessed output files. That
also left outdated files in the baseline alongside the files with the
incorrect names, which I've now removed.)
Updated spanview HTML title elements to match their content, replacing a
hardcoded and incorrect name that was left in accidentally when
originally submitted.
Commit #4:
added more test examples
also improved Makefiles with support for non-zero exit status and to
force validation of tests unless a specific test overrides it with a
specific comment.
Commit #5:
Fixed rare issues after testing on real-world crate
Commit #6:
Addressed PR feedback, and removed temporary -Zexperimental-coverage
-Zinstrument-coverage once again supports the latest capabilities of
LLVM instrprof coverage instrumentation.
Also fixed a bug in spanview.
Commit #7:
Fix closure handling, add tests for closures and inner items
And cleaned up other tests for consistency, and to make it more clear
where spans start/end by breaking up lines.
Commit #8:
renamed "typical" test results "expected"
Now that the `llvm-cov show` tests are improved to normally expect
matching actuals, and to allow individual tests to override that
expectation.
Commit #9:
test coverage of inline generic struct function
Commit #10:
Addressed review feedback
* Removed unnecessary Unreachable filter.
* Replaced a match wildcard with remining variants.
* Added more comments to help clarify the role of successors() in the
CFG traversal
Commit #11:
refactoring based on feedback
* refactored `fn coverage_spans()`.
* changed the way I expand an empty coverage span to improve performance
* fixed a typo that I had accidently left in, in visit.rs
Commit #12:
Optimized use of SourceMap and SourceFile
Commit #13:
Fixed a regression, and synched with upstream
Some generated test file names changed due to some new change upstream.
Commit #14:
Stripping out crate disambiguators from demangled names
These can vary depending on the test platform.
Commit #15:
Ignore llvm-cov show diff on test with generics, expand IO error message
Tests with generics produce llvm-cov show results with demangled names
that can include an unstable "crate disambiguator" (hex value). The
value changes when run in the Rust CI Windows environment. I added a sed
filter to strip them out (in a prior commit), but sed also appears to
fail in the same environment. Until I can figure out a workaround, I'm
just going to ignore this specific test result. I added a FIXME to
follow up later, but it's not that critical.
I also saw an error with Windows GNU, but the IO error did not
specify a path for the directory or file that triggered the error. I
updated the error messages to provide more info for next time, but also
noticed some other tests with similar steps did not fail. Looks
spurious.
Commit #16:
Modify rust-demangler to strip disambiguators by default
Commit #17:
Remove std::process::exit from coverage tests
Due to Issue #77553, programs that call std::process::exit() do not
generate coverage results on Windows MSVC.
Commit #18:
fix: test file paths exceeding Windows max path len
2020-09-01 16:15:17 -07:00
    }
2016-06-09 15:49:07 -07:00
}
2019-02-09 01:36:22 +09:00
pub struct Iter<'dom, Node: Idx> {
2023-01-21 00:00:00 +00:00
    dom_tree: &'dom DominatorTree<Node>,
2016-06-09 15:49:07 -07:00
    node: Option<Node>,
}

impl<'dom, Node: Idx> Iterator for Iter<'dom, Node> {
    type Item = Node;

    fn next(&mut self) -> Option<Self::Item> {
        if let Some(node) = self.node {
2023-01-21 00:00:00 +00:00
            self.node = self.dom_tree.immediate_dominator(node);
2020-03-20 15:03:11 +01:00
            Some(node)
2016-06-09 15:49:07 -07:00
        } else {
2020-03-20 15:03:11 +01:00
            None
2016-06-09 15:49:07 -07:00
        }
    }
}
2023-01-21 00:00:00 +00:00
#[derive(Clone, Debug)]
pub struct Dominators<Node: Idx> {
    time: IndexVec<Node, Time>,
}

/// Describes the number of vertices discovered at the time when processing of a particular vertex
/// started and when it finished. Both values are zero for unreachable vertices.
#[derive(Copy, Clone, Default, Debug)]
struct Time {
    start: u32,
    finish: u32,
}

impl<Node: Idx> Dominators<Node> {
    pub fn dummy() -> Self {
        Self { time: Default::default() }
    }

    /// Returns true if `a` dominates `b`.
    ///
    /// # Panics
    ///
    /// Panics if `b` is unreachable.
    pub fn dominates(&self, a: Node, b: Node) -> bool {
        let a = self.time[a];
        let b = self.time[b];
        assert!(b.start != 0, "node {b:?} is not reachable");
        a.start <= b.start && b.finish <= a.finish
    }
}

pub fn dominators<N: Idx>(tree: &DominatorTree<N>) -> Dominators<N> {
    let DominatorTree { start_node, ref immediate_dominators, post_order_rank: _ } = *tree;

    // Transpose the dominator tree edges, so that child nodes of vertex v are stored in
2023-05-17 10:29:12 +00:00
    // node[edges[v].start..edges[v].end].
2023-01-21 00:00:00 +00:00
    let mut edges: IndexVec<N, std::ops::Range<u32>> =
        IndexVec::from_elem(0..0, immediate_dominators);
    for &idom in immediate_dominators.iter() {
        if let Some(idom) = idom {
            edges[idom].end += 1;
        }
    }
    let mut m = 0;
    for e in edges.iter_mut() {
        m += e.end;
        e.start = m;
        e.end = m;
    }
    let mut node = IndexVec::from_elem_n(Idx::new(0), m.try_into().unwrap());
    for (i, &idom) in immediate_dominators.iter_enumerated() {
        if let Some(idom) = idom {
            edges[idom].start -= 1;
            node[edges[idom].start] = i;
        }
    }

    // Perform a depth-first search of the dominator tree. Record the number of vertices discovered
    // when vertex v is discovered first as time[v].start, and when its processing is finished as
    // time[v].finish.
    let mut time: IndexVec<N, Time> = IndexVec::from_elem(Time::default(), immediate_dominators);
    let mut stack = Vec::new();

    let mut discovered = 1;
    stack.push(start_node);
    time[start_node].start = discovered;

    while let Some(&i) = stack.last() {
        let e = &mut edges[i];
        if e.start == e.end {
            // Finish processing vertex i.
            time[i].finish = discovered;
            stack.pop();
        } else {
            let j = node[e.start];
            e.start += 1;
            // Start processing vertex j.
            discovered += 1;
            time[j].start = discovered;
            stack.push(j);
        }
    }

    Dominators { time }
}
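The interval check used by `dominates` can be tried outside the compiler with a small standalone sketch. It is not the code above: plain `usize` node ids and `Vec` stand in for `Idx` and `IndexVec`, the children are bucketed into a `Vec<Vec<usize>>` rather than the flat counting-sort layout, and the names `interval_times` and the free function `dominates` are hypothetical. The numbering scheme is the same: `start` is the count of vertices discovered when a vertex is entered, `finish` is the count when it is left, and `a` dominates `b` exactly when `a`'s interval contains `b`'s.

```rust
// Illustrative sketch only; `interval_times` and this free `dominates` are
// hypothetical names, and `usize`/`Vec` replace the `Idx`/`IndexVec` types.
#[derive(Copy, Clone, Default, Debug, PartialEq)]
struct Time {
    start: u32,
    finish: u32,
}

/// Numbers the tree described by `idom`, where `idom[v]` is the parent of `v`
/// (`None` for the root and for unreachable nodes).
fn interval_times(root: usize, idom: &[Option<usize>]) -> Vec<Time> {
    // Transpose parent links into per-node child lists.
    let mut children: Vec<Vec<usize>> = vec![Vec::new(); idom.len()];
    for (v, parent) in idom.iter().enumerate() {
        if let Some(p) = *parent {
            children[p].push(v);
        }
    }

    let mut time = vec![Time::default(); idom.len()];
    // Index of the next unvisited child for each node.
    let mut next_child = vec![0usize; idom.len()];
    let mut discovered = 1;
    time[root].start = discovered;
    let mut stack = vec![root];
    while let Some(&v) = stack.last() {
        if next_child[v] == children[v].len() {
            // All children visited: finish processing `v`.
            time[v].finish = discovered;
            stack.pop();
        } else {
            // Start processing the next child of `v`.
            let c = children[v][next_child[v]];
            next_child[v] += 1;
            discovered += 1;
            time[c].start = discovered;
            stack.push(c);
        }
    }
    time
}

/// Returns true if `a` dominates `b`; panics if `b` is unreachable.
fn dominates(time: &[Time], a: usize, b: usize) -> bool {
    assert!(time[b].start != 0, "node {b} is not reachable");
    time[a].start <= time[b].start && time[b].finish <= time[a].finish
}

fn main() {
    // Tree: 0 -> {1, 2}, 1 -> {3}; node 4 is unreachable.
    let idom = [None, Some(0), Some(0), Some(1), None];
    let time = interval_times(0, &idom);
    assert!(dominates(&time, 0, 3));
    assert!(dominates(&time, 1, 3));
    assert!(!dominates(&time, 2, 3));
    assert!(dominates(&time, 3, 3));
}
```

Because unreachable nodes keep the default `Time { start: 0, finish: 0 }` and the root starts at 1, a zero `start` is enough to detect an unreachable `b`, which is what the assertion relies on.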