Rewrite collect_tokens implementations to use a flattened buffer

Instead of trying to collect tokens at each depth, we 'flatten' the
stream as we go along, pushing open/close delimiters to our buffer
just like regular tokens. Once capturing is complete, we reconstruct a
nested `TokenTree::Delimited` structure, producing a normal
`TokenStream`.
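
Roughly, the idea looks like this (a minimal sketch with stand-in
types; `FlatToken` and the `char` payloads are illustrative, not the
actual `rustc_ast` API):

    // Stand-in for the real token type: delimiters are pushed to the
    // flat buffer just like ordinary tokens.
    enum FlatToken {
        Token(char),
        OpenDelim(char),
        CloseDelim(char),
    }

    #[derive(Debug)]
    enum TokenTree {
        Token(char),
        Delimited(char, Vec<TokenTree>),
    }

    // After capturing, rebuild the nested structure in a single pass
    // using a stack of partially-built groups, instead of collecting
    // tokens at each depth.
    fn make_token_stream(flat: Vec<FlatToken>) -> Vec<TokenTree> {
        let mut stack: Vec<(char, Vec<TokenTree>)> = Vec::new();
        let mut current = Vec::new();
        for tok in flat {
            match tok {
                FlatToken::Token(c) => current.push(TokenTree::Token(c)),
                FlatToken::OpenDelim(d) => {
                    // Open a group: stash the tokens collected so far.
                    stack.push((d, std::mem::take(&mut current)));
                }
                FlatToken::CloseDelim(_) => {
                    // Close the innermost group and append it to its parent.
                    let (d, parent) = stack.pop().expect("unbalanced delimiters");
                    let inner = std::mem::replace(&mut current, parent);
                    current.push(TokenTree::Delimited(d, inner));
                }
            }
        }
        assert!(stack.is_empty(), "unbalanced delimiters");
        current
    }

    fn main() {
        use FlatToken::*;
        let flat = vec![Token('a'), OpenDelim('('), Token('b'), CloseDelim(')')];
        // Prints: [Token('a'), Delimited('(', [Token('b')])]
        println!("{:?}", make_token_stream(flat));
    }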

The reconstructed `TokenStream` is not created immediately - instead, it is
produced on-demand by a closure (wrapped in a new `LazyTokenStream` type). This
closure stores a clone of the original `TokenCursor`, plus a record of the
number of calls to `next()/next_desugared()`. This is sufficient to reconstruct
the tokenstream seen by the callback without storing any additional state. If
the tokenstream is never used (e.g. when a captured `macro_rules!` argument is
never passed to a proc macro), we never actually create a `TokenStream`.
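
In spirit, the wrapper amounts to something like the following
(illustrative names, with a plain closure standing in for the cloned
cursor; `create_token_stream` is a stand-in, not necessarily the exact
method name in this commit):

    use std::sync::Arc;

    type TokenStream = Vec<String>; // stand-in for the real TokenStream

    // The real type captures a clone of the `TokenCursor` plus the number
    // of `next()`/`next_desugared()` calls; here we just capture a closure.
    #[derive(Clone)]
    struct LazyTokenStream(Arc<dyn Fn() -> TokenStream + Send + Sync>);

    impl LazyTokenStream {
        fn new(f: impl Fn() -> TokenStream + Send + Sync + 'static) -> Self {
            LazyTokenStream(Arc::new(f))
        }

        // Replays the recorded tokens; only ever called if someone actually
        // needs the stream (e.g. a captured fragment reaches a proc macro).
        fn create_token_stream(&self) -> TokenStream {
            (self.0)()
        }
    }

    fn main() {
        let lazy = LazyTokenStream::new(|| vec!["fn".into(), "f".into()]);
        // No TokenStream exists until this call:
        println!("{:?}", lazy.create_token_stream());
    }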

This implementation has a number of advantages over the previous one:

* It is significantly simpler, with no edge cases around capturing the
  start/end of a delimited group.

* It can be easily extended to allow replacing tokens at an arbitrary
  'depth' by just using `Vec::splice` at the proper position (see the
  sketch after this list). This is important for PR #76130, which
  requires us to track information about attributes along with tokens.

* The lazy approach to `TokenStream` construction allows us to easily
  parse an AST struct, and then decide after the fact whether we need a
  `TokenStream`. This will be useful when we start collecting tokens for
  `Attribute` - we can discard the `LazyTokenStream` if the parsed
  attribute doesn't need tokens (e.g. is a builtin attribute).
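
For the second point, a minimal sketch of what splice-based replacement
looks like on a flat buffer (the token strings and the index range are
made up for illustration):

    fn main() {
        // Flattened buffer: delimiters sit inline with ordinary tokens, so
        // a replacement at any 'depth' is a plain splice over an index range.
        let mut flat = vec!["#", "[", "cfg_attr", "(", "test", ")", "]", "struct", "S"];
        let replacement = ["#", "[", "derive", "(", "Debug", ")", "]"];
        // Replace the first 7 tokens (the attribute) with its expansion.
        let _removed: Vec<_> = flat.splice(0..7, replacement).collect();
        assert_eq!(flat, ["#", "[", "derive", "(", "Debug", ")", "]", "struct", "S"]);
    }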

The performance impact seems to be negligible (see
https://github.com/rust-lang/rust/pull/77250#issuecomment-703960604). There is a
small slowdown on a few benchmarks, but it only rises above 1% for incremental
builds, where it represents a larger fraction of the much smaller instruction
count. There is a ~1% speedup on a few other incremental benchmarks - my guess
is that the speedups and slowdowns will usually cancel out in practice.
Commit: 593fdd3d45 (parent: cb2462c53f)
Author: Aaron Hill
Date: 2020-09-26 21:56:29 -04:00
7 changed files with 252 additions and 165 deletions

@@ -24,7 +24,7 @@ pub use UnsafeSource::*;
 
 use crate::ptr::P;
 use crate::token::{self, CommentKind, DelimToken};
-use crate::tokenstream::{DelimSpan, TokenStream, TokenTree};
+use crate::tokenstream::{DelimSpan, LazyTokenStream, TokenStream, TokenTree};
 
 use rustc_data_structures::stable_hasher::{HashStable, StableHasher};
 use rustc_data_structures::stack::ensure_sufficient_stack;
@@ -97,7 +97,7 @@ pub struct Path {
     /// The segments in the path: the things separated by `::`.
     /// Global paths begin with `kw::PathRoot`.
     pub segments: Vec<PathSegment>,
-    pub tokens: Option<TokenStream>,
+    pub tokens: Option<LazyTokenStream>,
 }
 
 impl PartialEq<Symbol> for Path {
@@ -535,7 +535,7 @@ pub struct Block {
     /// Distinguishes between `unsafe { ... }` and `{ ... }`.
     pub rules: BlockCheckMode,
     pub span: Span,
-    pub tokens: Option<TokenStream>,
+    pub tokens: Option<LazyTokenStream>,
 }
 
 /// A match pattern.
@@ -546,7 +546,7 @@ pub struct Pat {
     pub id: NodeId,
     pub kind: PatKind,
     pub span: Span,
-    pub tokens: Option<TokenStream>,
+    pub tokens: Option<LazyTokenStream>,
 }
 
 impl Pat {
@@ -892,7 +892,7 @@ pub struct Stmt {
     pub id: NodeId,
     pub kind: StmtKind,
     pub span: Span,
-    pub tokens: Option<TokenStream>,
+    pub tokens: Option<LazyTokenStream>,
 }
 
 impl Stmt {
@@ -1040,7 +1040,7 @@ pub struct Expr {
     pub kind: ExprKind,
     pub span: Span,
     pub attrs: AttrVec,
-    pub tokens: Option<TokenStream>,
+    pub tokens: Option<LazyTokenStream>,
 }
 
 // `Expr` is used a lot. Make sure it doesn't unintentionally get bigger.
@@ -1835,7 +1835,7 @@ pub struct Ty {
     pub id: NodeId,
     pub kind: TyKind,
     pub span: Span,
-    pub tokens: Option<TokenStream>,
+    pub tokens: Option<LazyTokenStream>,
 }
 
 impl Clone for Ty {
@@ -2408,7 +2408,7 @@ impl<D: Decoder> rustc_serialize::Decodable<D> for AttrId {
 pub struct AttrItem {
     pub path: Path,
     pub args: MacArgs,
-    pub tokens: Option<TokenStream>,
+    pub tokens: Option<LazyTokenStream>,
 }
 
 /// A list of attributes.
@@ -2482,7 +2482,7 @@ pub enum CrateSugar {
 pub struct Visibility {
     pub kind: VisibilityKind,
     pub span: Span,
-    pub tokens: Option<TokenStream>,
+    pub tokens: Option<LazyTokenStream>,
 }
 
 #[derive(Clone, Encodable, Decodable, Debug)]
@@ -2569,7 +2569,7 @@ pub struct Item<K = ItemKind> {
     ///
     /// Note that the tokens here do not include the outer attributes, but will
     /// include inner attributes.
-    pub tokens: Option<TokenStream>,
+    pub tokens: Option<LazyTokenStream>,
 }
 
 impl Item {
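
Putting it together, each AST node now stores an optional thunk rather
than an eagerly-built stream; a rough usage sketch, reusing the
illustrative `LazyTokenStream` shape from above (not the real API):

    use std::sync::Arc;

    type TokenStream = Vec<String>;

    #[derive(Clone)]
    struct LazyTokenStream(Arc<dyn Fn() -> TokenStream + Send + Sync>);

    impl LazyTokenStream {
        fn create_token_stream(&self) -> TokenStream {
            (self.0)()
        }
    }

    struct Item {
        // As in the diff above: the stream is only materialized on demand.
        tokens: Option<LazyTokenStream>,
    }

    // If the item never reaches a proc macro, the closure is never invoked
    // and no TokenStream is ever built.
    fn tokens_for_proc_macro(item: &Item) -> Option<TokenStream> {
        item.tokens.as_ref().map(|lazy| lazy.create_token_stream())
    }

    fn main() {
        let item = Item {
            tokens: Some(LazyTokenStream(Arc::new(|| {
                vec!["struct".into(), "S".into(), ";".into()]
            }))),
        };
        println!("{:?}", tokens_for_proc_macro(&item));
    }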