This commit does the following.
- Changes it from `Lrc<Box<dyn ToAttrTokenStream>>` to
`Lrc<LazyAttrTokenStreamInner>`.
- Reworks `LazyAttrTokenStreamImpl` as `LazyAttrTokenStreamInner`, which
is a two-variant enum.
- Removes the `ToAttrTokenStream` trait and the two impls of it.
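As a rough sketch of the shape change, using toy stand-ins rather than the
actual rustc definitions (the variant names and fields here are illustrative
only):
```
use std::sync::Arc; // standing in for `Lrc`

// Toy stand-in for the evaluated attribute token stream.
#[derive(Clone)]
struct AttrTokenStream(Vec<String>);

// Before: `Lrc<Box<dyn ToAttrTokenStream>>`, i.e. a trait object with two
// impls. After: a concrete two-variant enum, so the trait and the extra
// indirection are no longer needed.
enum LazyAttrTokenStreamInner {
    // Roughly corresponds to the old "already have a stream" impl...
    Direct(AttrTokenStream),
    // ...and to the old "build the stream on demand from captured state" impl.
    Pending { tokens: Vec<String> },
}

struct LazyAttrTokenStream(Arc<LazyAttrTokenStreamInner>);

impl LazyAttrTokenStream {
    fn to_attr_token_stream(&self) -> AttrTokenStream {
        match &*self.0 {
            LazyAttrTokenStreamInner::Direct(stream) => stream.clone(),
            LazyAttrTokenStreamInner::Pending { tokens } => AttrTokenStream(tokens.clone()),
        }
    }
}

fn main() {
    let lazy = LazyAttrTokenStream(Arc::new(LazyAttrTokenStreamInner::Pending {
        tokens: vec!["0".to_string()],
    }));
    assert_eq!(lazy.to_attr_token_stream().0, vec!["0".to_string()]);
}
```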
The recursion limit must be increased in some crates; otherwise rustdoc
aborts.
Specifically: `TokenCursor`, `TokenTreeCursor`,
`LazyAttrTokenStreamImpl`, `FlatToken`, `make_attr_token_stream`,
`ParserRange`, `NodeRange`, `ParserReplacement`, and `NodeReplacement`.
These are all related to token streams, rather than actual parsing.
This will facilitate the simplifications in the next commit.
By replacing them with `{Open,Close}{Paren,Brace,Bracket,Invisible}`.
PR #137902 made `ast::TokenKind` more like `lexer::TokenKind` by
replacing the compound `BinOp{,Eq}(BinOpToken)` variants with fieldless
variants `Plus`, `Minus`, `Star`, etc. This commit does a similar thing
with delimiters. It also makes `ast::TokenKind` more similar to
`parser::TokenType`.
This requires a few new methods:
- `TokenKind::is_{,open_,close_}delim()` replace various kinds of
pattern matches.
- `Delimiter::as_{open,close}_token_kind` are used to convert
`Delimiter` values to `TokenKind`.
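A toy sketch of the idea, with simplified stand-ins rather than the real
`rustc_ast` types (the real `Delimiter::Invisible`, for instance, carries
extra data):
```
// Simplified stand-ins for the real types.
#[derive(Clone, Copy, PartialEq, Eq)]
enum Delimiter { Parenthesis, Brace, Bracket, Invisible }

#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum TokenKind {
    // Fieldless variants replace `OpenDelim(Delimiter)` / `CloseDelim(Delimiter)`.
    OpenParen, CloseParen,
    OpenBrace, CloseBrace,
    OpenBracket, CloseBracket,
    OpenInvisible, CloseInvisible,
}

impl TokenKind {
    // Replaces pattern matches on `OpenDelim(_)`.
    fn is_open_delim(self) -> bool {
        matches!(
            self,
            TokenKind::OpenParen
                | TokenKind::OpenBrace
                | TokenKind::OpenBracket
                | TokenKind::OpenInvisible
        )
    }
}

impl Delimiter {
    // Converts a `Delimiter` to the corresponding opening `TokenKind`.
    fn as_open_token_kind(self) -> TokenKind {
        match self {
            Delimiter::Parenthesis => TokenKind::OpenParen,
            Delimiter::Brace => TokenKind::OpenBrace,
            Delimiter::Bracket => TokenKind::OpenBracket,
            Delimiter::Invisible => TokenKind::OpenInvisible,
        }
    }
}

fn main() {
    assert_eq!(Delimiter::Brace.as_open_token_kind(), TokenKind::OpenBrace);
    assert!(TokenKind::OpenParen.is_open_delim());
}
```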
Despite these additions, it's a net reduction in lines of code. This is
because e.g. `token::OpenParen` is so much shorter than
`token::OpenDelim(Delimiter::Parenthesis)` that many multi-line forms
reduce to single line forms. And many places where the number of lines
doesn't change are still easier to read, just because the names are
shorter, e.g.:
```
- } else if self.token != token::CloseDelim(Delimiter::Brace) {
+ } else if self.token != token::CloseBrace {
```
Fix `break_last_token`.
It currently doesn't handle the three-char tokens `>>=` and `<<=` correctly. These can be broken twice, resulting in three individual tokens. This is a latent bug that currently doesn't cause any problems, but does cause problems for #124141, because that PR increases the usage of lazy token streams.
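A toy illustration of the double break, operating on plain strings rather than
the real token types (the actual `break_last_token` works on `Token` values,
so this is only a sketch of the shape of the problem):
```
// Break the leading character off a compound operator token, as the parser
// conceptually does when only the first `>` or `<` of the token is wanted.
fn break_first_char(tok: &str) -> Option<(&str, &str)> {
    if tok.chars().count() >= 2 {
        Some((&tok[..1], &tok[1..]))
    } else {
        None
    }
}

fn main() {
    // `>>=` can be broken twice, yielding three individual tokens.
    let (first, rest) = break_first_char(">>=").unwrap();  // `>` and `>=`
    let (second, third) = break_first_char(rest).unwrap(); // `>` and `=`
    assert_eq!((first, second, third), (">", ">", "="));
}
```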
r? `@petrochenkov`
By keeping track of attributes that have been previously processed.
This fixes the `macro-rules-derive-cfg.stdout` test, and is necessary
for #124141 which removes nonterminals.
Also shrink the `SmallVec` inline size used in `IntervalSet`. 2 gives
slightly better perf than 4 now that there's an `IntervalSet` in
`Parser`, which is cloned reasonably often.
In a case like this:
```
mod a {
    mod b {
        #[cfg_attr(unix, inline)]
        fn f() {
            #[cfg_attr(linux, inline)]
            fn g1() {}
            #[cfg_attr(linux, inline)]
            fn g2() {}
        }
    }
}
```
We currently end up with the following replacement ranges.
- The lazy tokens for `f` have replacement ranges for `g1` and `g2`.
- The lazy tokens for `a` have replacement ranges for `f`, `g1`, and
`g2`.
I.e. the replacement ranges for `g1` and `g2` are duplicated. In
general, replacement ranges for inner AST nodes are duplicated up the
chain for each nested `collect_tokens` call. And the code that processes
the replacements is careful about the ordering in which the replacements
are applied, to ensure that inner replacements are applied before outer
replacements.
But all of this is unnecessary. If you apply an inner replacement and
then an outer replacement, the outer replacement completely overwrites
the inner replacement.
This commit avoids the duplication by removing replacements from
`self.capture_state.parser_replacements` when they are used. (The effect
on the example above is that the lazy tokens for `a` no longer include
replacement ranges for `g1` and `g2`.) This eliminates the possibility
of nested replacements on individual AST nodes, which avoids the need
for careful ordering of replacements.
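A toy sketch of the deduplication idea (the real code works on the parser's
capture state rather than a plain `Vec` like this, but the drain-on-use
principle is the same):
```
use std::ops::Range;

// Take every replacement that lies within `outer` out of the shared list.
// Because used replacements are removed rather than copied, they can never be
// duplicated up the chain of nested token-collection calls.
fn take_replacements_within(
    all: &mut Vec<(Range<u32>, &'static str)>,
    outer: Range<u32>,
) -> Vec<(Range<u32>, &'static str)> {
    let mut taken = Vec::new();
    all.retain(|(r, name)| {
        if outer.start <= r.start && r.end <= outer.end {
            taken.push((r.clone(), *name));
            false // drop it from the shared list once used
        } else {
            true
        }
    });
    taken
}

fn main() {
    // Replacements recorded for the inner items `g1` and `g2`.
    let mut replacements = vec![(4..6, "g1"), (7..9, "g2")];
    // Collecting tokens for `f` takes them out of the shared list, so the
    // later collection for `a` no longer sees them.
    let for_f = take_replacements_within(&mut replacements, 1..10);
    assert_eq!(for_f.len(), 2);
    assert!(replacements.is_empty());
}
```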
This example triggers an assertion failure:
```
fn f() -> u32 {
    #[cfg_eval] #[cfg(not(FALSE))] 0
}
```
The sequence of events:
- `configure_annotatable` calls `parse_expr_force_collect`, which calls
`collect_tokens`.
- Within that, we end up in `parse_expr_dot_or_call`, which again calls
`collect_tokens`.
- The return value of the `f` call is the expression `0`.
- This inner call collects tokens for `0` (parser range 10..11) and
creates a replacement covering `#[cfg(not(FALSE))] 0` (parser range
0..11).
- We return to the outer `collect_tokens` call. The return value of the
`f` call is *again* the expression `0`, again with the range 10..11,
but the replacement from earlier covers the range 0..11. The code
mistakenly assumes that any attributes from an inner `collect_tokens`
call fit entirely within the body of the result of an outer
`collect_tokens` call. So it adjusts the replacement parser range
0..11 to a node range by subtracting 10, resulting in -10..1. This is
an invalid range and triggers an assertion failure.
It's tricky to follow, but basically things get complicated when an AST
node is returned from an inner `collect_tokens` call and then returned
again from an outer `collect_tokens` call without being wrapped in any
kind of additional layer.
This commit changes `collect_tokens` to return early in some extra cases,
avoiding the construction of lazy tokens. In the example above, the
outer `collect_tokens` returns earlier because the `0` expression already has
tokens and `self.capture_state.capturing` is `Capturing::No`. This early
return avoids the creation of the invalid range and the assertion
failure.
Fixes #129166. Note: these invalid ranges have been happening for a long
time. #128725 looks like it's at fault only because it introduced the
assertion that catches the invalid ranges.
This commit does the following.
- Renames `collect_tokens_trailing_token` as `collect_tokens`, because
(a) it's annoyingly long, and (b) the `_trailing_token` bit is less
accurate now that its types have changed.
- In `collect_tokens`, adds a `Option<CollectPos>` argument and a
`UsePreAttrPos` in the return type of `f`. These are used in
`parse_expr_force_collect` (for vanilla expressions) and in
`parse_stmt_without_recovery` (for two different cases of expression
statements). Together these are enough to fix all the problems
with token collection and assoc expressions. The changes to the
`stringify.rs` test demonstrate some of these.
- Adds a new test. The code in this test was causing an assertion
failure prior to this commit, due to an invalid `NodeRange`.
The extra complexity is annoying, but necessary to fix the existing
problems.
This pre-existing type is suitable for use with the return value of the
`f` parameter in `collect_tokens_trailing_token`. The more descriptive
name will be useful because the next commit will add another boolean
value to the return value of `f`.
When collecting tokens there are two kinds of range:
- a range relative to the parser's full token stream (which we get when
we are parsing);
- a range relative to a single AST node's token stream (which we use
within `LazyAttrTokenStreamImpl` when replacing tokens).
These are currently both represented with `Range<u32>` and it's easy to
mix them up -- until now I hadn't properly understood the difference.
This commit introduces `ParserRange` and `NodeRange` to distinguish
them. This also requires splitting `ReplaceRange` in two, giving the new
types `ParserReplacement` and `NodeReplacement`. (These latter two names
reduce the overloading of the word "range".)
The commit also rewrites some comments to be clearer.
The end result is a little more verbose, but much clearer.
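A minimal sketch of the two newtypes and the conversion between them
(simplified; the real definitions live in the parser and use the real position
types):
```
use std::ops::Range;

// A range relative to the parser's full token stream.
struct ParserRange(Range<u32>);

// A range relative to a single AST node's token stream.
struct NodeRange(Range<u32>);

impl NodeRange {
    // A parser range only makes sense as a node range relative to the
    // position at which that node's tokens begin.
    fn new(ParserRange(pr): ParserRange, start_pos: u32) -> NodeRange {
        NodeRange((pr.start - start_pos)..(pr.end - start_pos))
    }
}

fn main() {
    // Tokens collected at parser positions 10..11, for a node whose tokens
    // start at position 10, sit at node positions 0..1.
    let node_range = NodeRange::new(ParserRange(10..11), 10);
    assert_eq!(node_range.0, 0..1);
}
```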
Remove unnecessary range replacements
This PR removes an unnecessary range replacement in `collect_tokens_trailing_token`, and does a couple of other small cleanups.
r? `@petrochenkov`
A fully imperative style is easier to read than a half-iterator,
half-imperative style. Also, rename `inner_attr` as `attr` because it
might be an outer attribute.
Imagine you have replace ranges (2..20,X) and (5..15,Y), and these tokens:
```
a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x
```
If we replace (5..15,Y) first, then (2..20,X) we get this sequence
```
a,b,c,d,e,Y,_,_,_,_,_,_,_,_,_,p,q,r,s,t,u,v,w,x
a,b,X,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,u,v,w,x
```
which is what we want.
If we do it in the other order, we get this:
```
a,b,X,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,u,v,w,x
a,b,X,_,_,Y,_,_,_,_,_,_,_,_,_,_,_,_,_,_,u,v,w,x
```
which is wrong. So it's true that we need the `.rev()` but the comment
is wrong about why.
The current code is this:
```
self.capture_state.replace_ranges.push((start_pos..end_pos, Some(target)));
self.capture_state.replace_ranges.extend(inner_attr_replace_ranges);
```
What's not obvious is that every range in `inner_attr_replace_ranges`
must be a strict sub-range of `start_pos..end_pos`. Which means, in
`LazyAttrTokenStreamImpl::to_attr_token_stream`, they will be done
first, and then the `start_pos..end_pos` replacement will just overwrite
them. So they aren't needed.
This has been bugging me for a while. I find complex "if any of these
are true" conditions easier to think about than complex "if all of these
are true" conditions, because you can stop as soon as one is true.
Adding details, clarifying lots of little things, etc. In particular,
the commit adds details of an example. I find this very helpful, because
it's taken me a long time to understand how this code works.
The `Option`s within the `ReplaceRange`s within the hashmap are always
`None`. This PR omits them and inserts them when they are extracted from
the hashmap.
It's used in `Parser::collect_tokens_trailing_token` to decide whether
to capture a trailing token. But the callers actually know whether to
capture a trailing token, so it's simpler for them to just pass in a
bool.
Also, the `TrailingToken::Gt` case was weird, because it didn't result
in a trailing token being captured. It could have been subsumed by the
`TrailingToken::MaybeComma` case, and it effectively is in the new code.
The new condition is equivalent in practice, but it's much more obvious
that it would result in an empty range, because the condition lines up
with the contents of the iterator.
There's a comment saying we don't do it for performance reasons, but it
doesn't actually affect performance.
The commit also tweaks the control flow, to make clearer that two code
paths are mutually exclusive.
Currently the second element is a `Vec<(FlatToken, Spacing)>`. But the
vector always has zero or one elements, and the `FlatToken` is always
`FlatToken::AttrTarget` (which contains an `AttributesData`), and the
spacing is always `Alone`. So we can simplify it to
`Option<AttributesData>`.
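Schematically, with a toy stand-in for `AttributesData` (the real types live
in rustc):
```
use std::ops::Range;

// Toy stand-in for the data attached to an attribute target.
struct AttributesData;

// Before: `(Range<u32>, Vec<(FlatToken, Spacing)>)`, where the vector always
// held zero or one `(FlatToken::AttrTarget(..), Spacing::Alone)` entries.
// After: the constant parts are dropped and the 0-or-1 cardinality is
// expressed directly with `Option`.
type Replacement = (Range<u32>, Option<AttributesData>);

fn main() {
    let _replacement: Replacement = (0..11, Some(AttributesData));
}
```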
An assertion in `to_attr_token_stream` can also be removed, because
`new_tokens.len()` was always 0 or 1, which means that `range.len()`
is always greater than or equal to it, because `range.is_empty()` is
always false (as per the earlier assertion).