Fix incorrect eq_unspanned in TokenStream
Fixesrust-lang/rust#141522
r? ``@workingjubilee``
should we remove this function?
since it's used in several places, i'd prefer to keep it.
Remove `Path::is_ident`.
It checks that a path has a single segment that matches the given symbol, and that there are zero generic arguments. It has a single use.
We also have `impl PartialEq<Symbol> for Path` which does exactly the same thing *except* it doesn't check for zero generic arguments, which seems like an oversight. It has numerous uses.
This commit removes `Path::is_ident`, adds a test for zero generic arguments to `PartialEq<Symbol> for Path`, and changes the single use of `is_ident` to instead use `==`.
r? `@wesleywiser`
This adds an `iter!` macro that can be used to create movable
generators.
This also adds a yield_expr feature so the `yield` keyword can be used
within iter! macro bodies. This was needed because several unstable
features each need `yield` expressions, so this allows us to stabilize
them separately from any individual feature.
Co-authored-by: Oli Scherer <github35764891676564198441@oli-obk.de>
Co-authored-by: Jieyou Xu <jieyouxu@outlook.com>
Co-authored-by: Travis Cross <tc@traviscross.com>
It checks that a path has a single segment that matches the given
symbol, and that there are zero generic arguments. It has a single use.
We also have `impl PartialEq<Symbol> for Path` which does exactly the
same thing *except* it doesn't check for zero generic arguments, which
seems like an oversight. It has numerous uses.
This commit removes `Path::is_ident`, adds a test for zero generic
arguments to `PartialEq<Symbol> for Path`, and changes the single use of
`is_ident` to instead use `==`.
Reorder `ast::ItemKind::{Struct,Enum,Union}` fields.
So they match the order of the parts in the source code, e.g.:
```
struct Foo<T, U> { t: T, u: U }
<-><----> <------------>
/ | \
ident generics variant_data
```
r? `@fee1-dead`
Fix ICE in tokenstream with contracts from parser recovery
Fixesrust-lang/rust#140683
After two times of parsing error, the `recover_stmt_` constructs an error ast, then when we expand macors, the invalid tokenstream triggered ICE because of mismatched delims.
Expected `{` and get other tokens is an obvious error message, too much effort on recovery may introduce noise.
r? ```@nnethercote```
So they match the order of the parts in the source code, e.g.:
```
struct Foo<T, U> { t: T, u: U }
<-><----> <------------>
/ | \
ident generics variant_data
```
All uses have been removed. And it's nonsensical: an identifier by
definition has at least one char.
The commits adds an is-non-empty assertion to `Ident::new` to enforce
this, and converts some `Ident` constructions to use `Ident::new`.
Adding the assertion requires making `Ident::new` and
`Ident::with_dummy_span` non-const, which is no great loss.
The commit amends a couple of places that do path splitting to ensure no
empty identifiers are created.
This commit does the following.
- Changes it from `Lrc<Box<dyn ToAttrTokenStream>>` to
`Lrc<LazyAttrTokenStreamInner>`.
- Reworks `LazyAttrTokenStreamImpl` as `LazyAttrTokenStreamInner`, which
is a two-variant enum.
- Removes the `ToAttrTokenStream` trait and the two impls of it.
The recursion limit must be increased in some crates otherwise rustdoc
aborts.
Specifically: `TokenCursor`, `TokenTreeCursor`,
`LazyAttrTokenStreamImpl`, `FlatToken`, `make_attr_token_stream`,
`ParserRange`, `NodeRange`. `ParserReplacement`, and `NodeReplacement`.
These are all related to token streams, rather than actual parsing.
This will facilitate the simplifications in the next commit.
Clean: rename `open_braces` to `open_delimiters` in lexer and move `make_unclosed_delims_error` into `diagnostics.rs`.
Clean code prepared for resolving #138401. To avoid having too many extraneous changes in one PR, I cleaned up some of the naming and method placement in lexer in this PR.
1. For the make_unclosed_delims_error function defined in mod.rs is only used in lexer, so moved into lexer, which enhances encapsulation.
2. For open_braces in TokenTreeDiagInfo the naming is not canonical, as Brace refers to `{...} ` and this variable can store all kinds of different Delimiters. so I named it open_delimiters.
r? `@chenyukang`
Handle another negated literal in `eat_token_lit`.
Extends the change from #139653, which was on expressions, to literals.
Fixes#140098.
r? ``@petrochenkov``
Remove `token::{Open,Close}Delim`
By replacing them with `{Open,Close}{Param,Brace,Bracket,Invisible}`.
PR #137902 made `ast::TokenKind` more like `lexer::TokenKind` by
replacing the compound `BinOp{,Eq}(BinOpToken)` variants with fieldless
variants `Plus`, `Minus`, `Star`, etc. This commit does a similar thing
with delimiters. It also makes `ast::TokenKind` more similar to
`parser::TokenType`.
This requires a few new methods:
- `TokenKind::is_{,open_,close_}delim()` replace various kinds of
pattern matches.
- `Delimiter::as_{open,close}_token_kind` are used to convert
`Delimiter` values to `TokenKind`.
Despite these additions, it's a net reduction in lines of code. This is
because e.g. `token::OpenParen` is so much shorter than
`token::OpenDelim(Delimiter::Parenthesis)` that many multi-line forms
reduce to single line forms. And many places where the number of lines
doesn't change are still easier to read, just because the names are
shorter, e.g.:
```
- } else if self.token != token::CloseDelim(Delimiter::Brace) {
+ } else if self.token != token::CloseBrace {
```
r? `@petrochenkov`
By replacing them with `{Open,Close}{Param,Brace,Bracket,Invisible}`.
PR #137902 made `ast::TokenKind` more like `lexer::TokenKind` by
replacing the compound `BinOp{,Eq}(BinOpToken)` variants with fieldless
variants `Plus`, `Minus`, `Star`, etc. This commit does a similar thing
with delimiters. It also makes `ast::TokenKind` more similar to
`parser::TokenType`.
This requires a few new methods:
- `TokenKind::is_{,open_,close_}delim()` replace various kinds of
pattern matches.
- `Delimiter::as_{open,close}_token_kind` are used to convert
`Delimiter` values to `TokenKind`.
Despite these additions, it's a net reduction in lines of code. This is
because e.g. `token::OpenParen` is so much shorter than
`token::OpenDelim(Delimiter::Parenthesis)` that many multi-line forms
reduce to single line forms. And many places where the number of lines
doesn't change are still easier to read, just because the names are
shorter, e.g.:
```
- } else if self.token != token::CloseDelim(Delimiter::Brace) {
+ } else if self.token != token::CloseBrace {
```
Rollup of 9 pull requests
Successful merges:
- #135340 (Add `explicit_extern_abis` Feature and Enforce Explicit ABIs)
- #139440 (rustc_target: RISC-V: feature addition batch 2)
- #139667 (cfi: Remove #[no_sanitize(cfi)] for extern weak functions)
- #139828 (Don't require rigid alias's trait to hold)
- #139854 (Improve parse errors for stray lifetimes in type position)
- #139889 (Clean UI tests 3 of n)
- #139894 (Fix `opt-dist` CLI flag and make it work without LLD)
- #139900 (stepping into impls for normalization is unproductive)
- #139915 (replace some #[rustc_intrinsic] usage with use of the libcore declarations)
r? `@ghost`
`@rustbot` modify labels: rollup
Improve parse errors for stray lifetimes in type position
While technically & syntactically speaking lifetimes do begin[^1] types in type contexts (this essentially excludes generic argument lists) and require a following `+` to form a complete type (`'a +` denotes a bare trait object type), the likelihood that a user meant to write a lifetime-prefixed bare trait object type in *modern* editions (Rust ≥2021) when placing a lifetime into a type context is incredibly low (they would need to add at least three tokens to turn it into a *semantically* well-formed TOT: `'a` → `dyn 'a + Trait`).
Therefore let's *lie* in modern editions (just like in PR https://github.com/rust-lang/rust/pull/131239, a precedent if you will) by stating "*expected type, found lifetime*" in such cases which is a lot more a approachable, digestible and friendly compared to "*lifetime in trait object type must be followed by `+`*" (as added in PR https://github.com/rust-lang/rust/pull/69760).
I've also added recovery for "ampersand-less" reference types (e.g., `'a ()`, `'a mut Ty`) in modern editions because it was trivial to do and I think it's not unlikely to occur in practice.
Fixes#133413.
[^1]: For example, in the context of decl macros, this implies that a lone `'a` always matches syntax fragment `ty` ("even if" there's a later macro matcher expecting syntax fragment `lifetime`). Rephrased, lifetimes (in type contexts) *commit* to the type parser.
Namely, use a more sensical primary span.
Don't pretty-print AST nodes for the diagnostic message. Why:
* It's lossy (e.g., it doesn't replicate trailing `+`s in trait objects.
* It's prone to leak error nodes (printed as `(/*ERROR*/)`) since
the LHS can easily represent recovered code (e.g., `fn(i32?) + T`).
Detect and provide suggestion for `&raw EXPR`
When emitting an error in the parser, and we detect that the previous token was `raw` and we *could* have consumed `const`/`mut`, suggest that this may have been a mistyped raw ref expr. To do this, we add `const`/`mut` to the expected token set when parsing `&raw` as an expression (which does not affect the "good path" of parsing, for the record).
This is kind of a rudimentary error improvement, since it doesn't actually attempt to recover anything, leading to some other knock-on errors b/c we still treat `&raw` as the expression that was parsed... but at least we add the suggestion! I don't think the parser grammar means we can faithfully recover `&raw EXPR` early, i.e. during `parse_expr_borrow`.
Fixes#133231
`resolve_ident_in_lexical_scope` checks for an empty name. Why is this
necessary? Because `parse_item_impl` can produce an `impl` block with an
empty trait name in some cases. This is pretty gross and very
non-obvious.
This commit avoids the use of the empty trait name. In one case the
trait name is instead pulled from `TyKind::ImplTrait`, which prevents
the output for `tests/ui/impl-trait/extra-impl-in-trait-impl.rs` from
changing. In the other case we just fail the parse and don't try to
recover. I think losing error recovery in this obscure case is worth
the code cleanup.
This change affects `tests/ui/parser/impl-parsing.rs`, which is split in
two, and the obsolete `..` syntax cases are removed (they are tested
elsewhere).
Fixes#139445.
The additional errors aren't great but the first one is still good and
it's the most important, and imperfect errors are better than ICEing.
This can happen when invalid syntax is passed to a declarative macro. We
shouldn't be too strict about the token stream position once the parser
has rejected the invalid syntax.
Fixes#139248.
Implement `super let`
Tracking issue: https://github.com/rust-lang/rust/issues/139076
This implements `super let` as proposed in #139080, based on the following two equivalence rules.
1. For all expressions `$expr` in any context, these are equivalent:
- `& $expr`
- `{ super let a = & $expr; a }`
2. And, additionally, these are equivalent in any context when `$expr` is a temporary (aka rvalue):
- `& $expr`
- `{ super let a = $expr; & a }`
So far, this experiment has a few interesting results:
## Interesting result 1
In this snippet:
```rust
super let a = f(&temp());
```
I originally expected temporary `temp()` would be dropped at the end of the statement (`;`), just like in a regular `let`, because `temp()` is not subject to temporary lifetime extension.
However, it turns out that that would break the fundamental equivalence rules.
For example, in
```rust
g(&f(&temp()));
```
the temporary `temp()` will be dropped at the `;`.
The first equivalence rule tells us this must be equivalent:
```rust
g({ super let a = &f(&temp()); a });
```
But that means that `temp()` must live until the last `;` (after `g()`), not just the first `;` (after `f()`).
While this was somewhat surprising to me at first, it does match the exact behavior we need for `pin!()`: The following _should work_. (See also https://github.com/rust-lang/rust/issues/138718)
```rust
g(pin!(f(&mut temp())));
```
Here, `temp()` lives until the end of the statement. This makes sense from the perspective of the user, as no other `;` or `{}` are visible. Whether `pin!()` uses a `{}` block internally or not should be irrelevant.
This means that _nothing_ in a `super let` statement will be dropped at the end of that super let statement. It does not even need its own scope.
This raises questions that are useful for later on:
- Will this make temporaries live _too long_ in cases where `super let` is used not in a hidden block in a macro, but as a visible statement in code like the following?
```rust
let writer = {
super let file = File::create(&format!("/home/{user}/test"));
Writer::new(&file)
};
```
- Is a `let` statement in a block still the right syntax for this? Considering it has _no_ scope of its own, maybe neither a block nor a statement should be involved
This leads me to think that instead of `{ super let $pat = $init; $expr }`, we might want to consider something like `let $pat = $init in $expr` or `$expr where $pat = $init`. Although there are also issues with these, as it isn't obvious anymore if `$init` should be subject to temporary lifetime extension. (Do we want both `let _ = _ in ..` and `super let _ = _ in ..`?)
## Interesting result 2
What about `super let x;` without initializer?
```rust
let a = {
super let x;
x = temp();
&x
};
```
This works fine with the implementation in this PR: `x` is extended to live as long as `a`.
While it matches my expectations, a somewhat interesting thing to realize is that these are _not_ equivalent:
- `super let x = $expr;`
- `super let x; x = $expr;`
In the first case, all temporaries in $expr will live at least as long as (the result of) the surrounding block.
In the second case, temporaries will be dropped at the end of the assignment statement. (Because the assignment statement itself "is not `super`".)
This difference in behavior might be confusing, but it _might_ be useful.
One might want to extend the lifetime of a variable without extending all the temporaries in the initializer expression.
On the other hand, that can also be expressed as:
- `let x = $expr; super let x = x;` (w/o temporary lifetime extension), or
- `super let x = { $expr };` (w/ temporary lifetime extension)
So, this raises these questions:
- Do we want to accept `super let x;` without initializer at all?
- Does it make sense for statements other than let statements to be "super"? An expression statement also drops temporaries at its `;`, so now that we discovered that `super let` basically disables that `;` (see interesting result 1), is there a use to having other statements without their own scope? (I don't think that's ever useful?)
## Interesting result 3
This works now:
```rust
super let Some(x) = a.get(i) else { return };
```
I didn't put in any special cases for `super let else`. This is just the behavior that 'naturally' falls out when implementing `super let` without thinking of the `let else` case.
- Should `super let else` work?
## Interesting result 4
This 'works':
```rust
fn main() {
super let a = 123;
}
```
I didn't put in any special cases for `super let` at function scope. I had expected the code to cause an ICE or other weird failure when used at function body scope, because there's no way to let the variable live as long as the result of the function.
This raises the question:
- Does this mean that this behavior is the natural/expected behavior when `super let` is used at function scope? Or is this just a quirk and should we explicitly disallow `super let` in a function body? (Probably the latter.)
---
The questions above do not need an answer to land this PR. These questions should be considered when redesigning/rfc'ing/stabilizing the feature.