Fix parenthesization of subexprs containing statement boundary
This PR fixes a multitude of false negatives and false positives in the AST pretty printer's parenthesis insertion related to statement boundaries — statements which terminate unexpectedly early if there aren't parentheses.
Without this fix, the AST pretty printer (including both `stringify!` and `rustc -Zunpretty=expanded`) is prone to producing output which is not syntactically valid Rust. Invalid output is problematic because it means Rustfmt is unable to parse the output of `cargo expand`, for example, causing friction by forcing someone trying to debug a macro into reading poorly formatted code.
I believe the set of bugs fixed in this PR account for the most prevalent reason that `cargo expand` produces invalid output in real-world usage.
Fixes#98790.
## False negatives
The following is a correct program — `cargo check` succeeds.
```rust
macro_rules! m {
($e:expr) => {
match () { _ => $e }
};
}
fn main() {
m!({ 1 } - 1);
}
```
But `rustc -Zunpretty=expanded main.rs` produces output that is invalid Rust syntax, because parenthesization is needed and not being done by the pretty printer.
```rust
fn main() { match () { _ => { 1 } - 1, }; }
```
Piping this expanded code to rustfmt, it fails to parse.
```console
error: unexpected `,` in pattern
--> <stdin>:1:38
|
1 | fn main() { match () { _ => { 1 } - 1, }; }
| ^
|
help: try adding parentheses to match on a tuple...
|
1 | fn main() { match () { _ => { 1 } (- 1,) }; }
| + +
help: ...or a vertical bar to match on multiple alternatives
|
1 | fn main() { match () { _ => { 1 } - 1 | }; }
| ~~~~~
```
Fixed output after this PR:
```rust
fn main() { match () { _ => ({ 1 }) - 1, }; }
```
## False positives
Less problematic, but worth fixing (just like #118726).
```rust
fn main() {
let _ = match () { _ => 1 } - 1;
}
```
Output of `rustc -Zunpretty=expanded lib.rs` before this PR. There is no reason parentheses would need to be inserted there.
```rust
fn main() { let _ = (match () { _ => 1, }) - 1; }
```
After this PR:
```rust
fn main() { let _ = match () { _ => 1, } - 1; }
```
## Alternatives considered
In this PR I opted to parenthesize only the leading subexpression causing the statement boundary, rather than the entire statement. Example:
```rust
macro_rules! m {
($e:expr) => {
$e
};
}
fn main() {
m!(loop { break [1]; }[0] - 1);
}
```
This PR produces the following pretty-printed contents for fn main:
```rust
(loop { break [1]; })[0] - 1;
```
A different equally correct output would be:
```rust
(loop { break [1]; }[0] - 1);
```
I chose the one I did because it is the *only* approach used by handwritten code in the standard library and compiler. There are 4 places where parenthesization is being used to prevent a statement boundary, and in all 4, the developer has chosen to parenthesize the smallest subexpression rather than the whole statement:
b37d43efd9/compiler/rustc_codegen_cranelift/example/alloc_system.rs (L102)b37d43efd9/compiler/rustc_parse/src/errors.rs (L1021-L1029)b37d43efd9/library/core/src/future/poll_fn.rs (L151)b37d43efd9/library/core/src/ops/range.rs (L824-L828)
Introduce support for `async gen` blocks
I'm delighted to demonstrate that `async gen` block are not very difficult to support. They're simply coroutines that yield `Poll<Option<T>>` and return `()`.
**This PR is WIP and in draft mode for now** -- I'm mostly putting it up to show folks that it's possible. This PR needs a lang-team experiment associated with it or possible an RFC, since I don't think it falls under the jurisdiction of the `gen` RFC that was recently authored by oli (https://github.com/rust-lang/rfcs/pull/3513, https://github.com/rust-lang/rust/issues/117078).
### Technical note on the pre-generator-transform yield type:
The reason that the underlying coroutines yield `Poll<Option<T>>` and not `Poll<T>` (which would make more sense, IMO, for the pre-transformed coroutine), is because the `TransformVisitor` that is used to turn coroutines into built-in state machine functions would have to destructure and reconstruct the latter into the former, which requires at least inserting a new basic block (for a `switchInt` terminator, to match on the `Poll` discriminant).
This does mean that the desugaring (at the `rustc_ast_lowering` level) of `async gen` blocks is a bit more involved. However, since we already need to intercept both `.await` and `yield` operators, I don't consider it much of a technical burden.
r? `@ghost`
never_patterns: Parse match arms with no body
Never patterns are meant to signal unreachable cases, and thus don't take bodies:
```rust
let ptr: *const Option<!> = ...;
match *ptr {
None => { foo(); }
Some(!),
}
```
This PR makes rustc accept the above, and enforces that an arm has a body xor is a never pattern. This affects parsing of match arms even with the feature off, so this is delicate. (Plus this is my first non-trivial change to the parser).
~~The last commit is optional; it introduces a bit of churn to allow the new suggestions to be machine-applicable. There may be a better solution? I'm not sure.~~ EDIT: I removed that commit
r? `@compiler-errors`
- Rename them both `as_str`, which is the typical name for a function
that returns a `&str`. (`to_string` is appropriate for functions
returning `String` or maybe `Cow<'a, str>`.)
- Change `UnOp::as_str` from an associated function (weird!) to a
method.
- Avoid needless `self` dereferences.
There was an incomplete version of the check in parsing and a second
version in AST validation. This meant that some, but not all, invalid
uses were allowed inside macros/disabled cfgs. It also means that later
passes have a hard time knowing when the let expression is in a valid
location, sometimes causing ICEs.
- Add a field to ExprKind::Let in AST/HIR to mark whether it's in a
valid location.
- Suppress later errors and MIR construction for invalid let
expressions.
Indexing is similar to method calls in having an arbitrary
left-hand-side and then something on the right, which is the main part
of the expression. Method calls already have a span for that right part,
but indexing does not. This means that long method chains that use
indexing have really bad spans, especially when the indexing panics and
that span in coverted into a panic location.
This does the same thing as method calls for the AST and HIR, storing an
extra span which is then put into the `fn_span` field in THIR.
My type ascription
Oh rip it out
Ah
If you think we live too much then
You can sacrifice diagnostics
Don't mix your garbage
Into my syntax
So many weird hacks keep diagnostics alive
Yet I don't even step outside
So many bad diagnostics keep tyasc alive
Yet tyasc doesn't even bother to survive!
Close parentheses for `offset_of` in AST pretty printing
HIR pretty printing already handles it correctly.
This will conflict with #110694 but it seems like that PR is gonna take bit more time.
Move format_args!() into AST (and expand it during AST lowering)
Implements https://github.com/rust-lang/compiler-team/issues/541
This moves FormatArgs from rustc_builtin_macros to rustc_ast_lowering. For now, the end result is the same. But this allows for future changes to do smarter things with format_args!(). It also allows Clippy to directly access the ast::FormatArgs, making things a lot easier.
This change turns the format args types into lang items. The builtin macro used to refer to them by their path. After this change, the path is no longer relevant, making it easier to make changes in `core`.
This updates clippy to use the new language items, but this doesn't yet make clippy use the ast::FormatArgs structure that's now available. That should be done after this is merged.
Remove `token::Lit` from `ast::MetaItemLit`.
Currently `ast::MetaItemLit` represents the literal kind twice. This PR removes that redundancy. Best reviewed one commit at a time.
r? `@petrochenkov`
Remove useless borrows and derefs
They are nothing more than noise.
<sub>These are not all of them, but my clippy started crashing (stack overflow), so rip :(</sub>
This is required to distinguish between cooked and raw byte string
literals in an `ast::LitKind`, without referring to an adjacent
`token::Lit`. It's a prerequisite for the next commit.