The reader itself doesn't need ability to peek tokens, so it's better
if clients implement this functionality.
This hopefully becomes especially easy once we use iterator interface
for lexer, but this is not too easy at the moment, because of buffered
errors.
Allow attributes in formal function parameters
Implements https://github.com/rust-lang/rust/issues/60406.
This is my first contribution to the compiler and since this is a large and complex project, I am not fully aware of the consequences of the changes I have made.
**TODO**
- [x] Forbid some built-in attributes.
- [x] Expand cfg/cfg_attr
lexer: Disallow bare CR in raw byte strings
Handles bare CR ~but doesn't translate `\r\n` to `\n` yet in raw strings yet~ and translates CRLF to LF in raw strings.
As a side-note I think it'd be good to change the `unescape_` to return plain iterators to reduce some boilerplate (e.g. `has_error` could benefit from collecting `Result<T>` and aborting early on errors) but will do that separately, unless I missed something here that prevents it.
@matklad @petrochenkov thoughts?
It was commented out as part of
8a8e497ae7.
Done probably by accident, since the code in question was moved to a
match arm, along with newly introduced logic to detect bare CRs in raw
strings.
Move token tree related lexer state to a separate struct
Just a types-based refactoring.
We only used a bunch of fields when tokenizing into a token tree, so let's move them out of the base lexer
Identify when a stmt could have been parsed as an expr
There are some expressions that can be parsed as a statement without
a trailing semicolon depending on the context, which can lead to
confusing errors due to the same looking code being accepted in some
places and not others. Identify these cases and suggest enclosing in
parenthesis making the parse non-ambiguous without changing the
accepted grammar.
Fix#54186, cc #54482, fix#59975, fix#47287.
Currently, we deal with escape sequences twice: once when we lex a
string, and a second time when we unescape literals. This PR aims to
remove this duplication, by introducing a new `unescape` mode as a
single source of truth for character escaping rules
There are some expressions that can be parsed as a statement without
a trailing semicolon depending on the context, which can lead to
confusing errors due to the same looking code being accepted in some
places and not others. Identify these cases and suggest enclosing in
parenthesis making the parse non-ambiguous without changing the
accepted grammar.