Commit Graph

50 Commits

Author SHA1 Message Date
Nicholas Nethercote
84ac80f192 Reformat use declarations.
The previous commit updated `rustfmt.toml` appropriately. This commit is
the outcome of running `x fmt --all` with the new formatting options.
2024-07-29 08:26:52 +10:00
Nicholas Nethercote
6341935a13 Remove extern crate tracing from numerous crates. 2024-04-30 16:47:49 +10:00
Trevor Gross
80bb15ed91 Add compiler support for parsing f16 and f128 2024-03-14 00:40:22 -05:00
Nicholas Nethercote
ac47f6c666 Add suffixes to LitError.
To avoid some unwrapping.
2024-02-15 15:47:24 +11:00
Nicholas Nethercote
25ed6e43b0 Add ErrorGuaranteed to ast::LitKind::Err, token::LitKind::Err.
This mostly works well, and eliminates a couple of delayed bugs.

One annoying thing is that we should really also add an
`ErrorGuaranteed` to `proc_macro::bridge::LitKind::Err`. But that's
difficult because `proc_macro` doesn't have access to `ErrorGuaranteed`,
so we have to fake it.
2024-02-15 14:46:08 +11:00
Nicholas Nethercote
8b35f8e41e Remove LitError::LexerError.
`cook_lexer_literal` can emit an error about an invalid int literal but
then return a non-`Err` token. And then `integer_lit` has to account for
this to avoid printing a redundant error message.

This commit changes `cook_lexer_literal` to return `Err` in that case.
Then `integer_lit` doesn't need the special case, and
`LitError::LexerError` can be removed.
2024-02-15 12:58:18 +11:00
Nicholas Nethercote
86f371ed59 Rename the unescaping functions.
`unescape_literal` becomes `unescape_unicode`, and `unescape_c_string`
becomes `unescape_mixed`. Because rfc3349 will mean that C string
literals will no longer be the only mixed utf8 literals.
2024-01-25 12:28:11 +11:00
Nicholas Nethercote
a1c07214f0 Rework CStrUnit.
- Rename it as `MixedUnit`, because it will soon be used in more than
  just C string literals.
- Change the `Byte` variant to `HighByte` and use it only for
  `\x80`..`\xff` cases. This fixes the old inexactness where ASCII chars
  could be encoded with either `Byte` or `Char`.
- Add useful comments.
- Remove `is_ascii`, in favour of `u8::is_ascii`.
2024-01-25 12:28:11 +11:00
Nicholas Nethercote
314dbc7f22 Avoid useless checking in from_token_lit.
The parser already does a check-only unescaping which catches all
errors. So the checking done in `from_token_lit` never hits.

But literals causing warnings can still occur in `from_token_lit`. So
the commit changes `str-escape.rs` to use byte string literals and C
string literals as well, to give better coverage and ensure the new
assertions in `from_token_lit` are correct.
2024-01-25 12:22:17 +11:00
Josh Stone
33e0422826 Pack the u128 in LitKind::Int 2024-01-19 20:10:39 -08:00
Nicholas Nethercote
9018d2c455 Detect NulInCStr error earlier.
By making it an `EscapeError` instead of a `LitError`. This makes it
like the other errors produced when checking string literals contents,
e.g. for invalid escape sequences or bare CR chars.

NOTE: this means these errors are issued earlier, before expansion,
which changes behaviour. It will be possible to move the check back to
the later point if desired. If that happens, it's likely that all the
string literal contents checks will be delayed together.

One nice thing about this: the old approach had some code in
`report_lit_error` to calculate the span of the nul char from a range.
This code used a hardwired `+2` to account for the `c"` at the start of
a C string literal, but this should have changed to a `+3` for raw C
string literals to account for the `cr"`, which meant that the caret in
`cr"` nul error messages was one short of where it should have been. The
new approach doesn't need any of this and avoids the off-by-one error.
2024-01-12 16:19:37 +11:00
Nicholas Nethercote
a50efe2653 Unify single-char and multi-char CStrUnit::Char handling.
The two cases are equivalent. C string literals aren't common so there
is no performance need here.
2023-12-13 10:06:13 +11:00
Nicholas Nethercote
4acc5e6480 Don't rebuild raw strings when unescaping.
Raw strings don't have escape sequences, so for them "unescaping" just
means checking for invalid chars like bare CR. Which means there is no
need to rebuild them one char or byte at a time while escaping, because
the unescaped version will be the same. This commit removes that
rebuilding.

Also, the commit changes things so that "unescaping" is unconditional for
raw strings and raw byte strings. That's simpler and they're rare enough
that the perf effect is negligible.
2023-12-13 09:26:10 +11:00
Maybe Waffle
fb0f74a8c9 Use Option::is_some_and and Result::is_ok_and in the compiler 2023-05-24 14:20:41 +00:00
Deadbeef
abb181dfd9 make it semantic error 2023-05-02 10:32:08 +00:00
Deadbeef
a49570fd20 fix TODO comments 2023-05-02 10:32:07 +00:00
Deadbeef
76d1f93896 update and add a few tests 2023-05-02 10:30:09 +00:00
Deadbeef
8ff3903643 initial step towards implementing C string literals 2023-05-02 10:30:09 +00:00
nils
fd7a159710 Fix uninlined_format_args for some compiler crates
Convert all the crates that have had their diagnostic migration
completed (except save_analysis because that will be deleted soon and
apfloat because of the licensing problem).
2023-01-05 19:01:12 +01:00
clubby789
537c7f4fa9 Print correct base for too-large literals
Also update tests
2023-01-02 11:43:07 +00:00
bors
2cd2070af7 Auto merge of #105160 - nnethercote:rm-Lit-token_lit, r=petrochenkov
Remove `token::Lit` from `ast::MetaItemLit`.

Currently `ast::MetaItemLit` represents the literal kind twice. This PR removes that redundancy. Best reviewed one commit at a time.

r? `@petrochenkov`
2022-12-12 05:16:50 +00:00
Nicholas Nethercote
7e0c6dba0d Remove LitKind::synthesize_token_lit.
It has a single call site in the HIR pretty printer, where the resulting
token lit is immediately converted to a string.

This commit replaces `LitKind::synthesize_token_lit` with a `Display`
impl for `LitKind`, which can be used by the HIR pretty printer.
2022-12-05 16:33:24 +11:00
Nicholas Nethercote
4ae956f600 Remove ExtCtxt::expr_lit. 2022-12-05 14:20:38 +11:00
Nicholas Nethercote
2fd364acff Remove token::Lit from ast::MetaItemLit.
`token::Lit` contains a `kind` field that indicates what kind of literal
it is. `ast::MetaItemLit` currently wraps a `token::Lit` but also has
its own `kind` field. This means that `ast::MetaItemLit` encodes the
literal kind in two different ways.

This commit changes `ast::MetaItemLit` so it no longer wraps
`token::Lit`. It now contains the `symbol` and `suffix` fields from
`token::Lit`, but not the `kind` field, eliminating the redundancy.
2022-12-02 13:49:19 +11:00
Nicholas Nethercote
a7f35c42d4 Add StrStyle to ast::LitKind::ByteStr.
This is required to distinguish between cooked and raw byte string
literals in an `ast::LitKind`, without referring to an adjacent
`token::Lit`. It's a prerequisite for the next commit.
2022-12-02 10:38:58 +11:00
Nicholas Nethercote
e658144586 Rename LitKind::to_token_lit as LitKind::synthesize_token_lit.
This makes it clearer that it's not a lossless conversion, which I find
helpful.
2022-12-02 10:23:44 +11:00
Maybe Waffle
f2b97a8bfe Remove useless borrows and derefs 2022-12-01 17:34:43 +00:00
Nicholas Nethercote
ba1751a201 Avoid more MetaItem-to-Attribute conversions.
There is code for converting `Attribute` (syntactic) to `MetaItem`
(semantic). There is also code for the reverse direction. The reverse
direction isn't really necessary; it's currently only used when
generating attributes, e.g. in `derive` code.

This commit adds some new functions for creating `Attributes`s directly,
without involving `MetaItem`s: `mk_attr_word`, `mk_attr_name_value_str`,
`mk_attr_nested_word`, and
`ExtCtxt::attr_{word,name_value_str,nested_word}`.

These new methods replace the old functions for creating `Attribute`s:
`mk_attr_inner`, `mk_attr_outer`, and `ExtCtxt::attribute`. Those
functions took `MetaItem`s as input, and relied on many other functions
that created `MetaItems`, which are also removed: `mk_name_value_item`,
`mk_list_item`, `mk_word_item`, `mk_nested_word_item`,
`{MetaItem,MetaItemKind,NestedMetaItem}::token_trees`,
`MetaItemKind::attr_args`, `MetaItemLit::{from_lit_kind,to_token}`,
`ExtCtxt::meta_word`.

Overall this cuts more than 100 lines of code and makes thing simpler.
2022-11-29 18:43:53 +11:00
Nicholas Nethercote
d1b61a31c5 Inline and remove MetaItemLit::from_lit_kind.
It has a single call site.
2022-11-29 18:42:40 +11:00
Nicholas Nethercote
e4a9150872 Rename ast::Lit as ast::MetaItemLit. 2022-11-28 15:18:49 +11:00
Nicholas Nethercote
8cfc8153da Remove Lit::from_included_bytes.
`Lit::from_included_bytes` calls `Lit::from_lit_kind`, but the two call
sites only need the resulting `token::Lit`, not the full `ast::Lit`.

This commit changes those call sites to use `LitKind::to_token_lit`,
which means `from_included_bytes` can be removed.
2022-11-28 15:17:45 +11:00
Nicholas Nethercote
358a603f11 Use token::Lit in ast::ExprKind::Lit.
Instead of `ast::Lit`.

Literal lowering now happens at two different times. Expression literals
are lowered when HIR is crated. Attribute literals are lowered during
parsing.

This commit changes the language very slightly. Some programs that used
to not compile now will compile. This is because some invalid literals
that are removed by `cfg` or attribute macros will no longer trigger
errors. See this comment for more details:
https://github.com/rust-lang/rust/pull/102944#issuecomment-1277476773
2022-11-16 09:41:28 +11:00
clubby789
b2da155a9a Introduce ExprKind::IncludedBytes 2022-11-11 16:31:32 +00:00
Nicholas Nethercote
a838952239 Remove unescape_byte_literal.
It's easy to just use `unescape_literal` + `byte_from_char`.
2022-11-05 13:56:36 +11:00
Dylan DPC
93177758fc Rollup merge of #100767 - kadiwa4:escape_ascii, r=jackh726
Remove manual <[u8]>::escape_ascii

`@rustbot` label: +C-cleanup
2022-09-12 15:21:30 +05:30
Oli Scherer
ee3c835018 Always import all tracing macros for the entire crate instead of piecemeal by module 2022-09-01 14:54:27 +00:00
Nicholas Nethercote
b997af95fc Handle Err in ast::LitKind::to_token_lit.
Fixes #100948.
2022-08-25 10:50:39 +10:00
Nicholas Nethercote
6087dc2054 Remove the symbol from ast::LitKind::Err.
Because it's never used meaningfully.
2022-08-23 16:56:24 +10:00
KaDiWa
a297631bdc use <[u8]>::escape_ascii instead of core::ascii::escape_default 2022-08-19 19:00:37 +02:00
Nicholas Nethercote
5d3cc1713a Rename some things related to literals.
- Rename `ast::Lit::token` as `ast::Lit::token_lit`, because its type is
  `token::Lit`, which is not a token. (This has been confusing me for a
  long time.)
  reasonable because we have an `ast::token::Lit` inside an `ast::Lit`.
- Rename `LitKind::{from,to}_lit_token` as
  `LitKind::{from,to}_token_lit`, to match the above change and
  `token::Lit`.
2022-08-16 13:41:34 +10:00
Caio
8073a88f35 Implement macro meta-variable expressions 2022-03-09 16:46:23 -03:00
Caio
ef5601b321 2 - Make more use of let_chains
Continuation of #94376.

cc #53667
2022-02-26 13:45:43 -03:00
Nicholas Nethercote
44308dc348 Inline a hot closure in from_lit_token.
The change looks big because `rustfmt` rearranges things, but the only
real change is the inlining annotation.
2022-02-24 17:07:49 +11:00
Nicholas Nethercote
056d48a2c9 Remove unnecessary sigils around Symbol::as_str() calls. 2021-12-15 17:32:14 +11:00
est31
15de4cbc4b Remove redundant [..]s 2021-12-09 00:01:29 +01:00
Anton Golov
5d59b4412e Add warning when whitespace is not skipped after an escaped newline. 2021-07-30 16:26:39 +02:00
Dániel Buga
dc932cdf88 Remove unnecessary manual shrink_to_fit calls 2021-01-16 14:02:36 +01:00
Vadim Petrochenkov
71cd6f42a6 ast: Remove some indirection layers from values in key-value attributes 2021-01-09 21:50:39 +03:00
Robin Schoonover
5ab19676ed Remove extra indirection in LitKind::ByteStr 2020-10-04 15:52:15 -06:00
mark
9e5f7d5631 mv compiler to compiler/ 2020-08-30 18:45:07 +03:00