const-eval interning: get rid of type-driven traversal
This entirely replaces our const-eval interner, i.e. the code that takes the final result of a constant evaluation from the local memory of the const-eval machine to the global `tcx` memory. The main goal of this change is to ensure that we can detect mutable references that sneak into this final value -- this is something we want to reject for `static` and `const`, and while const-checking performs some static analysis to ensure this, I would be much more comfortable stabilizing const_mut_refs if we had a dynamic check that sanitizes the final value. (This is generally the approach we have been using on const-eval: do a static check to give nice errors upfront, and then do a dynamic check to be really sure that the properties we need for soundness, actually hold.)
We can do this now that https://github.com/rust-lang/rust/pull/118324 landed and each pointer comes with a bit (completely independent of its type) storing whether mutation is permitted through this pointer or not.
The new interner is a lot simpler than the old one: previously we did a complete type-driven traversal to determine the mutability of all memory we see, and then a second pass to intern any leftover raw pointers. The new interner simply recursively traverses the allocation holding the final result, and all allocations reachable from it (which can be determined from the raw bytes of the result, without knowing anything about types), and ensures they all get interned. The initial allocation is interned as immutable for `const` and pomoted and non-interior-mutable `static`; all other allocations are interned as immutable for `static`, `const`, and promoted. The main subtlety is justifying that those inner allocations may indeed be interned immutably, i.e., that mutating them later would anyway already be UB:
- for promoteds, we rely on the analysis that does promotion to ensure that this is sound.
- for `const` and `static`, we check that all pointers in the final result that point to things that are new (i.e., part of this const evaluation) are immutable, i.e., were created via `&<expr>` at a non-interior-mutable type. Mutation through immutable pointers is UB so we are free to intern that memory as immutable.
Interning raises an error if it encounters a dangling pointer or a mutable pointer that violates the above rules.
I also extended our type-driven const validity checks to ensure that `&mut T` in the final value of a const points to mutable memory, at least if `T` is not zero-sized. This catches cases of people turning `&i32` into `&mut i32` (which would still be considered a read-only pointer). Similarly, when these checks encounter an `UnsafeCell`, they are checking that it lives in mutable memory. (Both of these only traverse the newly created values; if those point to other consts/promoteds, the check stops there. But that's okay, we don't have to catch all the UB.) I co-developed this with the stricter interner changes but I can split it out into a separate PR if you prefer.
This PR does have the immediate effect of allowing some new code on stable, for instance:
```rust
const CONST_RAW: *const Vec<i32> = &Vec::new() as *const _;
```
Previously that code got rejected since the type-based interner didn't know what to do with that pointer. It's a raw pointer, we cannot trust its type. The new interner does not care about types so it sees no issue with this code; there's an immutable pointer pointing to some read-only memory (storing a `Vec<i32>`), all is good. Accepting this code pretty much commits us to non-type-based interning, but I think that's the better strategy anyway.
This PR also leads to slightly worse error messages when the final value of a const contains a dangling reference. Previously we would complete interning and then the type-based validation would detect this dangling reference and show a nice error saying where in the value (i.e., in which field) the dangling reference is located. However, the new interner cannot distinguish dangling references from dangling raw pointers, so it must throw an error when it encounters either of them. It doesn't have an understanding of the value structure so all it can say is "somewhere in this constant there's a dangling pointer". (Later parts of the compiler don't like dangling pointers/references so we have to reject them either during interning or during validation.) This could potentially be improved by doing validation before interning, but that's a larger change that I have not attempted yet. (It's also subtle since we do want validation to use the final mutability bits of all involved allocations, and currently it is interning that marks a bunch of allocations as immutable -- that would have to still happen before validation.)
`@rust-lang/wg-const-eval` I hope you are okay with this plan. :)
`@rust-lang/lang` paging you in since this accepts new code on stable as explained above. Please let me know if you think FCP is necessary.
Fix overflow check
Make MIRI choose the path randomly and rename the intrinsic
Add back test
Add miri test and make it operate on `ptr`
Define `llvm.is.constant` for primitives
Update MIRI comment and fix test in stage2
Add const eval test
Clarify that both branches must have the same side effects
guaranteed non guarantee
use immediate type instead
Co-Authored-By: Ralf Jung <post@ralfj.de>
We have `span_delayed_bug` and often pass it a `DUMMY_SP`. This commit
adds `delayed_bug`, which matches pairs like `err`/`span_err` and
`warn`/`span_warn`.
Remove `DiagCtxt` API duplication
`DiagCtxt` defines the internal API for creating and emitting diagnostics: methods like `struct_err`, `struct_span_warn`, `note`, `create_fatal`, `emit_bug`. There are over 50 methods.
Some of these methods are then duplicated across several other types: `Session`, `ParseSess`, `Parser`, `ExtCtxt`, and `MirBorrowckCtxt`. `Session` duplicates the most, though half the ones it does are unused. Each duplicated method just calls forward to the corresponding method in `DiagCtxt`. So this duplication exists to (in the best case) shorten chains like `ecx.tcx.sess.parse_sess.dcx.emit_err()` to `ecx.emit_err()`.
This API duplication is ugly and has been bugging me for a while. And it's inconsistent: there's no real logic about which methods are duplicated, and the use of `#[rustc_lint_diagnostic]` and `#[track_caller]` attributes vary across the duplicates.
This PR removes the duplicated API methods and makes all diagnostic creation and emission go through `DiagCtxt`. It also adds `dcx` getter methods to several types to shorten chains. This approach scales *much* better than API duplication; indeed, the PR adds `dcx()` to numerous types that didn't have API duplication: `TyCtxt`, `LoweringCtxt`, `ConstCx`, `FnCtxt`, `TypeErrCtxt`, `InferCtxt`, `CrateLoader`, `CheckAttrVisitor`, and `Resolver`. These result in a lot of changes from `foo.tcx.sess.emit_err()` to `foo.dcx().emit_err()`. (You could do this with more types, but it gets into diminishing returns territory for types that don't emit many diagnostics.)
After all these changes, some call sites are more verbose, some are less verbose, and many are the same. The total number of lines is reduced, mostly because of the removed API duplication. And consistency is increased, because calls to `emit_err` and friends are always preceded with `.dcx()` or `.dcx`.
r? `@compiler-errors`
Renamings:
- find -> opt_hir_node
- get -> hir_node
- find_by_def_id -> opt_hir_node_by_def_id
- get_by_def_id -> hir_node_by_def_id
Fix rebase changes using removed methods
Use `tcx.hir_node_by_def_id()` whenever possible in compiler
Fix clippy errors
Fix compiler
Apply suggestions from code review
Co-authored-by: Vadim Petrochenkov <vadim.petrochenkov@gmail.com>
Add FIXME for `tcx.hir()` returned type about its removal
Simplify with with `tcx.hir_node_by_def_id`
Expand Miri's BorTag GC to a Provenance GC
As suggested in https://github.com/rust-lang/miri/issues/3080#issuecomment-1732505573
We previously solved memory growth issues associated with the Stacked Borrows and Tree Borrows runtimes with a GC. But of course we also have state accumulation associated with whole allocations elsewhere in the interpreter, and this PR starts tackling those.
To do this, we expand the visitor for the GC so that it can visit a BorTag or an AllocId. Instead of collecting all live AllocIds into a single HashSet, we just collect from the Machine itself then go through an accessor `InterpCx::is_alloc_live` which checks a number of allocation data structures in the core interpreter. This avoids the overhead of all the inserts that collecting their keys would require.
r? ``@RalfJung``
patterns: reject raw pointers that are not just integers
Matching against `0 as *const i32` is fine, matching against `&42 as *const i32` is not.
This extends the existing check against function pointers and wide pointers: we now uniformly reject all these pointer types during valtree construction, and then later lint because of that. See [here](https://github.com/rust-lang/rust/pull/116930#issuecomment-1784654073) for some more explanation and context.
Also fixes https://github.com/rust-lang/rust/issues/116929.
Cc `@oli-obk` `@lcnr`
Most notably, this commit changes the `pub use crate::*;` in that file
to `use crate::*;`. This requires a lot of `use` items in other crates
to be adjusted, because everything defined within `rustc_span::*` was
also available via `rustc_span::source_map::*`, which is bizarre.
The commit also removes `SourceMap::span_to_relative_line_string`, which
is unused.
Format all the let-chains in compiler crates
Since rust-lang/rustfmt#5910 has landed, soon we will have support for formatting let-chains (as soon as rustfmt syncs and beta gets bumped).
This PR applies the changes [from master rustfmt to rust-lang/rust eagerly](https://rust-lang.zulipchat.com/#narrow/stream/122651-general/topic/out.20formatting.20of.20prs/near/374997516), so that the next beta bump does not have to deal with a 200+ file diff and can remain concerned with other things like `cfg(bootstrap)` -- #113637 was a pain to land, for example, because of let-else.
I will also add this commit to the ignore list after it has landed.
The commands that were run -- I'm not great at bash-foo, but this applies rustfmt to every compiler crate, and then reverts the two crates that should probably be formatted out-of-tree.
```
~/rustfmt $ ls -1d ~/rust/compiler/* | xargs -I@ cargo run --bin rustfmt -- `@/src/lib.rs` --config-path ~/rust --edition=2021 # format all of the compiler crates
~/rust $ git checkout HEAD -- compiler/rustc_codegen_{gcc,cranelift} # revert changes to cg-gcc and cg-clif
```
cc `@rust-lang/rustfmt`
r? `@WaffleLapkin` or `@Nilstrieb` who said they may be able to review this purely mechanical PR :>
cc `@Mark-Simulacrum` and `@petrochenkov,` who had some thoughts on the order of operations with big formatting changes in https://github.com/rust-lang/rust/pull/95262#issue-1178993801. I think the situation has changed since then, given that let-chains support exists on master rustfmt now, and I'm fairly confident that this formatting PR should land even if *bootstrap* rustfmt doesn't yet format let-chains in order to lessen the burden of the next beta bump.
const-eval: make misalignment a hard error
It's been a future-incompat error (showing up in cargo's reports) since https://github.com/rust-lang/rust/pull/104616, Rust 1.68, released in March. That should be long enough.
The question for the lang team is simply -- should we move ahead with this, making const-eval alignment failures a hard error? (It turns out some of them accidentally already were hard errors since #104616. But not all so this is still a breaking change. Crater found no regression.)
Partially outline code inside the panic! macro
This outlines code inside the panic! macro in some cases. This is split out from https://github.com/rust-lang/rust/pull/115562 to exclude changes to rustc.
interpret: more consistently use ImmTy in operators and casts
The diff in src/tools/miri/src/shims/x86/sse2.rs should hopefully suffice to explain why this is nicer. :)