borrow from @[] vectors (cc #2797)
This commit is contained in:
@@ -1,149 +1,217 @@
|
|||||||
/*!
|
/*!
|
||||||
* # Borrow check
|
# Borrow check
|
||||||
*
|
|
||||||
* This pass is in job of enforcing *memory safety* and *purity*. As
|
This pass is in job of enforcing *memory safety* and *purity*. As
|
||||||
* memory safety is by far the more complex topic, I'll focus on that in
|
memory safety is by far the more complex topic, I'll focus on that in
|
||||||
* this description, but purity will be covered later on. In the context
|
this description, but purity will be covered later on. In the context
|
||||||
* of Rust, memory safety means three basic things:
|
of Rust, memory safety means three basic things:
|
||||||
*
|
|
||||||
* - no writes to immutable memory;
|
- no writes to immutable memory;
|
||||||
* - all pointers point to non-freed memory;
|
- all pointers point to non-freed memory;
|
||||||
* - all pointers point to memory of the same type as the pointer.
|
- all pointers point to memory of the same type as the pointer.
|
||||||
*
|
|
||||||
* The last point might seem confusing: after all, for the most part,
|
The last point might seem confusing: after all, for the most part,
|
||||||
* this condition is guaranteed by the type check. However, there are
|
this condition is guaranteed by the type check. However, there are
|
||||||
* two cases where the type check effectively delegates to borrow check.
|
two cases where the type check effectively delegates to borrow check.
|
||||||
*
|
|
||||||
* The first case has to do with enums. If there is a pointer to the
|
The first case has to do with enums. If there is a pointer to the
|
||||||
* interior of an enum, and the enum is in a mutable location (such as a
|
interior of an enum, and the enum is in a mutable location (such as a
|
||||||
* local variable or field declared to be mutable), it is possible that
|
local variable or field declared to be mutable), it is possible that
|
||||||
* the user will overwrite the enum with a new value of a different
|
the user will overwrite the enum with a new value of a different
|
||||||
* variant, and thus effectively change the type of the memory that the
|
variant, and thus effectively change the type of the memory that the
|
||||||
* pointer is pointing at.
|
pointer is pointing at.
|
||||||
*
|
|
||||||
* The second case has to do with mutability. Basically, the type
|
The second case has to do with mutability. Basically, the type
|
||||||
* checker has only a limited understanding of mutability. It will allow
|
checker has only a limited understanding of mutability. It will allow
|
||||||
* (for example) the user to get an immutable pointer with the address of
|
(for example) the user to get an immutable pointer with the address of
|
||||||
* a mutable local variable. It will also allow a `@mut T` or `~mut T`
|
a mutable local variable. It will also allow a `@mut T` or `~mut T`
|
||||||
* pointer to be borrowed as a `&r.T` pointer. These seeming oversights
|
pointer to be borrowed as a `&r.T` pointer. These seeming oversights
|
||||||
* are in fact intentional; they allow the user to temporarily treat a
|
are in fact intentional; they allow the user to temporarily treat a
|
||||||
* mutable value as immutable. It is up to the borrow check to guarantee
|
mutable value as immutable. It is up to the borrow check to guarantee
|
||||||
* that the value in question is not in fact mutated during the lifetime
|
that the value in question is not in fact mutated during the lifetime
|
||||||
* `r` of the reference.
|
`r` of the reference.
|
||||||
*
|
|
||||||
* # Summary of the safety check
|
# Definition of unstable memory
|
||||||
*
|
|
||||||
* In order to enforce mutability, the borrow check has three tricks up
|
The primary danger to safety arises due to *unstable memory*.
|
||||||
* its sleeve.
|
Unstable memory is memory whose validity or type may change as a
|
||||||
*
|
result of an assignment, move, or a variable going out of scope.
|
||||||
* First, data which is uniquely tied to the current stack frame (that'll
|
There are two cases in Rust where memory is unstable: the contents of
|
||||||
* be defined shortly) is tracked very precisely. This means that, for
|
unique boxes and enums.
|
||||||
* example, if an immutable pointer to a mutable local variable is
|
|
||||||
* created, the borrowck will simply check for assignments to that
|
Unique boxes are unstable because when the variable containing the
|
||||||
* particular local variable: no other memory is affected.
|
unique box is re-assigned, moves, or goes out of scope, the unique box
|
||||||
*
|
is freed or---in the case of a move---potentially given to another
|
||||||
* Second, if the data is not uniquely tied to the stack frame, it may
|
task. In either case, if there is an extant and usable pointer into
|
||||||
* still be possible to ensure its validity by rooting garbage collected
|
the box, then safety guarantees would be compromised.
|
||||||
* pointers at runtime. For example, if there is a mutable local
|
|
||||||
* variable `x` of type `@T`, and its contents are borrowed with an
|
Enum values are unstable because they are reassigned the types of
|
||||||
* expression like `&*x`, then the value of `x` will be rooted (today,
|
their contents may change if they are assigned with a different
|
||||||
* that means its ref count will be temporary increased) for the lifetime
|
variant than they had previously.
|
||||||
* of the reference that is created. This means that the pointer remains
|
|
||||||
* valid even if `x` is reassigned.
|
# Safety criteria that must be enforced
|
||||||
*
|
|
||||||
* Finally, if neither of these two solutions are applicable, then we
|
Whenever a piece of memory is borrowed for lifetime L, there are two
|
||||||
* require that all operations within the scope of the reference be
|
things which the borrow checker must guarantee. First, it must
|
||||||
* *pure*. A pure operation is effectively one that does not write to
|
guarantee that the memory address will remain allocated (and owned by
|
||||||
* any aliasable memory. This means that it is still possible to write
|
the current task) for the entirety of the lifetime L. Second, it must
|
||||||
* to local variables or other data that is uniquely tied to the stack
|
guarantee that the type of the data will not change for the entirety
|
||||||
* frame (there's that term again; formal definition still pending) but
|
of the lifetime L. In exchange, the region-based type system will
|
||||||
* not to data reached via a `&T` or `@T` pointer. Such writes could
|
guarantee that the pointer is not used outside the lifetime L. These
|
||||||
* possibly have the side-effect of causing the data which must remain
|
guarantees are to some extent independent but are also inter-related.
|
||||||
* valid to be overwritten.
|
|
||||||
*
|
In some cases, the type of a pointer cannot be invalidated but the
|
||||||
* # Possible future directions
|
lifetime can. For example, imagine a pointer to the interior of
|
||||||
*
|
a shared box like:
|
||||||
* There are numerous ways that the `borrowck` could be strengthened, but
|
|
||||||
* these are the two most likely:
|
let mut x = @mut {f: 5, g: 6};
|
||||||
*
|
let y = &mut x.f;
|
||||||
* - flow-sensitivity: we do not currently consider flow at all but only
|
|
||||||
* block-scoping. This means that innocent code like the following is
|
Here, a pointer was created to the interior of a shared box which
|
||||||
* rejected:
|
contains a record. Even if `*x` were to be mutated like so:
|
||||||
*
|
|
||||||
* let mut x: int;
|
*x = {f: 6, g: 7};
|
||||||
* ...
|
|
||||||
* x = 5;
|
This would cause `*y` to change from 5 to 6, but the pointer pointer
|
||||||
* let y: &int = &x; // immutable ptr created
|
`y` remains valid. It still points at an integer even if that integer
|
||||||
* ...
|
has been overwritten.
|
||||||
*
|
|
||||||
* The reason is that the scope of the pointer `y` is the entire
|
However, if we were to reassign `x` itself, like so:
|
||||||
* enclosing block, and the assignment `x = 5` occurs within that
|
|
||||||
* block. The analysis is not smart enough to see that `x = 5` always
|
x = @{f: 6, g: 7};
|
||||||
* happens before the immutable pointer is created. This is relatively
|
|
||||||
* easy to fix and will surely be fixed at some point.
|
This could potentially invalidate `y`, because if `x` were the final
|
||||||
*
|
reference to the shared box, then that memory would be released and
|
||||||
* - finer-grained purity checks: currently, our fallback for
|
now `y` points at freed memory. (We will see that to prevent this
|
||||||
* guaranteeing random references into mutable, aliasable memory is to
|
scenario we will *root* shared boxes that reside in mutable memory
|
||||||
* require *total purity*. This is rather strong. We could use local
|
whose contents are borrowed; rooting means that we create a temporary
|
||||||
* type-based alias analysis to distinguish writes that could not
|
to ensure that the box is not collected).
|
||||||
* possibly invalid the references which must be guaranteed. This
|
|
||||||
* would only work within the function boundaries; function calls would
|
In other cases, like an enum on the stack, the memory cannot be freed
|
||||||
* still require total purity. This seems less likely to be
|
but its type can change:
|
||||||
* implemented in the short term as it would make the code
|
|
||||||
* significantly more complex; there is currently no code to analyze
|
let mut x = some(5);
|
||||||
* the types and determine the possible impacts of a write.
|
alt x {
|
||||||
*
|
some(ref y) => { ... }
|
||||||
* # Terminology
|
none => { ... }
|
||||||
*
|
}
|
||||||
* A **loan** is .
|
|
||||||
*
|
Here as before, the pointer `y` would be invalidated if we were to
|
||||||
* # How the code works
|
reassign `x` to `none`. (We will see that this case is prevented
|
||||||
*
|
because borrowck tracks data which resides on the stack and prevents
|
||||||
* The borrow check code is divided into several major modules, each of
|
variables from reassigned if there may be pointers to their interior)
|
||||||
* which is documented in its own file.
|
|
||||||
*
|
Finally, in some cases, both dangers can arise. For example, something
|
||||||
* The `gather_loans` and `check_loans` are the two major passes of the
|
like the following:
|
||||||
* analysis. The `gather_loans` pass runs over the IR once to determine
|
|
||||||
* what memory must remain valid and for how long. Its name is a bit of
|
let mut x = ~some(5);
|
||||||
* a misnomer; it does in fact gather up the set of loans which are
|
alt x {
|
||||||
* granted, but it also determines when @T pointers must be rooted and
|
~some(ref y) => { ... }
|
||||||
* for which scopes purity must be required.
|
~none => { ... }
|
||||||
*
|
}
|
||||||
* The `check_loans` pass walks the IR and examines the loans and purity
|
|
||||||
* requirements computed in `gather_loans`. It checks to ensure that (a)
|
In this case, if `x` to be reassigned or `*x` were to be mutated, then
|
||||||
* the conditions of all loans are honored; (b) no contradictory loans
|
the pointer `y` would be invalided. (This case is also prevented by
|
||||||
* were granted (for example, loaning out the same memory as mutable and
|
borrowck tracking data which is owned by the current stack frame)
|
||||||
* immutable simultaneously); and (c) any purity requirements are
|
|
||||||
* honored.
|
# Summary of the safety check
|
||||||
*
|
|
||||||
* The remaining modules are helper modules used by `gather_loans` and
|
In order to enforce mutability, the borrow check has a few tricks up
|
||||||
* `check_loans`:
|
its sleeve:
|
||||||
*
|
|
||||||
* - `categorization` has the job of analyzing an expression to determine
|
- When data is owned by the current stack frame, we can identify every
|
||||||
* what kind of memory is used in evaluating it (for example, where
|
possible assignment to a local variable and simply prevent
|
||||||
* dereferences occur and what kind of pointer is dereferenced; whether
|
potentially dangerous assignments directly.
|
||||||
* the memory is mutable; etc)
|
|
||||||
* - `loan` determines when data uniquely tied to the stack frame can be
|
- If data is owned by a shared box, we can root the box to increase
|
||||||
* loaned out.
|
its lifetime.
|
||||||
* - `preserve` determines what actions (if any) must be taken to preserve
|
|
||||||
* aliasable data. This is the code which decides when to root
|
- If data is found within a borrowed pointer, we can assume that the
|
||||||
* an @T pointer or to require purity.
|
data will remain live for the entirety of the borrowed pointer.
|
||||||
*
|
|
||||||
* # Maps that are created
|
- We can rely on the fact that pure actions (such as calling pure
|
||||||
*
|
functions) do not mutate data which is not owned by the current
|
||||||
* Borrowck results in two maps.
|
stack frame.
|
||||||
*
|
|
||||||
* - `root_map`: identifies those expressions or patterns whose result
|
# Possible future directions
|
||||||
* needs to be rooted. Conceptually the root_map maps from an
|
|
||||||
* expression or pattern node to a `node_id` identifying the scope for
|
There are numerous ways that the `borrowck` could be strengthened, but
|
||||||
* which the expression must be rooted (this `node_id` should identify
|
these are the two most likely:
|
||||||
* a block or call). The actual key to the map is not an expression id,
|
|
||||||
* however, but a `root_map_key`, which combines an expression id with a
|
- flow-sensitivity: we do not currently consider flow at all but only
|
||||||
* deref count and is used to cope with auto-deref.
|
block-scoping. This means that innocent code like the following is
|
||||||
*
|
rejected:
|
||||||
* - `mutbl_map`: identifies those local variables which are modified or
|
|
||||||
* moved. This is used by trans to guarantee that such variables are
|
let mut x: int;
|
||||||
* given a memory location and not used as immediates.
|
...
|
||||||
|
x = 5;
|
||||||
|
let y: &int = &x; // immutable ptr created
|
||||||
|
...
|
||||||
|
|
||||||
|
The reason is that the scope of the pointer `y` is the entire
|
||||||
|
enclosing block, and the assignment `x = 5` occurs within that
|
||||||
|
block. The analysis is not smart enough to see that `x = 5` always
|
||||||
|
happens before the immutable pointer is created. This is relatively
|
||||||
|
easy to fix and will surely be fixed at some point.
|
||||||
|
|
||||||
|
- finer-grained purity checks: currently, our fallback for
|
||||||
|
guaranteeing random references into mutable, aliasable memory is to
|
||||||
|
require *total purity*. This is rather strong. We could use local
|
||||||
|
type-based alias analysis to distinguish writes that could not
|
||||||
|
possibly invalid the references which must be guaranteed. This
|
||||||
|
would only work within the function boundaries; function calls would
|
||||||
|
still require total purity. This seems less likely to be
|
||||||
|
implemented in the short term as it would make the code
|
||||||
|
significantly more complex; there is currently no code to analyze
|
||||||
|
the types and determine the possible impacts of a write.
|
||||||
|
|
||||||
|
# How the code works
|
||||||
|
|
||||||
|
The borrow check code is divided into several major modules, each of
|
||||||
|
which is documented in its own file.
|
||||||
|
|
||||||
|
The `gather_loans` and `check_loans` are the two major passes of the
|
||||||
|
analysis. The `gather_loans` pass runs over the IR once to determine
|
||||||
|
what memory must remain valid and for how long. Its name is a bit of
|
||||||
|
a misnomer; it does in fact gather up the set of loans which are
|
||||||
|
granted, but it also determines when @T pointers must be rooted and
|
||||||
|
for which scopes purity must be required.
|
||||||
|
|
||||||
|
The `check_loans` pass walks the IR and examines the loans and purity
|
||||||
|
requirements computed in `gather_loans`. It checks to ensure that (a)
|
||||||
|
the conditions of all loans are honored; (b) no contradictory loans
|
||||||
|
were granted (for example, loaning out the same memory as mutable and
|
||||||
|
immutable simultaneously); and (c) any purity requirements are
|
||||||
|
honored.
|
||||||
|
|
||||||
|
The remaining modules are helper modules used by `gather_loans` and
|
||||||
|
`check_loans`:
|
||||||
|
|
||||||
|
- `categorization` has the job of analyzing an expression to determine
|
||||||
|
what kind of memory is used in evaluating it (for example, where
|
||||||
|
dereferences occur and what kind of pointer is dereferenced; whether
|
||||||
|
the memory is mutable; etc)
|
||||||
|
- `loan` determines when data uniquely tied to the stack frame can be
|
||||||
|
loaned out.
|
||||||
|
- `preserve` determines what actions (if any) must be taken to preserve
|
||||||
|
aliasable data. This is the code which decides when to root
|
||||||
|
an @T pointer or to require purity.
|
||||||
|
|
||||||
|
# Maps that are created
|
||||||
|
|
||||||
|
Borrowck results in two maps.
|
||||||
|
|
||||||
|
- `root_map`: identifies those expressions or patterns whose result
|
||||||
|
needs to be rooted. Conceptually the root_map maps from an
|
||||||
|
expression or pattern node to a `node_id` identifying the scope for
|
||||||
|
which the expression must be rooted (this `node_id` should identify
|
||||||
|
a block or call). The actual key to the map is not an expression id,
|
||||||
|
however, but a `root_map_key`, which combines an expression id with a
|
||||||
|
deref count and is used to cope with auto-deref.
|
||||||
|
|
||||||
|
- `mutbl_map`: identifies those local variables which are modified or
|
||||||
|
moved. This is used by trans to guarantee that such variables are
|
||||||
|
given a memory location and not used as immediates.
|
||||||
*/
|
*/
|
||||||
|
|
||||||
import syntax::ast;
|
import syntax::ast;
|
||||||
|
|||||||
@@ -361,12 +361,18 @@ impl public_methods for borrowck_ctxt {
|
|||||||
|
|
||||||
ret alt deref_kind(self.tcx, base_cmt.ty) {
|
ret alt deref_kind(self.tcx, base_cmt.ty) {
|
||||||
deref_ptr(ptr) {
|
deref_ptr(ptr) {
|
||||||
// make deref of vectors explicit, as explained in the comment at
|
// (a) the contents are loanable if the base is loanable
|
||||||
// the head of this section
|
// and this is a *unique* vector
|
||||||
let deref_lp = base_cmt.lp.map(|lp| @lp_deref(lp, ptr) );
|
let deref_lp = alt ptr {
|
||||||
|
uniq_ptr => {base_cmt.lp.map(|lp| @lp_deref(lp, uniq_ptr))}
|
||||||
|
_ => {none}
|
||||||
|
};
|
||||||
|
|
||||||
|
// (b) the deref is explicit in the resulting cmt
|
||||||
let deref_cmt = @{id:expr.id, span:expr.span,
|
let deref_cmt = @{id:expr.id, span:expr.span,
|
||||||
cat:cat_deref(base_cmt, 0u, ptr), lp:deref_lp,
|
cat:cat_deref(base_cmt, 0u, ptr), lp:deref_lp,
|
||||||
mutbl:m_imm, ty:mt.ty};
|
mutbl:m_imm, ty:mt.ty};
|
||||||
|
|
||||||
comp(expr, deref_cmt, base_cmt.ty, mt)
|
comp(expr, deref_cmt, base_cmt.ty, mt)
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|||||||
Reference in New Issue
Block a user