borrow from @[] vectors (cc #2797)

This commit is contained in:
Niko Matsakis
2012-07-17 18:44:00 -07:00
parent bb5e2ba60a
commit 4ee4a2ab31
2 changed files with 224 additions and 150 deletions

View File

@@ -1,149 +1,217 @@
/*! /*!
* # Borrow check # Borrow check
*
* This pass is in job of enforcing *memory safety* and *purity*. As This pass is in job of enforcing *memory safety* and *purity*. As
* memory safety is by far the more complex topic, I'll focus on that in memory safety is by far the more complex topic, I'll focus on that in
* this description, but purity will be covered later on. In the context this description, but purity will be covered later on. In the context
* of Rust, memory safety means three basic things: of Rust, memory safety means three basic things:
*
* - no writes to immutable memory; - no writes to immutable memory;
* - all pointers point to non-freed memory; - all pointers point to non-freed memory;
* - all pointers point to memory of the same type as the pointer. - all pointers point to memory of the same type as the pointer.
*
* The last point might seem confusing: after all, for the most part, The last point might seem confusing: after all, for the most part,
* this condition is guaranteed by the type check. However, there are this condition is guaranteed by the type check. However, there are
* two cases where the type check effectively delegates to borrow check. two cases where the type check effectively delegates to borrow check.
*
* The first case has to do with enums. If there is a pointer to the The first case has to do with enums. If there is a pointer to the
* interior of an enum, and the enum is in a mutable location (such as a interior of an enum, and the enum is in a mutable location (such as a
* local variable or field declared to be mutable), it is possible that local variable or field declared to be mutable), it is possible that
* the user will overwrite the enum with a new value of a different the user will overwrite the enum with a new value of a different
* variant, and thus effectively change the type of the memory that the variant, and thus effectively change the type of the memory that the
* pointer is pointing at. pointer is pointing at.
*
* The second case has to do with mutability. Basically, the type The second case has to do with mutability. Basically, the type
* checker has only a limited understanding of mutability. It will allow checker has only a limited understanding of mutability. It will allow
* (for example) the user to get an immutable pointer with the address of (for example) the user to get an immutable pointer with the address of
* a mutable local variable. It will also allow a `@mut T` or `~mut T` a mutable local variable. It will also allow a `@mut T` or `~mut T`
* pointer to be borrowed as a `&r.T` pointer. These seeming oversights pointer to be borrowed as a `&r.T` pointer. These seeming oversights
* are in fact intentional; they allow the user to temporarily treat a are in fact intentional; they allow the user to temporarily treat a
* mutable value as immutable. It is up to the borrow check to guarantee mutable value as immutable. It is up to the borrow check to guarantee
* that the value in question is not in fact mutated during the lifetime that the value in question is not in fact mutated during the lifetime
* `r` of the reference. `r` of the reference.
*
* # Summary of the safety check # Definition of unstable memory
*
* In order to enforce mutability, the borrow check has three tricks up The primary danger to safety arises due to *unstable memory*.
* its sleeve. Unstable memory is memory whose validity or type may change as a
* result of an assignment, move, or a variable going out of scope.
* First, data which is uniquely tied to the current stack frame (that'll There are two cases in Rust where memory is unstable: the contents of
* be defined shortly) is tracked very precisely. This means that, for unique boxes and enums.
* example, if an immutable pointer to a mutable local variable is
* created, the borrowck will simply check for assignments to that Unique boxes are unstable because when the variable containing the
* particular local variable: no other memory is affected. unique box is re-assigned, moves, or goes out of scope, the unique box
* is freed or---in the case of a move---potentially given to another
* Second, if the data is not uniquely tied to the stack frame, it may task. In either case, if there is an extant and usable pointer into
* still be possible to ensure its validity by rooting garbage collected the box, then safety guarantees would be compromised.
* pointers at runtime. For example, if there is a mutable local
* variable `x` of type `@T`, and its contents are borrowed with an Enum values are unstable because they are reassigned the types of
* expression like `&*x`, then the value of `x` will be rooted (today, their contents may change if they are assigned with a different
* that means its ref count will be temporary increased) for the lifetime variant than they had previously.
* of the reference that is created. This means that the pointer remains
* valid even if `x` is reassigned. # Safety criteria that must be enforced
*
* Finally, if neither of these two solutions are applicable, then we Whenever a piece of memory is borrowed for lifetime L, there are two
* require that all operations within the scope of the reference be things which the borrow checker must guarantee. First, it must
* *pure*. A pure operation is effectively one that does not write to guarantee that the memory address will remain allocated (and owned by
* any aliasable memory. This means that it is still possible to write the current task) for the entirety of the lifetime L. Second, it must
* to local variables or other data that is uniquely tied to the stack guarantee that the type of the data will not change for the entirety
* frame (there's that term again; formal definition still pending) but of the lifetime L. In exchange, the region-based type system will
* not to data reached via a `&T` or `@T` pointer. Such writes could guarantee that the pointer is not used outside the lifetime L. These
* possibly have the side-effect of causing the data which must remain guarantees are to some extent independent but are also inter-related.
* valid to be overwritten.
* In some cases, the type of a pointer cannot be invalidated but the
* # Possible future directions lifetime can. For example, imagine a pointer to the interior of
* a shared box like:
* There are numerous ways that the `borrowck` could be strengthened, but
* these are the two most likely: let mut x = @mut {f: 5, g: 6};
* let y = &mut x.f;
* - flow-sensitivity: we do not currently consider flow at all but only
* block-scoping. This means that innocent code like the following is Here, a pointer was created to the interior of a shared box which
* rejected: contains a record. Even if `*x` were to be mutated like so:
*
* let mut x: int; *x = {f: 6, g: 7};
* ...
* x = 5; This would cause `*y` to change from 5 to 6, but the pointer pointer
* let y: &int = &x; // immutable ptr created `y` remains valid. It still points at an integer even if that integer
* ... has been overwritten.
*
* The reason is that the scope of the pointer `y` is the entire However, if we were to reassign `x` itself, like so:
* enclosing block, and the assignment `x = 5` occurs within that
* block. The analysis is not smart enough to see that `x = 5` always x = @{f: 6, g: 7};
* happens before the immutable pointer is created. This is relatively
* easy to fix and will surely be fixed at some point. This could potentially invalidate `y`, because if `x` were the final
* reference to the shared box, then that memory would be released and
* - finer-grained purity checks: currently, our fallback for now `y` points at freed memory. (We will see that to prevent this
* guaranteeing random references into mutable, aliasable memory is to scenario we will *root* shared boxes that reside in mutable memory
* require *total purity*. This is rather strong. We could use local whose contents are borrowed; rooting means that we create a temporary
* type-based alias analysis to distinguish writes that could not to ensure that the box is not collected).
* possibly invalid the references which must be guaranteed. This
* would only work within the function boundaries; function calls would In other cases, like an enum on the stack, the memory cannot be freed
* still require total purity. This seems less likely to be but its type can change:
* implemented in the short term as it would make the code
* significantly more complex; there is currently no code to analyze let mut x = some(5);
* the types and determine the possible impacts of a write. alt x {
* some(ref y) => { ... }
* # Terminology none => { ... }
* }
* A **loan** is .
* Here as before, the pointer `y` would be invalidated if we were to
* # How the code works reassign `x` to `none`. (We will see that this case is prevented
* because borrowck tracks data which resides on the stack and prevents
* The borrow check code is divided into several major modules, each of variables from reassigned if there may be pointers to their interior)
* which is documented in its own file.
* Finally, in some cases, both dangers can arise. For example, something
* The `gather_loans` and `check_loans` are the two major passes of the like the following:
* analysis. The `gather_loans` pass runs over the IR once to determine
* what memory must remain valid and for how long. Its name is a bit of let mut x = ~some(5);
* a misnomer; it does in fact gather up the set of loans which are alt x {
* granted, but it also determines when @T pointers must be rooted and ~some(ref y) => { ... }
* for which scopes purity must be required. ~none => { ... }
* }
* The `check_loans` pass walks the IR and examines the loans and purity
* requirements computed in `gather_loans`. It checks to ensure that (a) In this case, if `x` to be reassigned or `*x` were to be mutated, then
* the conditions of all loans are honored; (b) no contradictory loans the pointer `y` would be invalided. (This case is also prevented by
* were granted (for example, loaning out the same memory as mutable and borrowck tracking data which is owned by the current stack frame)
* immutable simultaneously); and (c) any purity requirements are
* honored. # Summary of the safety check
*
* The remaining modules are helper modules used by `gather_loans` and In order to enforce mutability, the borrow check has a few tricks up
* `check_loans`: its sleeve:
*
* - `categorization` has the job of analyzing an expression to determine - When data is owned by the current stack frame, we can identify every
* what kind of memory is used in evaluating it (for example, where possible assignment to a local variable and simply prevent
* dereferences occur and what kind of pointer is dereferenced; whether potentially dangerous assignments directly.
* the memory is mutable; etc)
* - `loan` determines when data uniquely tied to the stack frame can be - If data is owned by a shared box, we can root the box to increase
* loaned out. its lifetime.
* - `preserve` determines what actions (if any) must be taken to preserve
* aliasable data. This is the code which decides when to root - If data is found within a borrowed pointer, we can assume that the
* an @T pointer or to require purity. data will remain live for the entirety of the borrowed pointer.
*
* # Maps that are created - We can rely on the fact that pure actions (such as calling pure
* functions) do not mutate data which is not owned by the current
* Borrowck results in two maps. stack frame.
*
* - `root_map`: identifies those expressions or patterns whose result # Possible future directions
* needs to be rooted. Conceptually the root_map maps from an
* expression or pattern node to a `node_id` identifying the scope for There are numerous ways that the `borrowck` could be strengthened, but
* which the expression must be rooted (this `node_id` should identify these are the two most likely:
* a block or call). The actual key to the map is not an expression id,
* however, but a `root_map_key`, which combines an expression id with a - flow-sensitivity: we do not currently consider flow at all but only
* deref count and is used to cope with auto-deref. block-scoping. This means that innocent code like the following is
* rejected:
* - `mutbl_map`: identifies those local variables which are modified or
* moved. This is used by trans to guarantee that such variables are let mut x: int;
* given a memory location and not used as immediates. ...
x = 5;
let y: &int = &x; // immutable ptr created
...
The reason is that the scope of the pointer `y` is the entire
enclosing block, and the assignment `x = 5` occurs within that
block. The analysis is not smart enough to see that `x = 5` always
happens before the immutable pointer is created. This is relatively
easy to fix and will surely be fixed at some point.
- finer-grained purity checks: currently, our fallback for
guaranteeing random references into mutable, aliasable memory is to
require *total purity*. This is rather strong. We could use local
type-based alias analysis to distinguish writes that could not
possibly invalid the references which must be guaranteed. This
would only work within the function boundaries; function calls would
still require total purity. This seems less likely to be
implemented in the short term as it would make the code
significantly more complex; there is currently no code to analyze
the types and determine the possible impacts of a write.
# How the code works
The borrow check code is divided into several major modules, each of
which is documented in its own file.
The `gather_loans` and `check_loans` are the two major passes of the
analysis. The `gather_loans` pass runs over the IR once to determine
what memory must remain valid and for how long. Its name is a bit of
a misnomer; it does in fact gather up the set of loans which are
granted, but it also determines when @T pointers must be rooted and
for which scopes purity must be required.
The `check_loans` pass walks the IR and examines the loans and purity
requirements computed in `gather_loans`. It checks to ensure that (a)
the conditions of all loans are honored; (b) no contradictory loans
were granted (for example, loaning out the same memory as mutable and
immutable simultaneously); and (c) any purity requirements are
honored.
The remaining modules are helper modules used by `gather_loans` and
`check_loans`:
- `categorization` has the job of analyzing an expression to determine
what kind of memory is used in evaluating it (for example, where
dereferences occur and what kind of pointer is dereferenced; whether
the memory is mutable; etc)
- `loan` determines when data uniquely tied to the stack frame can be
loaned out.
- `preserve` determines what actions (if any) must be taken to preserve
aliasable data. This is the code which decides when to root
an @T pointer or to require purity.
# Maps that are created
Borrowck results in two maps.
- `root_map`: identifies those expressions or patterns whose result
needs to be rooted. Conceptually the root_map maps from an
expression or pattern node to a `node_id` identifying the scope for
which the expression must be rooted (this `node_id` should identify
a block or call). The actual key to the map is not an expression id,
however, but a `root_map_key`, which combines an expression id with a
deref count and is used to cope with auto-deref.
- `mutbl_map`: identifies those local variables which are modified or
moved. This is used by trans to guarantee that such variables are
given a memory location and not used as immediates.
*/ */
import syntax::ast; import syntax::ast;

View File

@@ -361,12 +361,18 @@ impl public_methods for borrowck_ctxt {
ret alt deref_kind(self.tcx, base_cmt.ty) { ret alt deref_kind(self.tcx, base_cmt.ty) {
deref_ptr(ptr) { deref_ptr(ptr) {
// make deref of vectors explicit, as explained in the comment at // (a) the contents are loanable if the base is loanable
// the head of this section // and this is a *unique* vector
let deref_lp = base_cmt.lp.map(|lp| @lp_deref(lp, ptr) ); let deref_lp = alt ptr {
uniq_ptr => {base_cmt.lp.map(|lp| @lp_deref(lp, uniq_ptr))}
_ => {none}
};
// (b) the deref is explicit in the resulting cmt
let deref_cmt = @{id:expr.id, span:expr.span, let deref_cmt = @{id:expr.id, span:expr.span,
cat:cat_deref(base_cmt, 0u, ptr), lp:deref_lp, cat:cat_deref(base_cmt, 0u, ptr), lp:deref_lp,
mutbl:m_imm, ty:mt.ty}; mutbl:m_imm, ty:mt.ty};
comp(expr, deref_cmt, base_cmt.ty, mt) comp(expr, deref_cmt, base_cmt.ty, mt)
} }