Auto merge of #27414 - Gankro:tarpl-fixes, r=alexcrichton
This is *mostly* reducing *my* use of *italics* but there are some other misc changes interspersed as I went along. This updates the italicizing alphabetically from `a` to `ra`. r? @steveklabnik
@@ -2,38 +2,33 @@
 
 # NOTE: This is a draft document, and may contain serious errors
 
-So you've played around with Rust a bit. You've written a few simple programs and
-you think you grok the basics. Maybe you've even read through
-*[The Rust Programming Language][trpl]*. Now you want to get neck-deep in all the
+So you've played around with Rust a bit. You've written a few simple programs
+and you think you grok the basics. Maybe you've even read through *[The Rust
+Programming Language][trpl]* (TRPL). Now you want to get neck-deep in all the
 nitty-gritty details of the language. You want to know those weird corner-cases.
-You want to know what the heck `unsafe` really means, and how to properly use it.
-This is the book for you.
+You want to know what the heck `unsafe` really means, and how to properly use
+it. This is the book for you.
 
-To be clear, this book goes into *serious* detail. We're going to dig into
+To be clear, this book goes into serious detail. We're going to dig into
 exception-safety and pointer aliasing. We're going to talk about memory
 models. We're even going to do some type-theory. This is stuff that you
-absolutely *don't* need to know to write fast and safe Rust programs.
+absolutely don't need to know to write fast and safe Rust programs.
 You could probably close this book *right now* and still have a productive
 and happy career in Rust.
 
-However if you intend to write unsafe code -- or just *really* want to dig into
-the guts of the language -- this book contains *invaluable* information.
+However if you intend to write unsafe code -- or just really want to dig into
+the guts of the language -- this book contains invaluable information.
 
-Unlike *The Rust Programming Language* we *will* be assuming considerable prior
-knowledge. In particular, you should be comfortable with:
-
-* Basic Systems Programming:
-* Pointers
-* [The stack and heap][]
-* The memory hierarchy (caches)
-* Threads
-
-* [Basic Rust][]
-
-Due to the nature of advanced Rust programming, we will be spending a lot of time
-talking about *safety* and *guarantees*. In particular, a significant portion of
-the book will be dedicated to correctly writing and understanding Unsafe Rust.
+Unlike TRPL we will be assuming considerable prior knowledge. In particular, you
+should be comfortable with basic systems programming and basic Rust. If you
+don't feel comfortable with these topics, you should consider [reading
+TRPL][trpl], though we will not be assuming that you have. You can skip
+straight to this book if you want; just know that we won't be explaining
+everything from the ground up.
+
+Due to the nature of advanced Rust programming, we will be spending a lot of
+time talking about *safety* and *guarantees*. In particular, a significant
+portion of the book will be dedicated to correctly writing and understanding
+Unsafe Rust.
 
 [trpl]: ../book/
-[The stack and heap]: ../book/the-stack-and-the-heap.html
-[Basic Rust]: ../book/syntax-and-semantics.html
@@ -10,7 +10,7 @@
 * [Ownership](ownership.md)
 * [References](references.md)
 * [Lifetimes](lifetimes.md)
-* [Limits of lifetimes](lifetime-mismatch.md)
+* [Limits of Lifetimes](lifetime-mismatch.md)
 * [Lifetime Elision](lifetime-elision.md)
 * [Unbounded Lifetimes](unbounded-lifetimes.md)
 * [Higher-Rank Trait Bounds](hrtb.md)
@@ -17,7 +17,7 @@ face.
 The C11 memory model is fundamentally about trying to bridge the gap between the
 semantics we want, the optimizations compilers want, and the inconsistent chaos
 our hardware wants. *We* would like to just write programs and have them do
-exactly what we said but, you know, *fast*. Wouldn't that be great?
+exactly what we said but, you know, fast. Wouldn't that be great?
 
 
 
@@ -35,20 +35,20 @@ y = 3;
 x = 2;
 ```
 
-The compiler may conclude that it would *really* be best if your program did
+The compiler may conclude that it would be best if your program did
 
 ```rust,ignore
 x = 2;
 y = 3;
 ```
 
-This has inverted the order of events *and* completely eliminated one event.
+This has inverted the order of events and completely eliminated one event.
 From a single-threaded perspective this is completely unobservable: after all
 the statements have executed we are in exactly the same state. But if our
-program is multi-threaded, we may have been relying on `x` to *actually* be
-assigned to 1 before `y` was assigned. We would *really* like the compiler to be
+program is multi-threaded, we may have been relying on `x` to actually be
+assigned to 1 before `y` was assigned. We would like the compiler to be
 able to make these kinds of optimizations, because they can seriously improve
-performance. On the other hand, we'd really like to be able to depend on our
+performance. On the other hand, we'd also like to be able to depend on our
 program *doing the thing we said*.
 
 
@@ -57,15 +57,15 @@ program *doing the thing we said*.
 # Hardware Reordering
 
 On the other hand, even if the compiler totally understood what we wanted and
-respected our wishes, our *hardware* might instead get us in trouble. Trouble
+respected our wishes, our hardware might instead get us in trouble. Trouble
 comes from CPUs in the form of memory hierarchies. There is indeed a global
 shared memory space somewhere in your hardware, but from the perspective of each
 CPU core it is *so very far away* and *so very slow*. Each CPU would rather work
-with its local cache of the data and only go through all the *anguish* of
-talking to shared memory *only* when it doesn't actually have that memory in
+with its local cache of the data and only go through all the anguish of
+talking to shared memory only when it doesn't actually have that memory in
 cache.
 
-After all, that's the whole *point* of the cache, right? If every read from the
+After all, that's the whole point of the cache, right? If every read from the
 cache had to run back to shared memory to double check that it hadn't changed,
 what would the point be? The end result is that the hardware doesn't guarantee
 that events that occur in the same order on *one* thread, occur in the same
@@ -99,13 +99,13 @@ provides weak ordering guarantees. This has two consequences for concurrent
 programming:
 
 * Asking for stronger guarantees on strongly-ordered hardware may be cheap or
-even *free* because they already provide strong guarantees unconditionally.
+even free because they already provide strong guarantees unconditionally.
 Weaker guarantees may only yield performance wins on weakly-ordered hardware.
 
-* Asking for guarantees that are *too* weak on strongly-ordered hardware is
+* Asking for guarantees that are too weak on strongly-ordered hardware is
 more likely to *happen* to work, even though your program is strictly
-incorrect. If possible, concurrent algorithms should be tested on weakly-
-ordered hardware.
+incorrect. If possible, concurrent algorithms should be tested on
+weakly-ordered hardware.
 
 
 
@@ -115,10 +115,10 @@ programming:
 
 The C11 memory model attempts to bridge the gap by allowing us to talk about the
 *causality* of our program. Generally, this is by establishing a *happens
-before* relationships between parts of the program and the threads that are
+before* relationship between parts of the program and the threads that are
 running them. This gives the hardware and compiler room to optimize the program
 more aggressively where a strict happens-before relationship isn't established,
-but forces them to be more careful where one *is* established. The way we
+but forces them to be more careful where one is established. The way we
 communicate these relationships are through *data accesses* and *atomic
 accesses*.
 
@@ -130,8 +130,10 @@ propagate the changes made in data accesses to other threads as lazily and
 inconsistently as it wants. Mostly critically, data accesses are how data races
 happen. Data accesses are very friendly to the hardware and compiler, but as
 we've seen they offer *awful* semantics to try to write synchronized code with.
-Actually, that's too weak. *It is literally impossible to write correct
-synchronized code using only data accesses*.
+Actually, that's too weak.
+
+**It is literally impossible to write correct synchronized code using only data
+accesses.**
 
 Atomic accesses are how we tell the hardware and compiler that our program is
 multi-threaded. Each atomic access can be marked with an *ordering* that
@@ -141,7 +143,10 @@ they *can't* do. For the compiler, this largely revolves around re-ordering of
 instructions. For the hardware, this largely revolves around how writes are
 propagated to other threads. The set of orderings Rust exposes are:
 
-* Sequentially Consistent (SeqCst) Release Acquire Relaxed
+* Sequentially Consistent (SeqCst)
+* Release
+* Acquire
+* Relaxed
 
 (Note: We explicitly do not expose the C11 *consume* ordering)
 
@@ -154,13 +159,13 @@ synchronize"
 
 Sequentially Consistent is the most powerful of all, implying the restrictions
 of all other orderings. Intuitively, a sequentially consistent operation
-*cannot* be reordered: all accesses on one thread that happen before and after a
-SeqCst access *stay* before and after it. A data-race-free program that uses
+cannot be reordered: all accesses on one thread that happen before and after a
+SeqCst access stay before and after it. A data-race-free program that uses
 only sequentially consistent atomics and data accesses has the very nice
 property that there is a single global execution of the program's instructions
 that all threads agree on. This execution is also particularly nice to reason
 about: it's just an interleaving of each thread's individual executions. This
-*does not* hold if you start using the weaker atomic orderings.
+does not hold if you start using the weaker atomic orderings.
 
 The relative developer-friendliness of sequential consistency doesn't come for
 free. Even on strongly-ordered platforms sequential consistency involves
@@ -170,8 +175,8 @@ In practice, sequential consistency is rarely necessary for program correctness.
 However sequential consistency is definitely the right choice if you're not
 confident about the other memory orders. Having your program run a bit slower
 than it needs to is certainly better than it running incorrectly! It's also
-*mechanically* trivial to downgrade atomic operations to have a weaker
-consistency later on. Just change `SeqCst` to e.g. `Relaxed` and you're done! Of
+mechanically trivial to downgrade atomic operations to have a weaker
+consistency later on. Just change `SeqCst` to `Relaxed` and you're done! Of
 course, proving that this transformation is *correct* is a whole other matter.
 
 
@@ -183,15 +188,15 @@ Acquire and Release are largely intended to be paired. Their names hint at their
 use case: they're perfectly suited for acquiring and releasing locks, and
 ensuring that critical sections don't overlap.
 
-Intuitively, an acquire access ensures that every access after it *stays* after
+Intuitively, an acquire access ensures that every access after it stays after
 it. However operations that occur before an acquire are free to be reordered to
 occur after it. Similarly, a release access ensures that every access before it
-*stays* before it. However operations that occur after a release are free to be
+stays before it. However operations that occur after a release are free to be
 reordered to occur before it.
 
 When thread A releases a location in memory and then thread B subsequently
 acquires *the same* location in memory, causality is established. Every write
-that happened *before* A's release will be observed by B *after* its release.
+that happened before A's release will be observed by B after its release.
 However no causality is established with any other threads. Similarly, no
 causality is established if A and B access *different* locations in memory.
 
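For readers skimming the diff, here is a minimal sketch of the release/acquire pairing the paragraph describes, using std's `AtomicBool` and `AtomicUsize` (the `ready`/`data` names and the spin loop are just illustrative, not part of the chapter):

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

fn main() {
    let data = Arc::new(AtomicUsize::new(0));
    let ready = Arc::new(AtomicBool::new(false));

    let (d, r) = (data.clone(), ready.clone());
    let producer = thread::spawn(move || {
        d.store(42, Ordering::Relaxed);   // the write we want to publish
        r.store(true, Ordering::Release); // everything before this is published
    });

    let consumer = thread::spawn(move || {
        // Spin until we acquire the flag; the acquire pairs with the release
        // above, so the store of 42 is guaranteed to be visible afterwards.
        while !ready.load(Ordering::Acquire) {}
        assert_eq!(data.load(Ordering::Relaxed), 42);
    });

    producer.join().unwrap();
    consumer.join().unwrap();
}
```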
@@ -230,7 +235,7 @@ weakly-ordered platforms.
 # Relaxed
 
 Relaxed accesses are the absolute weakest. They can be freely re-ordered and
-provide no happens-before relationship. Still, relaxed operations *are* still
+provide no happens-before relationship. Still, relaxed operations are still
 atomic. That is, they don't count as data accesses and any read-modify-write
 operations done to them occur atomically. Relaxed operations are appropriate for
 things that you definitely want to happen, but don't particularly otherwise care
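A minimal sketch of the kind of use Relaxed is appropriate for, as described above: a shared event counter where atomicity matters but ordering does not (the counter name and thread counts are made up):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

fn main() {
    // Relaxed keeps the counter atomic (no data race, no torn updates)
    // without imposing any ordering on surrounding memory accesses.
    let hits = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..4)
        .map(|_| {
            let hits = hits.clone();
            thread::spawn(move || {
                for _ in 0..1000 {
                    hits.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }
    // join() synchronizes, so every increment is visible here.
    assert_eq!(hits.load(Ordering::Relaxed), 4000);
}
```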
@@ -2,7 +2,7 @@
 
 The mutual exclusion property of mutable references can be very limiting when
 working with a composite structure. The borrow checker understands some basic
-stuff, but will fall over pretty easily. It *does* understand structs
+stuff, but will fall over pretty easily. It does understand structs
 sufficiently to know that it's possible to borrow disjoint fields of a struct
 simultaneously. So this works today:
 
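The "this works today" example the hunk leads into is along these lines; a sketch with an assumed two-field struct, showing simultaneous mutable borrows of disjoint fields:

```rust
struct Foo {
    a: i32,
    b: i32,
}

fn main() {
    let mut foo = Foo { a: 0, b: 0 };
    // Two live mutable borrows are fine because the borrow checker can see
    // they refer to disjoint fields of the same struct.
    let a = &mut foo.a;
    let b = &mut foo.b;
    *a += 1;
    *b += 1;
    assert_eq!((foo.a, foo.b), (1, 1));
}
```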
@@ -50,7 +50,7 @@ to the same value.
 
 In order to "teach" borrowck that what we're doing is ok, we need to drop down
 to unsafe code. For instance, mutable slices expose a `split_at_mut` function
-that consumes the slice and returns *two* mutable slices. One for everything to
+that consumes the slice and returns two mutable slices. One for everything to
 the left of the index, and one for everything to the right. Intuitively we know
 this is safe because the slices don't overlap, and therefore alias. However
 the implementation requires some unsafety:
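A sketch of the kind of implementation the section goes on to show (not the actual standard library code): the unsafety is confined to conjuring two slices that we know cannot overlap, via `std::slice::from_raw_parts_mut`:

```rust
use std::slice::from_raw_parts_mut;

fn split_at_mut<T>(slice: &mut [T], mid: usize) -> (&mut [T], &mut [T]) {
    let len = slice.len();
    assert!(mid <= len);
    let ptr = slice.as_mut_ptr();
    unsafe {
        (
            // [0, mid) and [mid, len) are disjoint, so handing out two
            // mutable slices over them never aliases.
            from_raw_parts_mut(ptr, mid),
            from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}

fn main() {
    let mut data = [1, 2, 3, 4, 5];
    let (left, right) = split_at_mut(&mut data, 2);
    left[0] = 10;
    right[0] = 30;
    assert_eq!(data, [10, 2, 30, 4, 5]);
}
```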
@@ -93,10 +93,10 @@ completely incompatible with this API, as it would produce multiple mutable
 references to the same object!
 
 However it actually *does* work, exactly because iterators are one-shot objects.
-Everything an IterMut yields will be yielded *at most* once, so we don't
-*actually* ever yield multiple mutable references to the same piece of data.
+Everything an IterMut yields will be yielded at most once, so we don't
+actually ever yield multiple mutable references to the same piece of data.
 
-Perhaps surprisingly, mutable iterators *don't* require unsafe code to be
+Perhaps surprisingly, mutable iterators don't require unsafe code to be
 implemented for many types!
 
 For instance here's a singly linked list:
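The singly linked list the text refers to can be sketched like this (type and field names are assumptions, not the chapter's exact code); `IterMut` needs no unsafe code because each node's element is handed out at most once:

```rust
struct LinkedList<T> {
    head: Option<Box<Node<T>>>,
}

struct Node<T> {
    elem: T,
    next: Option<Box<Node<T>>>,
}

struct IterMut<'a, T>(Option<&'a mut Node<T>>);

impl<T> LinkedList<T> {
    fn iter_mut(&mut self) -> IterMut<'_, T> {
        IterMut(self.head.as_deref_mut())
    }
}

impl<'a, T> Iterator for IterMut<'a, T> {
    type Item = &'a mut T;
    fn next(&mut self) -> Option<&'a mut T> {
        // take() moves the borrow of the current node out of self, so we can
        // split it into a borrow of `next` (kept) and `elem` (yielded) without
        // ever producing two references to the same element.
        self.0.take().map(|node| {
            self.0 = node.next.as_deref_mut();
            &mut node.elem
        })
    }
}

fn main() {
    let mut list = LinkedList {
        head: Some(Box::new(Node {
            elem: 1,
            next: Some(Box::new(Node { elem: 2, next: None })),
        })),
    };
    for elem in list.iter_mut() {
        *elem += 10;
    }
    assert_eq!(list.head.as_ref().unwrap().elem, 11);
}
```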
@@ -1,13 +1,13 @@
 % Casts
 
 Casts are a superset of coercions: every coercion can be explicitly
-invoked via a cast. However some conversions *require* a cast.
+invoked via a cast. However some conversions require a cast.
 While coercions are pervasive and largely harmless, these "true casts"
 are rare and potentially dangerous. As such, casts must be explicitly invoked
 using the `as` keyword: `expr as Type`.
 
 True casts generally revolve around raw pointers and the primitive numeric
-types. Even though they're dangerous, these casts are *infallible* at runtime.
+types. Even though they're dangerous, these casts are infallible at runtime.
 If a cast triggers some subtle corner case no indication will be given that
 this occurred. The cast will simply succeed. That said, casts must be valid
 at the type level, or else they will be prevented statically. For instance,
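A few concrete instances of "infallible but possibly surprising" casts, as a sketch (the particular values are only illustrations):

```rust
fn main() {
    // Numeric casts with `as` never fail at runtime, but "success" can mean
    // truncation or reinterpretation rather than the value you expected.
    let x: i32 = 300;
    assert_eq!(x as u8, 44);       // 300 doesn't fit in a u8: truncated mod 256
    assert_eq!(-1i8 as u8, 255);   // sign reinterpretation, no error
    assert_eq!(u8::MAX as i8, -1); // same bits, different interpretation

    // Pointer casts are also `as` casts, and must still be valid at the type
    // level even though nothing is checked at runtime.
    let y: u32 = 7;
    let p = &y as *const u32 as *const u8;
    let _ = p;
}
```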
@@ -80,7 +80,7 @@ loop {
 // because it relies on actual values.
 if true {
 // But it does understand that it will only be taken once because
-// we *do* unconditionally break out of it. Therefore `x` doesn't
+// we unconditionally break out of it. Therefore `x` doesn't
 // need to be marked as mutable.
 x = 0;
 break;
@@ -2,12 +2,12 @@
 
 Rust as a language doesn't *really* have an opinion on how to do concurrency or
 parallelism. The standard library exposes OS threads and blocking sys-calls
-because *everyone* has those, and they're uniform enough that you can provide
+because everyone has those, and they're uniform enough that you can provide
 an abstraction over them in a relatively uncontroversial way. Message passing,
 green threads, and async APIs are all diverse enough that any abstraction over
 them tends to involve trade-offs that we weren't willing to commit to for 1.0.
 
 However the way Rust models concurrency makes it relatively easy design your own
-concurrency paradigm as a library and have *everyone else's* code Just Work
+concurrency paradigm as a library and have everyone else's code Just Work
 with yours. Just require the right lifetimes and Send and Sync where appropriate
 and you're off to the races. Or rather, off to the... not... having... races.
@@ -37,14 +37,14 @@ blindly memcopied to somewhere else in memory. This means pure on-the-stack-but-
 still-movable intrusive linked lists are simply not happening in Rust (safely).
 
 Assignment and copy constructors similarly don't exist because move semantics
-are the *only* semantics in Rust. At most `x = y` just moves the bits of y into
-the x variable. Rust *does* provide two facilities for providing C++'s copy-
+are the only semantics in Rust. At most `x = y` just moves the bits of y into
+the x variable. Rust does provide two facilities for providing C++'s copy-
 oriented semantics: `Copy` and `Clone`. Clone is our moral equivalent of a copy
 constructor, but it's never implicitly invoked. You have to explicitly call
 `clone` on an element you want to be cloned. Copy is a special case of Clone
 where the implementation is just "copy the bits". Copy types *are* implicitly
 cloned whenever they're moved, but because of the definition of Copy this just
-means *not* treating the old copy as uninitialized -- a no-op.
+means not treating the old copy as uninitialized -- a no-op.
 
 While Rust provides a `Default` trait for specifying the moral equivalent of a
 default constructor, it's incredibly rare for this trait to be used. This is
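A small sketch of the `Clone`/`Copy` distinction described above, with made-up `Settings` and `Point` types: `clone` must be called explicitly, while `Copy` types are duplicated implicitly when moved:

```rust
#[derive(Clone)]
struct Settings {
    name: String,
}

#[derive(Clone, Copy)]
struct Point {
    x: i32,
    y: i32,
}

fn main() {
    let a = Settings { name: "default".to_string() };
    let b = a.clone(); // Clone is never implicit: you have to ask for the copy
    println!("{} {}", a.name, b.name); // `a` is still usable

    let p = Point { x: 1, y: 2 };
    let q = p; // Copy: the bits are duplicated, the old value stays usable
    println!("{} {}", p.x, q.y);
}
```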
@@ -8,7 +8,7 @@ a different type. Because Rust encourages encoding important properties in the
 type system, these problems are incredibly pervasive. As such, Rust
 consequently gives you several ways to solve them.
 
-First we'll look at the ways that *Safe Rust* gives you to reinterpret values.
+First we'll look at the ways that Safe Rust gives you to reinterpret values.
 The most trivial way to do this is to just destructure a value into its
 constituent parts and then build a new type out of them. e.g.
 
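The "e.g." the hunk is leading into is a destructure-and-rebuild conversion along these lines (the `Foo`/`Bar` definitions here are assumptions for illustration):

```rust
struct Foo {
    x: u32,
    y: u16,
}

struct Bar {
    a: u32,
    b: u16,
}

fn reinterpret(foo: Foo) -> Bar {
    // Pull the value apart and reassemble it as the other type; entirely safe.
    let Foo { x, y } = foo;
    Bar { a: x, b: y }
}

fn main() {
    let bar = reinterpret(Foo { x: 1, y: 2 });
    assert_eq!((bar.a, bar.b), (1, 2));
}
```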
@@ -1,5 +1,5 @@
 % Data Representation in Rust
 
-Low-level programming cares a lot about data layout. It's a big deal. It also pervasively
-influences the rest of the language, so we're going to start by digging into how data is
-represented in Rust.
+Low-level programming cares a lot about data layout. It's a big deal. It also
+pervasively influences the rest of the language, so we're going to start by
+digging into how data is represented in Rust.
@@ -7,16 +7,19 @@ What the language *does* provide is full-blown automatic destructors through the
 fn drop(&mut self);
 ```
 
-This method gives the type time to somehow finish what it was doing. **After
-`drop` is run, Rust will recursively try to drop all of the fields of `self`**.
+This method gives the type time to somehow finish what it was doing.
+
+**After `drop` is run, Rust will recursively try to drop all of the fields
+of `self`.**
 
 This is a convenience feature so that you don't have to write "destructor
 boilerplate" to drop children. If a struct has no special logic for being
 dropped other than dropping its children, then it means `Drop` doesn't need to
 be implemented at all!
 
-**There is no stable way to prevent this behaviour in Rust 1.0.
+**There is no stable way to prevent this behaviour in Rust 1.0.**
 
-Note that taking `&mut self` means that even if you *could* suppress recursive
+Note that taking `&mut self` means that even if you could suppress recursive
 Drop, Rust will prevent you from e.g. moving fields out of self. For most types,
 this is totally fine.
 
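A minimal sketch of the recursive drop behaviour being described, with made-up `Parent` and `Child` types that only print from their destructors:

```rust
struct Child(&'static str);

impl Drop for Child {
    fn drop(&mut self) {
        println!("dropping child {}", self.0);
    }
}

struct Parent {
    a: Child,
    b: Child,
}

impl Drop for Parent {
    fn drop(&mut self) {
        // We only get to run extra logic here; we can't stop `a` and `b` from
        // being dropped right after this returns.
        println!("dropping parent");
    }
}

fn main() {
    let _p = Parent { a: Child("a"), b: Child("b") };
    // Prints "dropping parent", then "dropping child a", then "dropping child b".
}
```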
@@ -90,7 +93,7 @@ After we deallocate the `box`'s ptr in SuperBox's destructor, Rust will
 happily proceed to tell the box to Drop itself and everything will blow up with
 use-after-frees and double-frees.
 
-Note that the recursive drop behaviour applies to *all* structs and enums
+Note that the recursive drop behaviour applies to all structs and enums
 regardless of whether they implement Drop. Therefore something like
 
 ```rust
@@ -114,7 +117,7 @@ enum Link {
 }
 ```
 
-will have its inner Box field dropped *if and only if* an instance stores the
+will have its inner Box field dropped if and only if an instance stores the
 Next variant.
 
 In general this works really nice because you don't need to worry about
@@ -165,7 +168,7 @@ impl<T> Drop for SuperBox<T> {
 ```
 
 However this has fairly odd semantics: you're saying that a field that *should*
-always be Some may be None, just because that happens in the destructor. Of
+always be Some *may* be None, just because that happens in the destructor. Of
 course this conversely makes a lot of sense: you can call arbitrary methods on
 self during the destructor, and this should prevent you from ever doing so after
 deinitializing the field. Not that it will prevent you from producing any other
@@ -10,7 +10,7 @@ How can it do this with conditional initialization?
 
 Note that this is not a problem that all assignments need worry about. In
 particular, assigning through a dereference unconditionally drops, and assigning
-in a `let` unconditionally *doesn't* drop:
+in a `let` unconditionally doesn't drop:
 
 ```
 let mut x = Box::new(0); // let makes a fresh variable, so never need to drop
@@ -23,11 +23,11 @@ one of its subfields.
 
 It turns out that Rust actually tracks whether a type should be dropped or not
 *at runtime*. As a variable becomes initialized and uninitialized, a *drop flag*
-for that variable is toggled. When a variable *might* need to be dropped, this
-flag is evaluated to determine if it *should* be dropped.
+for that variable is toggled. When a variable might need to be dropped, this
+flag is evaluated to determine if it should be dropped.
 
-Of course, it is *often* the case that a value's initialization state can be
-*statically* known at every point in the program. If this is the case, then the
+Of course, it is often the case that a value's initialization state can be
+statically known at every point in the program. If this is the case, then the
 compiler can theoretically generate more efficient code! For instance, straight-
 line code has such *static drop semantics*:
 
@@ -40,8 +40,8 @@ y = x; // y was init; Drop y, overwrite it, and make x uninit!
 // x goes out of scope; x was uninit; do nothing.
 ```
 
-And even branched code where all branches have the same behaviour with respect
-to initialization:
+Similarly, branched code where all branches have the same behaviour with respect
+to initialization has static drop semantics:
 
 ```rust
 # let condition = true;
@@ -65,7 +65,7 @@ if condition {
 x = Box::new(0); // x was uninit; just overwrite.
 println!("{}", x);
 }
-// x goes out of scope; x *might* be uninit;
+// x goes out of scope; x might be uninit;
 // check the flag!
 ```
 
@@ -81,7 +81,7 @@ if condition {
 
 As of Rust 1.0, the drop flags are actually not-so-secretly stashed in a hidden
 field of any type that implements Drop. Rust sets the drop flag by overwriting
-the *entire* value with a particular bit pattern. This is pretty obviously Not
+the entire value with a particular bit pattern. This is pretty obviously Not
 The Fastest and causes a bunch of trouble with optimizing code. It's legacy from
 a time when you could do much more complex conditional initialization.
 
@@ -92,4 +92,4 @@ as it requires fairly substantial changes to the compiler.
 Regardless, Rust programs don't need to worry about uninitialized values on
 the stack for correctness. Although they might care for performance. Thankfully,
 Rust makes it easy to take control here! Uninitialized values are there, and
-you can work with them in Safe Rust, but you're *never* in danger.
+you can work with them in Safe Rust, but you're never in danger.
@@ -30,7 +30,7 @@ let (x, y) = (vec![], vec![]);
 ```
 
 Does either value strictly outlive the other? The answer is in fact *no*,
 neither value strictly outlives the other. Of course, one of x or y will be
 dropped before the other, but the actual order is not specified. Tuples aren't
 special in this regard; composite structures just don't guarantee their
 destruction order as of Rust 1.0.
@@ -100,11 +100,11 @@ fn main() {
 <anon>:15 }
 ```
 
-Implementing Drop lets the Inspector execute some arbitrary code *during* its
+Implementing Drop lets the Inspector execute some arbitrary code during its
 death. This means it can potentially observe that types that are supposed to
 live as long as it does actually were destroyed first.
 
-Interestingly, only *generic* types need to worry about this. If they aren't
+Interestingly, only generic types need to worry about this. If they aren't
 generic, then the only lifetimes they can harbor are `'static`, which will truly
 live *forever*. This is why this problem is referred to as *sound generic drop*.
 Sound generic drop is enforced by the *drop checker*. As of this writing, some
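A sketch of the kind of generic type the drop checker worries about: an `Inspector` that holds a borrow and touches it from `drop` (modeled loosely on the chapter's example; the exact code is assumed):

```rust
struct Inspector<'a>(&'a u8);

impl<'a> Drop for Inspector<'a> {
    fn drop(&mut self) {
        // If the u8 this borrows were destroyed first, this read would be a
        // use-after-free; the drop checker exists to rule that out.
        println!("I was only {} days from retirement!", self.0);
    }
}

fn main() {
    let days: u8 = 8;
    let _inspector = Inspector(&days);
    // `days` strictly outlives `_inspector`, so this compiles and is sound.
}
```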
@@ -116,12 +116,12 @@ section:
 strictly outlive it.**
 
 This rule is sufficient but not necessary to satisfy the drop checker. That is,
-if your type obeys this rule then it's *definitely* sound to drop. However
+if your type obeys this rule then it's definitely sound to drop. However
 there are special cases where you can fail to satisfy this, but still
 successfully pass the borrow checker. These are the precise rules that are
 currently up in the air.
 
 It turns out that when writing unsafe code, we generally don't need to
 worry at all about doing the right thing for the drop checker. However there
-is *one* special case that you need to worry about, which we will look at in
+is one special case that you need to worry about, which we will look at in
 the next section.
@@ -1,8 +1,8 @@
 % Exception Safety
 
-Although programs should use unwinding sparingly, there's *a lot* of code that
+Although programs should use unwinding sparingly, there's a lot of code that
 *can* panic. If you unwrap a None, index out of bounds, or divide by 0, your
-program *will* panic. On debug builds, *every* arithmetic operation can panic
+program will panic. On debug builds, every arithmetic operation can panic
 if it overflows. Unless you are very careful and tightly control what code runs,
 pretty much everything can unwind, and you need to be ready for it.
 
@@ -22,7 +22,7 @@ unsound states must be careful that a panic does not cause that state to be
 used. Generally this means ensuring that only non-panicking code is run while
 these states exist, or making a guard that cleans up the state in the case of
 a panic. This does not necessarily mean that the state a panic witnesses is a
-fully *coherent* state. We need only guarantee that it's a *safe* state.
+fully coherent state. We need only guarantee that it's a *safe* state.
 
 Most Unsafe code is leaf-like, and therefore fairly easy to make exception-safe.
 It controls all the code that runs, and most of that code can't panic. However
@@ -58,17 +58,16 @@ impl<T: Clone> Vec<T> {
 We bypass `push` in order to avoid redundant capacity and `len` checks on the
 Vec that we definitely know has capacity. The logic is totally correct, except
 there's a subtle problem with our code: it's not exception-safe! `set_len`,
-`offset`, and `write` are all fine, but *clone* is the panic bomb we over-
-looked.
+`offset`, and `write` are all fine; `clone` is the panic bomb we over-looked.
 
 Clone is completely out of our control, and is totally free to panic. If it
 does, our function will exit early with the length of the Vec set too large. If
 the Vec is looked at or dropped, uninitialized memory will be read!
 
 The fix in this case is fairly simple. If we want to guarantee that the values
-we *did* clone are dropped we can set the len *in* the loop. If we just want to
-guarantee that uninitialized memory can't be observed, we can set the len
-*after* the loop.
+we *did* clone are dropped, we can set the `len` every loop iteration. If we
+just want to guarantee that uninitialized memory can't be observed, we can set
+the `len` after the loop.
 
 
 
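A sketch of the "set the `len` every loop iteration" fix, written against today's public `Vec` API rather than the chapter's hand-rolled Vec (the function name is made up):

```rust
use std::ptr;

fn extend_from_slice_exception_safe<T: Clone>(vec: &mut Vec<T>, slice: &[T]) {
    vec.reserve(slice.len());
    unsafe {
        for item in slice {
            let len = vec.len();
            // Write into the uninitialized spare capacity...
            ptr::write(vec.as_mut_ptr().add(len), item.clone());
            // ...and only then publish it. If the *next* clone panics, every
            // element the Vec claims to own is initialized, so dropping the
            // Vec during unwinding is still safe.
            vec.set_len(len + 1);
        }
    }
}

fn main() {
    let mut v = vec![1, 2, 3];
    extend_from_slice_exception_safe(&mut v, &[4, 5]);
    assert_eq!(v, [1, 2, 3, 4, 5]);
}
```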
@@ -89,7 +88,7 @@ bubble_up(heap, index):
 
 A literal transcription of this code to Rust is totally fine, but has an annoying
 performance characteristic: the `self` element is swapped over and over again
-uselessly. We would *rather* have the following:
+uselessly. We would rather have the following:
 
 ```text
 bubble_up(heap, index):
@@ -128,7 +127,7 @@ actually touched the state of the heap yet. Once we do start messing with the
 heap, we're working with only data and functions that we trust, so there's no
 concern of panics.
 
-Perhaps you're not happy with this design. Surely, it's cheating! And we have
+Perhaps you're not happy with this design. Surely it's cheating! And we have
 to do the complex heap traversal *twice*! Alright, let's bite the bullet. Let's
 intermix untrusted and unsafe code *for reals*.
 
@@ -48,7 +48,7 @@ a variable position based on its alignment][dst-issue].**
 
 # Zero Sized Types (ZSTs)
 
-Rust actually allows types to be specified that occupy *no* space:
+Rust actually allows types to be specified that occupy no space:
 
 ```rust
 struct Foo; // No fields = no size
@@ -124,7 +124,7 @@ let res: Result<u32, Void> = Ok(0);
 let Ok(num) = res;
 ```
 
-But neither of these tricks work today, so all Void types get you today is
+But neither of these tricks work today, so all Void types get you is
 the ability to be confident that certain situations are statically impossible.
 
 One final subtle detail about empty types is that raw pointers to them are
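What Void types buy you today can be sketched like this, assuming an empty `Void` enum; the empty `match v {}` is how the impossible case is discharged:

```rust
enum Void {} // an uninhabited type: no value of it can ever be constructed

fn unreachable_result(res: Result<u32, Void>) -> u32 {
    match res {
        Ok(num) => num,
        // We still have to write the arm today, but since no Void value can
        // exist, this branch is statically impossible to reach.
        Err(v) => match v {},
    }
}

fn main() {
    assert_eq!(unreachable_result(Ok(7)), 7);
}
```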
@@ -55,7 +55,7 @@ fn main() {
 How on earth are we supposed to express the lifetimes on `F`'s trait bound? We
 need to provide some lifetime there, but the lifetime we care about can't be
 named until we enter the body of `call`! Also, that isn't some fixed lifetime;
-call works with *any* lifetime `&self` happens to have at that point.
+`call` works with *any* lifetime `&self` happens to have at that point.
 
 This job requires The Magic of Higher-Rank Trait Bounds (HRTBs). The way we
 desugar this is as follows:
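For reference, the desugaring being introduced uses the `for<'a>` binder. A sketch along the lines of the chapter's `Closure` example (details assumed):

```rust
struct Closure<F> {
    data: (u8, u16),
    func: F,
}

impl<F> Closure<F>
    // "for all choices of 'a": the bound holds for whatever lifetime
    // `&self` happens to have when `call` is invoked.
    where for<'a> F: Fn(&'a (u8, u16)) -> &'a u8,
{
    fn call(&self) -> &u8 {
        (self.func)(&self.data)
    }
}

fn do_it(data: &(u8, u16)) -> &u8 { &data.0 }

fn main() {
    let clo = Closure { data: (0, 1), func: do_it };
    println!("{}", clo.call());
}
```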
@@ -21,21 +21,21 @@ uselessly, holding on to its precious resources until the program terminates (at
 which point all those resources would have been reclaimed by the OS anyway).
 
 We may consider a more restricted form of leak: failing to drop a value that is
-unreachable. Rust also doesn't prevent this. In fact Rust has a *function for
+unreachable. Rust also doesn't prevent this. In fact Rust *has a function for
 doing this*: `mem::forget`. This function consumes the value it is passed *and
 then doesn't run its destructor*.
 
 In the past `mem::forget` was marked as unsafe as a sort of lint against using
 it, since failing to call a destructor is generally not a well-behaved thing to
 do (though useful for some special unsafe code). However this was generally
-determined to be an untenable stance to take: there are *many* ways to fail to
+determined to be an untenable stance to take: there are many ways to fail to
 call a destructor in safe code. The most famous example is creating a cycle of
 reference-counted pointers using interior mutability.
 
 It is reasonable for safe code to assume that destructor leaks do not happen, as
 any program that leaks destructors is probably wrong. However *unsafe* code
-cannot rely on destructors to be run to be *safe*. For most types this doesn't
-matter: if you leak the destructor then the type is *by definition*
+cannot rely on destructors to be run in order to be safe. For most types this
+doesn't matter: if you leak the destructor then the type is by definition
 inaccessible, so it doesn't matter, right? For instance, if you leak a `Box<u8>`
 then you waste some memory but that's hardly going to violate memory-safety.
 
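Both leaks mentioned above can be reproduced in a few lines of safe code; a sketch, with a made-up `Node` type for the cycle:

```rust
use std::cell::RefCell;
use std::mem;
use std::rc::Rc;

struct Node {
    next: RefCell<Option<Rc<Node>>>,
}

fn main() {
    // The blunt instrument: consume a value without ever running its destructor.
    let boxed = Box::new(vec![1, 2, 3]);
    mem::forget(boxed); // safe code, and the heap allocation is never freed

    // The "famous example": a reference-counted cycle built with interior
    // mutability. Each Rc keeps the other alive, so neither destructor runs.
    let a = Rc::new(Node { next: RefCell::new(None) });
    let b = Rc::new(Node { next: RefCell::new(Some(a.clone())) });
    *a.next.borrow_mut() = Some(b);
    // `a` (and the node it now points at) leak when they go out of scope.
}
```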
@@ -64,7 +64,7 @@ uninitialized data! We could backshift all the elements in the Vec every time we
 remove a value, but this would have pretty catastrophic performance
 consequences.
 
-Instead, we would like Drain to *fix* the Vec's backing storage when it is
+Instead, we would like Drain to fix the Vec's backing storage when it is
 dropped. It should run itself to completion, backshift any elements that weren't
 removed (drain supports subranges), and then fix Vec's `len`. It's even
 unwinding-safe! Easy!
@@ -97,13 +97,13 @@ consistent state gives us Undefined Behaviour in safe code (making the API
 unsound).
 
 So what can we do? Well, we can pick a trivially consistent state: set the Vec's
-len to be 0 when we *start* the iteration, and fix it up if necessary in the
+len to be 0 when we start the iteration, and fix it up if necessary in the
 destructor. That way, if everything executes like normal we get the desired
 behaviour with minimal overhead. But if someone has the *audacity* to
 mem::forget us in the middle of the iteration, all that does is *leak even more*
-(and possibly leave the Vec in an *unexpected* but consistent state). Since
-we've accepted that mem::forget is safe, this is definitely safe. We call leaks
-causing more leaks a *leak amplification*.
+(and possibly leave the Vec in an unexpected but otherwise consistent state).
+Since we've accepted that mem::forget is safe, this is definitely safe. We call
+leaks causing more leaks a *leak amplification*.
 
 
 
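A sketch of this "leak amplification" design on a hypothetical drain-everything iterator (not std's actual `Drain`): `len` goes to 0 up front, and the destructor, if it ever runs, finishes the job:

```rust
use std::ptr;

struct DrainAll<'a, T> {
    vec: &'a mut Vec<T>,
    read: usize,
    old_len: usize,
}

impl<'a, T> DrainAll<'a, T> {
    fn new(vec: &'a mut Vec<T>) -> Self {
        let old_len = vec.len();
        unsafe { vec.set_len(0) }; // trivially consistent state, up front
        DrainAll { vec, read: 0, old_len }
    }
}

impl<'a, T> Iterator for DrainAll<'a, T> {
    type Item = T;
    fn next(&mut self) -> Option<T> {
        if self.read == self.old_len {
            None
        } else {
            let item = unsafe { ptr::read(self.vec.as_ptr().add(self.read)) };
            self.read += 1;
            Some(item)
        }
    }
}

impl<'a, T> Drop for DrainAll<'a, T> {
    fn drop(&mut self) {
        // Drop whatever wasn't yielded. If this destructor never runs
        // (mem::forget), the remaining elements just leak, which is safe.
        while let Some(_item) = self.next() {}
    }
}

fn main() {
    let mut v = vec![1, 2, 3, 4];
    let first_two: Vec<i32> = DrainAll::new(&mut v).take(2).collect();
    assert_eq!(first_two, [1, 2]);
    assert_eq!(v.len(), 0); // everything was yielded or dropped safely
}
```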
@@ -167,16 +167,16 @@ impl<T> Drop for Rc<T> {
 }
 ```
 
-This code contains an implicit and subtle assumption: ref_count can fit in a
+This code contains an implicit and subtle assumption: `ref_count` can fit in a
 `usize`, because there can't be more than `usize::MAX` Rcs in memory. However
-this itself assumes that the ref_count accurately reflects the number of Rcs
-in memory, which we know is false with mem::forget. Using mem::forget we can
-overflow the ref_count, and then get it down to 0 with outstanding Rcs. Then we
-can happily use-after-free the inner data. Bad Bad Not Good.
+this itself assumes that the `ref_count` accurately reflects the number of Rcs
+in memory, which we know is false with `mem::forget`. Using `mem::forget` we can
+overflow the `ref_count`, and then get it down to 0 with outstanding Rcs. Then
+we can happily use-after-free the inner data. Bad Bad Not Good.
 
-This can be solved by *saturating* the ref_count, which is sound because
-decreasing the refcount by `n` still requires `n` Rcs simultaneously living
-in memory.
+This can be solved by just checking the `ref_count` and doing *something*. The
+standard library's stance is to just abort, because your program has become
+horribly degenerate. Also *oh my gosh* it's such a ridiculous corner case.
 
 
 
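The "just abort" policy can be sketched as a plain overflow check on the count; this is illustrative, not the standard library's actual code:

```rust
use std::process;

fn increment_ref_count(ref_count: &mut usize) {
    if *ref_count == usize::MAX {
        // Letting the count wrap to 0 would later allow it to hit zero while
        // clones are still alive, enabling a use-after-free, so give up.
        process::abort();
    }
    *ref_count += 1;
}

fn main() {
    let mut count = 1usize;
    increment_ref_count(&mut count);
    assert_eq!(count, 2);
}
```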
@@ -237,7 +237,7 @@ In principle, this totally works! Rust's ownership system perfectly ensures it!
 let mut data = Box::new(0);
 {
 let guard = thread::scoped(|| {
-// This is at best a data race. At worst, it's *also* a use-after-free.
+// This is at best a data race. At worst, it's also a use-after-free.
 *data += 1;
 });
 // Because the guard is forgotten, expiring the loan without blocking this
@@ -18,7 +18,7 @@ fn main() {
 ```
 
 One might expect it to compile. We call `mutate_and_share`, which mutably borrows
-`foo` *temporarily*, but then returns *only* a shared reference. Therefore we
+`foo` temporarily, but then returns only a shared reference. Therefore we
 would expect `foo.share()` to succeed as `foo` shouldn't be mutably borrowed.
 
 However when we try to compile it:
@@ -69,7 +69,7 @@ due to the lifetime of `loan` and mutate_and_share's signature. Then when we
try to call `share`, and it sees we're trying to alias that `&'c mut foo` and
blows up in our face!

-This program is clearly correct according to the reference semantics we *actually*
+This program is clearly correct according to the reference semantics we actually
care about, but the lifetime system is too coarse-grained to handle that.

@@ -78,4 +78,4 @@ TODO: other common problems? SEME regions stuff, mostly?

[ex2]: lifetimes.html#example-2:-aliasing-a-mutable-reference

@@ -6,11 +6,11 @@ and anything that contains a reference, is tagged with a lifetime specifying
the scope it's valid for.

Within a function body, Rust generally doesn't let you explicitly name the
-lifetimes involved. This is because it's generally not really *necessary*
+lifetimes involved. This is because it's generally not really necessary
to talk about lifetimes in a local context; Rust has all the information and
can work out everything as optimally as possible. Many anonymous scopes and
temporaries that you would otherwise have to write are often introduced to
-make your code *just work*.
+make your code Just Work.

However once you cross the function boundary, you need to start talking about
lifetimes. Lifetimes are denoted with an apostrophe: `'a`, `'static`. To dip

@@ -42,7 +42,7 @@ likely desugar to the following:
'a: {
    let x: i32 = 0;
    'b: {
-        // lifetime used is 'b because that's *good enough*.
+        // lifetime used is 'b because that's good enough.
        let y: &'b i32 = &'b x;
        'c: {
            // ditto on 'c

@@ -107,8 +107,9 @@ fn as_str<'a>(data: &'a u32) -> &'a str {
This signature of `as_str` takes a reference to a u32 with *some* lifetime, and
promises that it can produce a reference to a str that can live *just as long*.
Already we can see why this signature might be trouble. That basically implies
-that we're going to *find* a str somewhere in the scope the scope the reference
-to the u32 originated in, or somewhere *even* earlier. That's a *bit* of a big ask.
+that we're going to find a str somewhere in the scope the reference
+to the u32 originated in, or somewhere *even earlier*. That's a bit of a big
+ask.

We then proceed to compute the string `s`, and return a reference to it. Since
the contract of our function says the reference must outlive `'a`, that's the

@@ -135,7 +136,7 @@ fn main() {
    'd: {
        // An anonymous scope is introduced because the borrow does not
        // need to last for the whole scope x is valid for. The return
-        // of as_str must find a str somewhere *before* this function
+        // of as_str must find a str somewhere before this function
        // call. Obviously not happening.
        println!("{}", as_str::<'d>(&'d x));
    }

@@ -195,21 +196,21 @@ println!("{}", x);

The problem here is a bit more subtle and interesting. We want Rust to
reject this program for the following reason: We have a live shared reference `x`
-to a descendent of `data` when try to take a *mutable* reference to `data`
-when we call `push`. This would create an aliased mutable reference, which would
+to a descendent of `data` when we try to take a mutable reference to `data`
+to `push`. This would create an aliased mutable reference, which would
violate the *second* rule of references.

However this is *not at all* how Rust reasons that this program is bad. Rust
doesn't understand that `x` is a reference to a subpath of `data`. It doesn't
understand Vec at all. What it *does* see is that `x` has to live for `'b` to
be printed. The signature of `Index::index` subsequently demands that the
-reference we take to *data* has to survive for `'b`. When we try to call `push`,
+reference we take to `data` has to survive for `'b`. When we try to call `push`,
it then sees us try to make an `&'c mut data`. Rust knows that `'c` is contained
within `'b`, and rejects our program because the `&'b data` must still be live!

-Here we see that the lifetime system is *much* more coarse than the reference
+Here we see that the lifetime system is much more coarse than the reference
semantics we're actually interested in preserving. For the most part, *that's
totally ok*, because it keeps us from spending all day explaining our program
-to the compiler. However it does mean that several programs that are *totally*
+to the compiler. However it does mean that several programs that are totally
correct with respect to Rust's *true* semantics are rejected because lifetimes
are too dumb.

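As a quick, hedged illustration of how one works around that coarseness in practice, a shared borrow can simply be ended before the mutation; this sketch is not from the book, and it assumes current Rust's borrow-checking behaviour:

```rust
fn main() {
    let mut data = vec![1, 2, 3];
    {
        let x = &data[0];
        println!("{}", x);
        // The shared borrow of `data` ends here (with non-lexical lifetimes
        // it already ends after the last use of `x`).
    }
    data.push(4); // fine: no shared reference into `data` is live any more
    println!("{:?}", data);
}
```
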
@@ -29,7 +29,7 @@ Rust, you will never have to worry about type-safety or memory-safety. You will
never endure a null or dangling pointer, or any of that Undefined Behaviour
nonsense.

-*That's totally awesome*.
+*That's totally awesome.*

The standard library also gives you enough utilities out-of-the-box that you'll
be able to write awesome high-performance applications and libraries in pure

@@ -41,7 +41,7 @@ low-level abstraction not exposed by the standard library. Maybe you're
need to do something the type-system doesn't understand and just *frob some dang
bits*. Maybe you need Unsafe Rust.

-Unsafe Rust is exactly like Safe Rust with *all* the same rules and semantics.
+Unsafe Rust is exactly like Safe Rust with all the same rules and semantics.
However Unsafe Rust lets you do some *extra* things that are Definitely Not Safe.

The only things that are different in Unsafe Rust are that you can:

@@ -12,7 +12,7 @@ language?

Regardless of your feelings on GC, it is pretty clearly a *massive* boon to
making code safe. You never have to worry about things going away *too soon*
-(although whether you still *wanted* to be pointing at that thing is a different
+(although whether you still wanted to be pointing at that thing is a different
issue...). This is a pervasive problem that C and C++ programs need to deal
with. Consider this simple mistake that all of us who have used a non-GC'd
language have made at one point:

@@ -14,11 +14,11 @@ struct Iter<'a, T: 'a> {

However because `'a` is unused within the struct's body, it's *unbounded*.
Because of the troubles this has historically caused, unbounded lifetimes and
-types are *illegal* in struct definitions. Therefore we must somehow refer
+types are *forbidden* in struct definitions. Therefore we must somehow refer
to these types in the body. Correctly doing this is necessary to have
correct variance and drop checking.

-We do this using *PhantomData*, which is a special marker type. PhantomData
+We do this using `PhantomData`, which is a special marker type. `PhantomData`
consumes no space, but simulates a field of the given type for the purpose of
static analysis. This was deemed to be less error-prone than explicitly telling
the type-system the kind of variance that you want, while also providing other

@@ -57,7 +57,7 @@ Good to go!
Nope.

The drop checker will generously determine that Vec<T> does not own any values
-of type T. This will in turn make it conclude that it does *not* need to worry
+of type T. This will in turn make it conclude that it doesn't need to worry
about Vec dropping any T's in its destructor for determining drop check
soundness. This will in turn allow people to create unsoundness using
Vec's destructor.

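For readers who want to see the shape of the fix being discussed, here is a rough sketch of the marker-field pattern; the struct and field names are illustrative, not the actual std source:

```rust
use std::marker::PhantomData;

// Illustrative sketch: a raw-pointer Vec-alike tells the drop checker that
// it conceptually owns values of type T by carrying a PhantomData<T> field,
// even though no T appears directly among its fields.
#[allow(dead_code)]
struct MyVec<T> {
    ptr: *const T,
    cap: usize,
    len: usize,
    _owns_t: PhantomData<T>,
}
```
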
@@ -20,7 +20,7 @@ standard library's Mutex type. A Mutex will poison itself if one of its
MutexGuards (the thing it returns when a lock is obtained) is dropped during a
panic. Any future attempts to lock the Mutex will return an `Err` or panic.

-Mutex poisons not for *true* safety in the sense that Rust normally cares about. It
+Mutex poisons not for true safety in the sense that Rust normally cares about. It
poisons as a safety-guard against blindly using the data that comes out of a Mutex
that has witnessed a panic while locked. The data in such a Mutex was likely in the
middle of being modified, and as such may be in an inconsistent or incomplete state.

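A small, self-contained illustration of the behaviour described above; this is ordinary stable `std::sync::Mutex` usage, not code from the book:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let lock = Arc::new(Mutex::new(0));
    let lock2 = Arc::clone(&lock);

    // A thread that panics while holding the guard poisons the Mutex.
    let _ = thread::spawn(move || {
        let _guard = lock2.lock().unwrap();
        panic!("oops");
    })
    .join();

    // Later lock attempts report the poisoning; the data is still reachable
    // if you decide it's acceptable to look at it anyway.
    match lock.lock() {
        Ok(guard) => println!("not poisoned: {}", *guard),
        Err(poisoned) => println!("poisoned, value was {}", *poisoned.into_inner()),
    }
}
```
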
@@ -12,11 +12,13 @@ it's impossible to alias a mutable reference, so it's impossible to perform a
data race. Interior mutability makes this more complicated, which is largely why
we have the Send and Sync traits (see below).

-However Rust *does not* prevent general race conditions. This is
-pretty fundamentally impossible, and probably honestly undesirable. Your hardware
-is racy, your OS is racy, the other programs on your computer are racy, and the
-world this all runs in is racy. Any system that could genuinely claim to prevent
-*all* race conditions would be pretty awful to use, if not just incorrect.
+**However Rust does not prevent general race conditions.**
+
+This is pretty fundamentally impossible, and probably honestly undesirable. Your
+hardware is racy, your OS is racy, the other programs on your computer are racy,
+and the world this all runs in is racy. Any system that could genuinely claim to
+prevent *all* race conditions would be pretty awful to use, if not just
+incorrect.

So it's perfectly "fine" for a Safe Rust program to get deadlocked or do
something incredibly stupid with incorrect synchronization. Obviously such a

@@ -46,7 +48,7 @@ thread::spawn(move || {
});

// Index with the value loaded from the atomic. This is safe because we
-// read the atomic memory only once, and then pass a *copy* of that value
+// read the atomic memory only once, and then pass a copy of that value
// to the Vec's indexing implementation. This indexing will be correctly
// bounds checked, and there's no chance of the value getting changed
// in the middle. However our program may panic if the thread we spawned

@@ -75,7 +77,7 @@ thread::spawn(move || {

if idx.load(Ordering::SeqCst) < data.len() {
    unsafe {
-        // Incorrectly loading the idx *after* we did the bounds check.
+        // Incorrectly loading the idx after we did the bounds check.
        // It could have changed. This is a race condition, *and dangerous*
        // because we decided to do `get_unchecked`, which is `unsafe`.
        println!("{}", data.get_unchecked(idx.load(Ordering::SeqCst)));

@@ -70,7 +70,7 @@ struct B {
Rust *does* guarantee that two instances of A have their data laid out in
exactly the same way. However Rust *does not* guarantee that an instance of A
has the same field ordering or padding as an instance of B (in practice there's
-no *particular* reason why they wouldn't, other than that its not currently
+no particular reason why they wouldn't, other than that its not currently
guaranteed).

With A and B as written, this is basically nonsensical, but several other

@@ -88,9 +88,9 @@ struct Foo<T, U> {
```

Now consider the monomorphizations of `Foo<u32, u16>` and `Foo<u16, u32>`. If
-Rust lays out the fields in the order specified, we expect it to *pad* the
-values in the struct to satisfy their *alignment* requirements. So if Rust
-didn't reorder fields, we would expect Rust to produce the following:
+Rust lays out the fields in the order specified, we expect it to pad the
+values in the struct to satisfy their alignment requirements. So if Rust
+didn't reorder fields, we would expect it to produce the following:

```rust,ignore
struct Foo<u16, u32> {

@@ -112,7 +112,7 @@ The latter case quite simply wastes space. An optimal use of space therefore
requires different monomorphizations to have *different field orderings*.

**Note: this is a hypothetical optimization that is not yet implemented in Rust
-**1.0
+1.0**

Enums make this consideration even more complicated. Naively, an enum such as:

@@ -128,8 +128,8 @@ would be laid out as:

```rust
struct FooRepr {
-    data: u64, // this is *really* either a u64, u32, or u8 based on `tag`
+    data: u64, // this is either a u64, u32, or u8 based on `tag`
    tag: u8, // 0 = A, 1 = B, 2 = C
}
```

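To make the padding and tag discussion concrete, here is a small probe one can run. The variant types of `Foo` are assumed from the `FooRepr` comment above, and the printed sizes depend on the compiler's layout decisions, so treat the numbers as illustrative:

```rust
use std::mem::size_of;

#[allow(dead_code)]
struct Padded {
    a: u8,
    b: u32,
    c: u16,
}

#[allow(dead_code)]
enum Foo {
    A(u32),
    B(u64),
    C(u8),
}

fn main() {
    // Alignment padding typically rounds this up well past the 7 bytes of
    // actual data.
    println!("Padded: {} bytes", size_of::<Padded>());
    // The enum needs room for its largest variant plus a tag, modulo any
    // layout optimizations the compiler applies.
    println!("Foo: {} bytes", size_of::<Foo>());
}
```
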
@@ -5,7 +5,7 @@ So what's the relationship between Safe and Unsafe Rust? How do they interact?
Rust models the separation between Safe and Unsafe Rust with the `unsafe`
keyword, which can be thought as a sort of *foreign function interface* (FFI)
between Safe and Unsafe Rust. This is the magic behind why we can say Safe Rust
-is a safe language: all the scary unsafe bits are relegated *exclusively* to FFI
+is a safe language: all the scary unsafe bits are relegated exclusively to FFI
*just like every other safe language*.

However because one language is a subset of the other, the two can be cleanly

@@ -61,13 +61,13 @@ The need for unsafe traits boils down to the fundamental property of safe code:
**No matter how completely awful Safe code is, it can't cause Undefined
Behaviour.**

-This means that Unsafe, **the royal vanguard of Undefined Behaviour**, has to be
-*super paranoid* about generic safe code. Unsafe is free to trust *specific* safe
-code (or else you would degenerate into infinite spirals of paranoid despair).
-It is generally regarded as ok to trust the standard library to be correct, as
-`std` is effectively an extension of the language (and you *really* just have
-to trust the language). If `std` fails to uphold the guarantees it declares,
-then it's basically a language bug.
+This means that Unsafe Rust, **the royal vanguard of Undefined Behaviour**, has to be
+*super paranoid* about generic safe code. To be clear, Unsafe Rust is totally free to trust
+specific safe code. Anything else would degenerate into infinite spirals of
+paranoid despair. In particular it's generally regarded as ok to trust the standard library
+to be correct. `std` is effectively an extension of the language, and you
+really just have to trust the language. If `std` fails to uphold the
+guarantees it declares, then it's basically a language bug.

That said, it would be best to minimize *needlessly* relying on properties of
concrete safe code. Bugs happen! Of course, I must reinforce that this is only

@@ -75,36 +75,36 @@ a concern for Unsafe code. Safe code can blindly trust anyone and everyone
as far as basic memory-safety is concerned.

On the other hand, safe traits are free to declare arbitrary contracts, but because
-implementing them is Safe, Unsafe can't trust those contracts to actually
+implementing them is safe, unsafe code can't trust those contracts to actually
be upheld. This is different from the concrete case because *anyone* can
randomly implement the interface. There is something fundamentally different
-about trusting a *particular* piece of code to be correct, and trusting *all the
+about trusting a particular piece of code to be correct, and trusting *all the
code that will ever be written* to be correct.

For instance Rust has `PartialOrd` and `Ord` traits to try to differentiate
between types which can "just" be compared, and those that actually implement a
-*total* ordering. Pretty much every API that wants to work with data that can be
-compared *really* wants Ord data. For instance, a sorted map like BTreeMap
+total ordering. Pretty much every API that wants to work with data that can be
+compared wants Ord data. For instance, a sorted map like BTreeMap
*doesn't even make sense* for partially ordered types. If you claim to implement
Ord for a type, but don't actually provide a proper total ordering, BTreeMap will
get *really confused* and start making a total mess of itself. Data that is
inserted may be impossible to find!

But that's okay. BTreeMap is safe, so it guarantees that even if you give it a
-*completely* garbage Ord implementation, it will still do something *safe*. You
-won't start reading uninitialized memory or unallocated memory. In fact, BTreeMap
+completely garbage Ord implementation, it will still do something *safe*. You
+won't start reading uninitialized or unallocated memory. In fact, BTreeMap
manages to not actually lose any of your data. When the map is dropped, all the
destructors will be successfully called! Hooray!

-However BTreeMap is implemented using a modest spoonful of Unsafe (most collections
-are). That means that it is not necessarily *trivially true* that a bad Ord
-implementation will make BTreeMap behave safely. Unsafe must be sure not to rely
-on Ord *where safety is at stake*. Ord is provided by Safe, and safety is not
-Safe's responsibility to uphold.
+However BTreeMap is implemented using a modest spoonful of Unsafe Rust (most collections
+are). That means that it's not necessarily *trivially true* that a bad Ord
+implementation will make BTreeMap behave safely. BTreeMap must be sure not to rely
+on Ord *where safety is at stake*. Ord is provided by safe code, and safety is not
+safe code's responsibility to uphold.

-But wouldn't it be grand if there was some way for Unsafe to trust *some* trait
+But wouldn't it be grand if there was some way for Unsafe to trust some trait
contracts *somewhere*? This is the problem that unsafe traits tackle: by marking
-*the trait itself* as unsafe *to implement*, Unsafe can trust the implementation
+*the trait itself* as unsafe to implement, unsafe code can trust the implementation
to uphold the trait's contract. Although the trait implementation may be
incorrect in arbitrary other ways.

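For readers who haven't seen the syntax, here is a minimal sketch of what marking a trait as unsafe to implement looks like; the trait and its contract are made up for illustration:

```rust
use std::cmp::Ordering;

// Declaring the trait `unsafe` shifts the burden onto implementors: unsafe
// code elsewhere is then allowed to rely on the contract actually holding.
unsafe trait TrustedTotalOrder {
    fn total_cmp_to(&self, other: &Self) -> Ordering;
}

// Writing the impl requires `unsafe impl`, which is the implementor's promise
// that the contract (a genuine total order) is upheld.
unsafe impl TrustedTotalOrder for u32 {
    fn total_cmp_to(&self, other: &Self) -> Ordering {
        self.cmp(other)
    }
}

fn main() {
    assert_eq!(1u32.total_cmp_to(&2), Ordering::Less);
    println!("ok");
}
```
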
@@ -126,7 +126,7 @@ But it's probably not the implementation you want.

Rust has traditionally avoided making traits unsafe because it makes Unsafe
pervasive, which is not desirable. Send and Sync are unsafe because thread
-safety is a *fundamental property* that Unsafe cannot possibly hope to defend
+safety is a *fundamental property* that unsafe code cannot possibly hope to defend
against in the same way it would defend against a bad Ord implementation. The
only way to possibly defend against thread-unsafety would be to *not use
threading at all*. Making every load and store atomic isn't even sufficient,

@@ -135,10 +135,10 @@ in memory. For instance, the pointer and capacity of a Vec must be in sync.

Even concurrent paradigms that are traditionally regarded as Totally Safe like
message passing implicitly rely on some notion of thread safety -- are you
-really message-passing if you pass a *pointer*? Send and Sync therefore require
-some *fundamental* level of trust that Safe code can't provide, so they must be
+really message-passing if you pass a pointer? Send and Sync therefore require
+some fundamental level of trust that Safe code can't provide, so they must be
unsafe to implement. To help obviate the pervasive unsafety that this would
-introduce, Send (resp. Sync) is *automatically* derived for all types composed only
+introduce, Send (resp. Sync) is automatically derived for all types composed only
of Send (resp. Sync) values. 99% of types are Send and Sync, and 99% of those
never actually say it (the remaining 1% is overwhelmingly synchronization
primitives).

@@ -8,20 +8,19 @@ captures this through the `Send` and `Sync` traits.
* A type is Send if it is safe to send it to another thread. A type is Sync if
  it is safe to share between threads (`&T` is Send).

-Send and Sync are *very* fundamental to Rust's concurrency story. As such, a
+Send and Sync are fundamental to Rust's concurrency story. As such, a
substantial amount of special tooling exists to make them work right. First and
-foremost, they're *unsafe traits*. This means that they are unsafe *to
-implement*, and other unsafe code can *trust* that they are correctly
+foremost, they're [unsafe traits][]. This means that they are unsafe to
+implement, and other unsafe code can trust that they are correctly
implemented. Since they're *marker traits* (they have no associated items like
methods), correctly implemented simply means that they have the intrinsic
properties an implementor should have. Incorrectly implementing Send or Sync can
cause Undefined Behaviour.

-Send and Sync are also what Rust calls *opt-in builtin traits*. This means that,
-unlike every other trait, they are *automatically* derived: if a type is
-composed entirely of Send or Sync types, then it is Send or Sync. Almost all
-primitives are Send and Sync, and as a consequence pretty much all types you'll
-ever interact with are Send and Sync.
+Send and Sync are also automatically derived traits. This means that, unlike
+every other trait, if a type is composed entirely of Send or Sync types, then it
+is Send or Sync. Almost all primitives are Send and Sync, and as a consequence
+pretty much all types you'll ever interact with are Send and Sync.

Major exceptions include:

@@ -37,13 +36,12 @@ sense, one could argue that it would be "fine" for them to be marked as thread
safe.

However it's important that they aren't thread safe to prevent types that
-*contain them* from being automatically marked as thread safe. These types have
+contain them from being automatically marked as thread safe. These types have
non-trivial untracked ownership, and it's unlikely that their author was
necessarily thinking hard about thread safety. In the case of Rc, we have a nice
-example of a type that contains a `*mut` that is *definitely* not thread safe.
+example of a type that contains a `*mut` that is definitely not thread safe.

-Types that aren't automatically derived can *opt-in* to Send and Sync by simply
-implementing them:
+Types that aren't automatically derived can simply implement them if desired:

```rust
struct MyBox(*mut u8);

@@ -52,12 +50,13 @@ unsafe impl Send for MyBox {}
unsafe impl Sync for MyBox {}
```

-In the *incredibly rare* case that a type is *inappropriately* automatically
-derived to be Send or Sync, then one can also *unimplement* Send and Sync:
+In the *incredibly rare* case that a type is inappropriately automatically
+derived to be Send or Sync, then one can also unimplement Send and Sync:

```rust
#![feature(optin_builtin_traits)]

+// I have some magic semantics for some synchronization primitive!
struct SpecialThreadToken(u8);

impl !Send for SpecialThreadToken {}

@@ -77,3 +76,5 @@ largely behave like an `&` or `&mut` into the collection.

TODO: better explain what can or can't be Send or Sync. Sufficient to appeal
only to data races?

+[unsafe traits]: safe-unsafe-meaning.html

@@ -1,14 +1,14 @@
% Subtyping and Variance

Although Rust doesn't have any notion of structural inheritance, it *does*
-include subtyping. In Rust, subtyping derives entirely from *lifetimes*. Since
+include subtyping. In Rust, subtyping derives entirely from lifetimes. Since
lifetimes are scopes, we can partially order them based on the *contains*
(outlives) relationship. We can even express this as a generic bound.

-Subtyping on lifetimes in terms of that relationship: if `'a: 'b` ("a contains
+Subtyping on lifetimes is in terms of that relationship: if `'a: 'b` ("a contains
b" or "a outlives b"), then `'a` is a subtype of `'b`. This is a large source of
confusion, because it seems intuitively backwards to many: the bigger scope is a
-*sub type* of the smaller scope.
+*subtype* of the smaller scope.

This does in fact make sense, though. The intuitive reason for this is that if
you expect an `&'a u8`, then it's totally fine for me to hand you an `&'static

@@ -72,7 +72,7 @@ to be able to pass `&&'static str` where an `&&'a str` is expected. The
additional level of indirection does not change the desire to be able to pass
longer lived things where shorter lived things are expected.

-However this logic *does not* apply to `&mut`. To see why `&mut` should
+However this logic doesn't apply to `&mut`. To see why `&mut` should
be invariant over T, consider the following code:

```rust,ignore

@@ -109,7 +109,7 @@ between `'a` and T is that `'a` is a property of the reference itself,
while T is something the reference is borrowing. If you change T's type, then
the source still remembers the original type. However if you change the
lifetime's type, no one but the reference knows this information, so it's fine.
-Put another way, `&'a mut T` owns `'a`, but only *borrows* T.
+Put another way: `&'a mut T` owns `'a`, but only *borrows* T.

`Box` and `Vec` are interesting cases because they're variant, but you can
definitely store values in them! This is where Rust gets really clever: it's

@@ -118,7 +118,7 @@ in them *via a mutable reference*! The mutable reference makes the whole type
invariant, and therefore prevents you from smuggling a short-lived type into
them.

-Being variant *does* allows `Box` and `Vec` to be weakened when shared
+Being variant allows `Box` and `Vec` to be weakened when shared
immutably. So you can pass a `&Box<&'static str>` where a `&Box<&'a str>` is
expected.

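A tiny, compilable illustration of that last claim; the names are invented for the example:

```rust
fn print_short<'a>(b: &Box<&'a str>) {
    // Shared access can only read through the Box, so accepting a
    // longer-lived &Box<&'static str> here is harmless.
    println!("{}", b.len());
}

fn main() {
    let long_lived: Box<&'static str> = Box::new("hello");
    // &Box<&'static str> is accepted where &Box<&'a str> is expected.
    print_short(&long_lived);
}
```
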
@@ -126,7 +126,7 @@ However what should happen when passing *by-value* is less obvious. It turns out
that, yes, you can use subtyping when passing by-value. That is, this works:

```rust
-fn get_box<'a>(str: &'a u8) -> Box<&'a str> {
+fn get_box<'a>(str: &'a str) -> Box<&'a str> {
    // string literals are `&'static str`s
    Box::new("hello")
}

@@ -150,7 +150,7 @@ signature:
fn foo(&'a str) -> usize;
```

-This signature claims that it can handle any `&str` that lives *at least* as
+This signature claims that it can handle any `&str` that lives at least as
long as `'a`. Now if this signature was variant over `&'a str`, that
would mean

@@ -159,10 +159,12 @@ fn foo(&'static str) -> usize;
```

could be provided in its place, as it would be a subtype. However this function
-has a *stronger* requirement: it says that it can *only* handle `&'static str`s,
-and nothing else. Therefore functions are not variant over their arguments.
+has a stronger requirement: it says that it can only handle `&'static str`s,
+and nothing else. Giving `&'a str`s to it would be unsound, as it's free to
+assume that what it's given lives forever. Therefore functions are not variant
+over their arguments.

-To see why `Fn(T) -> U` should be *variant* over U, consider the following
+To see why `Fn(T) -> U` should be variant over U, consider the following
function signature:

```rust,ignore

@@ -177,7 +179,7 @@ therefore completely reasonable to provide
fn foo(usize) -> &'static str;
```

-in its place. Therefore functions *are* variant over their return type.
+in its place. Therefore functions are variant over their return type.

`*const` has the exact same semantics as `&`, so variance follows. `*mut` on the
other hand can dereference to an `&mut` whether shared or not, so it is marked

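Return-type variance is easy to check with a compilable sketch; all names here are invented for illustration:

```rust
fn gives_static(_: usize) -> &'static str {
    "hello"
}

fn wants<'a>(f: fn(usize) -> &'a str) -> &'a str {
    f(42)
}

fn main() {
    // A function returning &'static str is accepted where one returning
    // &'a str is expected: functions are (co)variant over their return type.
    let s: &str = wants(gives_static);
    println!("{}", s);
}
```
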
@@ -31,12 +31,12 @@ panics can only be caught by the parent thread. This means catching a panic
requires spinning up an entire OS thread! This unfortunately stands in conflict
to Rust's philosophy of zero-cost abstractions.

-There is an *unstable* API called `catch_panic` that enables catching a panic
+There is an unstable API called `catch_panic` that enables catching a panic
without spawning a thread. Still, we would encourage you to only do this
sparingly. In particular, Rust's current unwinding implementation is heavily
optimized for the "doesn't unwind" case. If a program doesn't unwind, there
should be no runtime cost for the program being *ready* to unwind. As a
-consequence, *actually* unwinding will be more expensive than in e.g. Java.
+consequence, actually unwinding will be more expensive than in e.g. Java.
Don't build your programs to unwind under normal circumstances. Ideally, you
should only panic for programming errors or *extreme* problems.

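For context, the capability described here was later stabilized as `std::panic::catch_unwind`; the sketch below uses that modern API rather than the `catch_panic` the text refers to:

```rust
use std::panic;

fn main() {
    // Catch a panic from a closure without spawning a thread. The same
    // caveat applies: this is for the exceptional path, not control flow.
    let result = panic::catch_unwind(|| {
        panic!("oops");
    });
    assert!(result.is_err());
    println!("survived the panic");
}
```
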
@@ -60,7 +60,7 @@ of memory at once (e.g. half the theoretical address space). As such it's
like the standard library as much as possible, so we'll just kill the whole
program.

-We said we don't want to use intrinsics, so doing *exactly* what `std` does is
+We said we don't want to use intrinsics, so doing exactly what `std` does is
out. Instead, we'll call `std::process::exit` with some random number.

```rust

@@ -84,7 +84,7 @@ But Rust's only supported allocator API is so low level that we'll need to do a
fair bit of extra work. We also need to guard against some special
conditions that can occur with really large allocations or empty allocations.

-In particular, `ptr::offset` will cause us *a lot* of trouble, because it has
+In particular, `ptr::offset` will cause us a lot of trouble, because it has
the semantics of LLVM's GEP inbounds instruction. If you're fortunate enough to
not have dealt with this instruction, here's the basic story with GEP: alias
analysis, alias analysis, alias analysis. It's super important to an optimizing

@@ -102,7 +102,7 @@ As a simple example, consider the following fragment of code:
If the compiler can prove that `x` and `y` point to different locations in
memory, the two operations can in theory be executed in parallel (by e.g.
loading them into different registers and working on them independently).
-However in *general* the compiler can't do this because if x and y point to
+However the compiler can't do this in general because if x and y point to
the same location in memory, the operations need to be done to the same value,
and they can't just be merged afterwards.

@@ -118,7 +118,7 @@ possible.
So that's what GEP's about, how can it cause us trouble?

The first problem is that we index into arrays with unsigned integers, but
-GEP (and as a consequence `ptr::offset`) takes a *signed integer*. This means
+GEP (and as a consequence `ptr::offset`) takes a signed integer. This means
that half of the seemingly valid indices into an array will overflow GEP and
actually go in the wrong direction! As such we must limit all allocations to
`isize::MAX` elements. This actually means we only need to worry about

@@ -138,7 +138,7 @@ However since this is a tutorial, we're not going to be particularly optimal
here, and just unconditionally check, rather than use clever platform-specific
`cfg`s.

-The other corner-case we need to worry about is *empty* allocations. There will
+The other corner-case we need to worry about is empty allocations. There will
be two kinds of empty allocations we need to worry about: `cap = 0` for all T,
and `cap > 0` for zero-sized types.

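A rough sketch of the kind of unconditional check being described; this is not the book's actual code, and the function name and message are made up:

```rust
use std::mem;

// Keep the total allocation at or below isize::MAX bytes so that
// ptr::offset (GEP inbounds) can never be asked to step out of range.
fn assert_alloc_fits<T>(cap: usize) {
    let elem_size = mem::size_of::<T>();
    assert!(
        elem_size == 0 || cap <= (isize::MAX as usize) / elem_size,
        "capacity overflow"
    );
}

fn main() {
    assert_alloc_fits::<u64>(1024); // fine
    println!("ok");
}
```
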
@@ -165,9 +165,9 @@ protected from being allocated anyway (a whole 4k, on many platforms).

However what about for positive-sized types? That one's a bit trickier. In
principle, you can argue that offsetting by 0 gives LLVM no information: either
-there's an element before the address, or after it, but it can't know which.
+there's an element before the address or after it, but it can't know which.
However we've chosen to conservatively assume that it may do bad things. As
-such we *will* guard against this case explicitly.
+such we will guard against this case explicitly.

*Phew*

@@ -130,7 +130,7 @@ impl<'a, T> Drop for Drain<'a, T> {
impl<T> Vec<T> {
    pub fn drain(&mut self) -> Drain<T> {
        // this is a mem::forget safety thing. If Drain is forgotten, we just
-        // leak the whole Vec's contents. Also we need to do this *eventually*
+        // leak the whole Vec's contents. Also we need to do this eventually
        // anyway, so why not do it now?
        self.len = 0;

@@ -10,7 +10,7 @@ handling the case where the source and destination overlap (which will
definitely happen here).

If we insert at index `i`, we want to shift the `[i .. len]` to `[i+1 .. len+1]`
-using the *old* len.
+using the old len.

```rust,ignore
pub fn insert(&mut self, index: usize, elem: T) {

@@ -21,8 +21,8 @@ read out the value pointed to at that end and move the pointer over by one. When
the two pointers are equal, we know we're done.

Note that the order of read and offset are reversed for `next` and `next_back`
-For `next_back` the pointer is always *after* the element it wants to read next,
-while for `next` the pointer is always *at* the element it wants to read next.
+For `next_back` the pointer is always after the element it wants to read next,
+while for `next` the pointer is always at the element it wants to read next.
To see why this is, consider the case where every element but one has been
yielded.

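A tiny standalone sketch of that asymmetry, using raw pointers over a stack array rather than the book's actual iterator type:

```rust
fn main() {
    let data = [1u32, 2, 3];
    unsafe {
        // `start` points *at* the next front element; `end` points just
        // *after* the next back element.
        let mut start = data.as_ptr();
        let mut end = data.as_ptr().add(data.len());

        // next(): read first, then advance.
        let front = *start;
        start = start.add(1);

        // next_back(): retreat first, then read.
        end = end.sub(1);
        let back = *end;

        println!("front = {}, back = {}", front, back); // front = 1, back = 3
        let _ = (start, end);
    }
}
```
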
@@ -124,7 +124,7 @@ impl<T> DoubleEndedIterator for IntoIter<T> {
```

Because IntoIter takes ownership of its allocation, it needs to implement Drop
-to free it. However it *also* wants to implement Drop to drop any elements it
+to free it. However it also wants to implement Drop to drop any elements it
contains that weren't yielded.

@@ -32,14 +32,14 @@ pub fn push(&mut self, elem: T) {

Easy! How about `pop`? Although this time the index we want to access is
initialized, Rust won't just let us dereference the location of memory to move
-the value out, because that *would* leave the memory uninitialized! For this we
+the value out, because that would leave the memory uninitialized! For this we
need `ptr::read`, which just copies out the bits from the target address and
interprets it as a value of type T. This will leave the memory at this address
-*logically* uninitialized, even though there is in fact a perfectly good instance
+logically uninitialized, even though there is in fact a perfectly good instance
of T there.

For `pop`, if the old len is 1, we want to read out of the 0th index. So we
-should offset by the *new* len.
+should offset by the new len.

```rust,ignore
pub fn pop(&mut self) -> Option<T> {

@@ -2,7 +2,7 @@

It's time. We're going to fight the spectre that is zero-sized types. Safe Rust
*never* needs to care about this, but Vec is very intensive on raw pointers and
-raw allocations, which are exactly the *only* two things that care about
+raw allocations, which are exactly the two things that care about
zero-sized types. We need to be careful of two things:

* The raw allocator API has undefined behaviour if you pass in 0 for an

@@ -22,7 +22,7 @@ So if the allocator API doesn't support zero-sized allocations, what on earth
do we store as our allocation? Why, `heap::EMPTY` of course! Almost every operation
with a ZST is a no-op since ZSTs have exactly one value, and therefore no state needs
to be considered to store or load them. This actually extends to `ptr::read` and
-`ptr::write`: they won't actually look at the pointer at all. As such we *never* need
+`ptr::write`: they won't actually look at the pointer at all. As such we never need
to change the pointer.

Note however that our previous reliance on running out of memory before overflow is
