C/CPS 506
Comparative Programming Languages Prof. Alex Ufkes
Topic 10: Ownership and lifetime in Rust
Notice!
Obligatory copyright notice in the age of digital delivery and online classrooms:
The copyright to this original work is held by Alex Ufkes. Students registered in course CCPS 506 can use this material for the purposes of this course but no other use is permitted, and there can be no sale or transfer or use of the work for any other purpose without explicit permission of Alex Ufkes.
© Alex Ufkes, 2020, 2021
Course Administration
Getting closer! Two more lectures. Don’t forget about the assignments!
Control Flow
if / else
• As with other imperative languages, the else is optional.
• Recall that this is not the case with Haskell!
• We were required to have a complete if-then-else
Boolean Conditions?
Mandatory.
In C/C++ (and Elixir, with caveats):
• Non-zero values are “truthy”.
• Only 0/nil is considered false (in Elixir, only nil and false are falsy; 0 is truthy).
if (3.141592)
    cout << "Valid!" << endl;
In Java (and Haskell, Rust):
• Conditions must be Boolean
if (3.141592)
    System.out.println("Compile Error");
Converting non-Boolean to Boolean requires implicit conversion, which, as we’ve seen, Rust does not do.
Ah! But can we cast?
Nope.
if / else if / else
• Works as we’d expect.
• We use { } even though there’s only one statement per branch.
• This is required.
• Why? Rust treats each branch as a block whose last line can be an expression.
• let state = ...; is a statement.
• The if / else if / else on the right-hand side is an expression that evaluates to a string. In Rust, if is an expression!
• It yields “Frozen”, “Liquid”, or “Boiling”.
• Each option is in a scope block { }.
• The value of a scope block is its last expression.
• Leaving the ; off makes these strings expressions.
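A minimal sketch of the idea above, assuming a hypothetical helper water_state and assumed temperature thresholds (0 and 100 degrees Celsius); only the variable name state and the three strings come from the slide:

```rust
// Hypothetical helper: classify water by temperature in Celsius.
fn water_state(temp_c: i32) -> &'static str {
    // The whole if / else if / else chain is a single expression.
    let state = if temp_c <= 0 {
        "Frozen" // no semicolon: this is the value of the block
    } else if temp_c < 100 {
        "Liquid"
    } else {
        "Boiling"
    };
    state
}

fn main() {
    println!("{}", water_state(25));
}
```

Adding a ; after any of the three strings would turn it into a statement, and that branch would no longer evaluate to a string.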
Problem?
One branch might evaluate to a float, the other to an int.
Remember: strong, static typing. No implicit conversion! Every branch of an if expression must have the same type.
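One possible way to satisfy the type checker, sketched with hypothetical function names (half_or_zero, widen): make both branches the same type, and convert explicitly with as when a conversion is really wanted.

```rust
// Both branches must have the same type. Writing
// `if flag { 0.5 } else { 0 }` is rejected: float vs integer.
fn half_or_zero(flag: bool) -> f64 {
    let value = if flag { 0.5 } else { 0.0 };
    value
}

// Conversions are never implicit; `as` makes them explicit.
fn widen(n: i32) -> f64 {
    n as f64
}

fn main() {
    println!("{} {}", half_or_zero(true), widen(3));
}
```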
Looping
loop: an unconditional loop, just like while(true){} in Java. Exit with break.
Conditional Looping: while
• Similar in form to other imperative languages.
• Rust understands += (but has no ++ or -- operators)
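A short sketch of both loop forms, using hypothetical helper names (count_with_loop, sum_below) chosen for illustration:

```rust
// Unconditional loop: runs until we break out of it.
fn count_with_loop(limit: u32) -> u32 {
    let mut count = 0;
    loop {
        if count >= limit {
            break;
        }
        count += 1;
    }
    count
}

// Conditional loop: the condition must be a bool.
fn sum_below(limit: u32) -> u32 {
    let mut total = 0;
    let mut i = 0;
    while i < limit {
        total += i;
        i += 1;
    }
    total
}

fn main() {
    println!("{} {}", count_with_loop(5), sum_below(5));
}
```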
Conditional Looping: for
Similar to an enhanced for loop in Java:
• Invoke the iter() method of array nums.
• elem takes the value of each element in the array.
• Safe! Never go out of bounds.
Conditional Looping: for
Use .. to create a range:
• 0..10 creates a Range containing 0 to 9
• Top of range not included!
• Just like range() in Python
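The two for forms can be sketched as follows. The names nums and elem follow the slides; the helper functions sum_elements and range_values are hypothetical.

```rust
// Iterate over the elements directly: no index, no bounds to get wrong.
fn sum_elements(nums: &[i32]) -> i32 {
    let mut total = 0;
    for elem in nums.iter() {
        total += *elem;
    }
    total
}

// Iterate over a range: 0..10 yields 0 through 9, the top is excluded.
fn range_values() -> Vec<i32> {
    let mut seen = Vec::new();
    for i in 0..10 {
        seen.push(i);
    }
    seen
}

fn main() {
    let nums = [1, 2, 3];
    println!("{} {:?}", sum_elements(&nums), range_values());
}
```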
Conditional Looping: for
Not as safe!
• Here we index the array ourselves, so we must be careful.
• Higher chance of accidentally overrunning array bounds (Rust panics at run time rather than reading out of bounds).
A loop is a loop is a loop
Wait, what?
• We didn’t specify the type of i, but shouldn’t it default to i32?
• Rust infers types, and i32 is the default for integer literals.
• HOWEVER!
• Rust doesn’t allow signed integers to be used as array indexes: an index must be a usize.
• So it inferred the type of i as usize, which is unsigned. Thus checking for less than zero is pointless.
Need to adjust our logic a bit...
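The adjusted code from the slide is not reproduced here. One way to walk an array backwards without ever testing a usize against zero is to reverse a range; the function name reversed is hypothetical.

```rust
// Walk the indexes from len-1 down to 0 using a reversed range.
// The index i is a usize, so it can never go below zero.
fn reversed(nums: &[i32]) -> Vec<i32> {
    let mut out = Vec::new();
    for i in (0..nums.len()).rev() {
        out.push(nums[i]);
    }
    out
}

fn main() {
    let nums = [1, 2, 3];
    println!("{:?}", reversed(&nums));
}
```

An empty array simply produces an empty range, so the loop body never runs.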
Moving on....
Ownership
Arguably Rust’s most distinctive feature:
• In C, the programmer is responsible for allocating and freeing heap memory. Memory leaks common!
• In Java, garbage collector periodically looks for unused memory and frees it.
• Rust takes a third approach: A system of ownership with rules checked at compile time.
o Thus, the program is not slowed at run-time
Reminder: Stack VS Heap
Stack:
• Last in, first out
• Push/pop stack frames is fast
• Data has known, fixed size.
Heap:
• Less organized
• Slower access, follow pointers
• Data size can be unknown
• If we dynamically allocate memory in C++, the pointer goes on the stack, the memory itself is in the heap.
• Heap memory is allocated by the OS at the request of the program.
• Stack memory (some fixed amount) belongs to the program, no need to invoke the OS.
Ownership
Three rules:
1. Each value in Rust has a variable that’s called its owner.
2. There can only be one owner at a time.
3. When the owner goes out of scope, the value is dropped.
Scope in Rust
This is normal, nothing new.
• Primitives stored on the stack behave as per usual.
• How does Rust clean up data stored on the heap?
• Consider Strings – A complex type stored on the heap.
Strings
• String literals are different from regular strings.
• Their size is fixed, encoded directly into the executable.
• Strings not defined as a literal might have unknown size
• They are stored on the heap.
Heap Strings
• Memory for string requested at run time.
• Memory must be returned to the OS when we’re done with the string.
• Calling String::from makes a memory request.
• Once again, this is normal behavior. In Java we would say String s = new String("Hello"); to accomplish the same.
What happens when we no longer need that string?
• Without garbage collection, we must identify when memory is no longer being used and free it explicitly.
• This has historically been a difficult programming problem.
• Free too early, and variables become invalid. Free too late, and we waste memory. Free twice by accident? Also a problem.
• We need to pair one allocate() to one free().
In Rust, memory is automatically returned when the variable that owns it leaves scope.
What about having multiple references to a single object? Freeing after one leaves scope invalidates the others. In Java:
Three references, one object!
But Remember!
Ownership - Three Rules:
1. Each value in Rust has a variable that’s called its owner.
2. There can only be one owner at a time. There can only be one!
3. When the owner goes out of scope, the value is dropped.
• When a variable goes out of scope, Rust automatically calls a special function called drop()
• This function is called at the closing }
• What happens if we have multiple variables interacting with the same data?
• With primitives, we get two separate variables stored in memory (stack)
• x and y are separate – changing one does not affect the other
• This is typical, and efficient
On the stack
On the heap
• Stack data is copied; heap data is not.
• Copying heap data is more expensive.
• This is typical in most imperative languages.
• With two pointers to the same heap data, we could still potentially free it twice.
• We could also invalidate the other references.
1. Each value in Rust has a variable that’s called its owner.
2. There can only be one owner at a time.
3. When the owner goes out of scope, the value is dropped.
• When we say let s2 = s1, s1 becomes invalid.
• Thus, when s1 leaves scope, nothing is freed; s2 now owns the heap memory.
• We can no longer use s1!
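A small sketch contrasting a copy with a move. The variable names x, y, s1 and s2 follow the slides; the wrapper function move_demo and the string contents are hypothetical.

```rust
fn move_demo() -> String {
    // Primitives are copied: x and y are both usable afterwards.
    let x = 5;
    let y = x;
    println!("{} {}", x, y);

    // Heap-backed values are moved: s2 becomes the only owner.
    let s1 = String::from("Hello");
    let s2 = s1;
    // println!("{}", s1); // rejected at compile time: s1 was moved
    s2
}

fn main() {
    println!("{}", move_demo());
}
```

Uncommenting the marked line makes the compiler reject the program, which is exactly the protection against double frees described above.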
In Rust, we say s1 gets moved to s2.
This is different from a shallow copy, since the old reference is invalidated.
Only one owner can free the heap memory.
clone()
Like most languages, Rust can clone:
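A minimal sketch of a deep copy with clone(), using a hypothetical wrapper function clone_demo:

```rust
fn clone_demo() -> (String, String) {
    let s1 = String::from("Hello");
    // clone() copies the heap data, so s1 and s2 own separate strings.
    let mut s2 = s1.clone();
    s2.push_str(", world");
    // s1 is still valid and unchanged.
    (s1, s2)
}

fn main() {
    let (s1, s2) = clone_demo();
    println!("{} / {}", s1, s2);
}
```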
Ownership and Functions
Passing an argument moves or copies, just like assignment:
• Ownership moved from s to word!
• s is now invalid!
• This is very different from most other languages we’re used to.
• This doesn’t happen with primitives because they are simply copied.
• We get a hint from the compiler error message.
Returning Ownership
• Ownership moved from s to word and back to s
• s is invalid when we move to word
• word is invalid when moved back to s
• Reassigning s is allowed because s is mutable.
• When string_pass reaches }, word has already been moved to s
• Thus word is invalid and the string on the heap isn’t freed.
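A sketch of what such a function might look like. The names s, word and string_pass come from the bullets above; the function body and the string contents are assumptions.

```rust
// Takes ownership of the argument, then hands it back via the return value.
fn string_pass(word: String) -> String {
    println!("inside: {}", word);
    word // moved out to the caller, so nothing is dropped at the closing }
}

fn main() {
    let mut s = String::from("Hello");
    // s is moved into the function, then the returned String is moved back into s.
    s = string_pass(s);
    println!("after: {}", s);
}
```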
Limiting. We are forced to use the return value to hand ownership back.
• s1 moves to word, word moves to s2
• Return a tuple consisting of the length of word, and word itself.
• len() returns the length of the String in bytes.
Ownership: Moving VS Borrowing
Instead of returning a tuple, pass a reference:
• This looks like C++
• word is now a reference to s1
• What about ownership?
• What’s happening in memory?
• word is a reference to s1, it does NOT point to the string in the heap.
• word has no ownership over s1.
• We call this borrowing.
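Borrowing can be sketched like this. The names word and s1 follow the slides; the function name string_length is hypothetical.

```rust
// word borrows the String; ownership stays with the caller.
fn string_length(word: &String) -> usize {
    word.len()
} // word goes out of scope here, but nothing is dropped: it never owned the String

fn main() {
    let s1 = String::from("Hello");
    let n = string_length(&s1);
    // s1 is still valid because it was only borrowed, not moved.
    println!("{} has length {}", s1, n);
}
```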
Unlike C++, we can’t modify something we’re borrowing by default:
To allow modification, borrow mutably with &mut. Then word is a mutable reference, borrowed from s1.
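A minimal sketch of a mutable borrow, with a hypothetical function name add_suffix:

```rust
// word is a mutable reference: we may modify the String it borrows.
fn add_suffix(word: &mut String) {
    word.push_str(", world");
}

fn main() {
    // The owner itself must be declared mut to be borrowed mutably.
    let mut s1 = String::from("Hello");
    add_suffix(&mut s1);
    println!("{}", s1);
}
```

With a plain &String parameter, the call to push_str would be rejected at compile time.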
Borrowing Rules
Can only have one mutable borrow at a time:
• push_str must make a mutable borrow of s1
• Not allowed!
When the first mutable borrow goes out of scope (in current Rust, once it is no longer used), we can borrow again
Here, r3 is already a reference. We’re not borrowing again.
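The slide code is not reproduced here. A sketch of two mutable borrows that never overlap, using the names s1, r1 and r2 from the slides and a hypothetical wrapper borrow_twice:

```rust
fn borrow_twice() -> String {
    let mut s1 = String::from("a");
    {
        let r1 = &mut s1; // first mutable borrow
        r1.push_str("b");
    } // r1 goes out of scope here

    let r2 = &mut s1; // allowed: the first borrow has ended
    r2.push_str("c");
    s1
}

fn main() {
    println!("{}", borrow_twice());
}
```

Declaring r2 inside the inner block, while r1 is still used afterwards, would be rejected.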
Borrowing Rules
Using an immutably borrowed value prevents mutable borrow:
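fn main() {
    let mut word = String::from("Weird");
    let r1 = &word;               // immutable borrow of word
    word.push_str(", or what?");  // compile error: needs a mutable borrow while r1 is live
    println!("{}", r1);           // r1 is used here, so its borrow is still active above
}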
Borrowing Rules: In Short
In any given scope, only ONE of the following can be true:
1. We can have a single mutable borrow
2. We can have any number of immutable borrows
These restrictions keep mutation under control
Slices
Reference to a contiguous subset of an array
• We’ve seen this range notation before!
• Remember that the second index is not included
Slices, Arguments, Functions
• Reminder: indexes must be usize
• Pass in reference to array
• Return slice (reference to subarray)
• Array only exists once in memory
• subset and nums point to different parts of the same memory.
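A sketch of such a function. The names nums and subset follow the bullets; the function name sub_array and its parameters are hypothetical.

```rust
// Takes a reference to an array and returns a slice of it.
// The bounds must be usize; the element at index `end` is excluded.
fn sub_array(nums: &[i32], start: usize, end: usize) -> &[i32] {
    &nums[start..end]
}

fn main() {
    let nums = [10, 20, 30, 40, 50];
    // subset refers to elements 1 and 2 of nums; nothing is copied.
    let subset = sub_array(&nums, 1, 3);
    println!("{:?}", subset);
}
```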
String Slices
... are a little bit different.
Normal so far
String Slice Type
• &str is a reference to a string slice
• &String is a reference to a String
• String VS string slice: different types
• Other than that, the function works the same as with numeric arrays.
• A string slice is effectively a read-only view of a String.
Better to accept a &str parameter instead of a &String:
Works for both Strings and string slices
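A sketch of a function that takes &str, so callers can pass either a String (by reference) or a string slice. The function first_word is a hypothetical example, not the one from the slide.

```rust
// Returns the part of text before the first space, or all of it if there is none.
fn first_word(text: &str) -> &str {
    match text.find(' ') {
        Some(i) => &text[..i],
        None => text,
    }
}

fn main() {
    let owned = String::from("one two");
    println!("{}", first_word(&owned));        // &String coerces to &str
    println!("{}", first_word("hello world")); // string literal is already &str
}
```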
String Literals
Recall:
• String literals are different from regular strings.
• Their size is fixed, encoded directly into the executable.
• They are immutable.
In fact, string literals are slices:
• The type of msg is &str
• It’s a slice pointing to a specific point of the binary file.
• This is why string literals are immutable!
Lifetime
Rust Features
Memory Safety:
• Rust is designed to be memory safe.
• Null or dangling references are not permitted in safe Rust.
Dangling References
Rust prevents them:
dangle()
• Create String s
• Return a reference to it
• s goes out of scope when the dangle function ends.
• What happens to the reference that was returned?
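The answer is that the program never gets that far: the compiler rejects it. A sketch of the rejected shape (shown as a comment) and one common fix, returning the String itself so ownership moves to the caller; the name no_dangle and the string contents are assumptions.

```rust
// Rejected at compile time: s is dropped at the closing }, so the
// returned reference would point at freed memory.
//
// fn dangle() -> &String {
//     let s = String::from("hello");
//     &s
// }

// One fix: return the String itself and move ownership out.
fn no_dangle() -> String {
    let s = String::from("hello");
    s
}

fn main() {
    println!("{}", no_dangle());
}
```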
Lifetime?
Lifetimes are a very distinctive feature of Rust:
• Every reference in Rust has a lifetime.
• The lifetime of a reference is the scope for which that reference is valid.
• Lifetimes are typically implicit and inferred, but can be defined explicitly.
• Just like variable types!
Example
• r is a reference to x
• x goes out of scope while r is still referring to it!
The Borrow Checker
• The Rust compiler has a “Borrow Checker” that compares scope to determine if borrows are valid
• If one variable borrows another, the variable being borrowed must have a lifetime at least as long as the variable doing the borrowing.
What happens if the borrow checker gets confused?
Generic Lifetimes
Consider:
Simple program:
• Function accepts two string slices, returns the slice that is longer.
• Recall that slices are just references
• There’s no ownership changing here
• No moves
Generic Lifetimes
The Borrow Checker can’t determine lifetime of the return value, because it’s not clear which input argument the return value will borrow from.
More generally: The borrow checker follows certain patterns when determining lifetime. If none of its patterns apply, we get a lifetime error.
Generic Lifetimes
• We as programmers know that this function is perfectly safe.
• x, y refer to string literals which live the entire duration of the program.
• HOWEVER
• What’s obvious to us is not necessarily obvious to the compiler.
• Thus, we get compile errors.
Generic Lifetimes
It even happens when the return reference is fixed:
Lifetime Annotation Syntax
When the borrow checker is confused (for whatever reason), we must be specific:
Specify generic lifetime
• Similar to generic type:
• <'a> specifies a generic lifetime, a
• &'a says this reference has lifetime a
Generic Lifetimes
What does this mean precisely?
• The function accepts two arguments
• Both live at least as long as lifetime a
• Also, the string slice returned will live at least as long as lifetime a
• We don’t know what a is, just that both arguments and the return value share it.
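A sketch of the annotated function described above. The parameter names x and y follow the slides; the function name longest is an assumption.

```rust
// Both inputs and the output share the generic lifetime 'a.
// Without the annotations, this signature is rejected.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

fn main() {
    println!("{}", longest("apple", "fig"));
}
```

On a tie this sketch returns y, which is an arbitrary choice.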
Generic Lifetimes
However!
• We’re NOT actually changing any lifetimes!
• We’re just explicitly indicating them to help the confused Borrow Checker.
• The borrow checker will reject any values that don’t adhere to these constraints.
So how can we break this?
Consider
• Lifetime of s1 is different from s2 and s3.
• Lifetime 'a is the scope in which x and y are both valid, i.e., when s1 and s2 are both valid.
• When we last use s3, s1 and s2 are valid.
• Thus, the borrow checker accepts this code.
• s3 references something that is valid until after the last time s3 is used.
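A sketch of the accepted case, assuming the function is named longest and that s1 and s2 are Strings with assumed contents; the wrapper valid_use exists only to make the example checkable.

```rust
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}

fn valid_use() -> usize {
    let s1 = String::from("long string");
    let n;
    {
        let s2 = String::from("short");
        let s3 = longest(&s1, &s2);
        n = s3.len(); // last use of s3: s1 and s2 are both still valid
    } // s2 is dropped here, after s3 is no longer used
    n
}

fn main() {
    println!("{}", valid_use());
}
```

Moving the last use of s3 below the inner closing } would be rejected, because s2 would no longer be valid at that point.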
Now This:
• Here, lifetime a does not cover the last use of s3.
• s3 references something that might be out of scope (s2 will be, s1 won’t be)
• When we last use s3, s2 is no longer valid.
• Although in this case it doesn’t matter, because we’ve declared both s1 and s2 as string literals.
• String literals aren’t on the heap; they are embedded in the executable, and thus references to them will always be valid.
Oops. Let’s try again with Strings instead…
Lifetime Considerations
In general, we need some sort of lifetime indication any time we’re passing in more than one reference and returning a reference.
This is fine, albeit pointless
As is this
In early versions of Rust, every reference in a function signature required a lifetime specifier.
The Rust developers noticed some cases of reference passing were always the same, and thus added them as patterns for the compiler to recognize without requiring explicit lifetime annotations.
The compiler first checks its list of known patterns.
If none apply, we get a compile error such as we’ve been seeing.
What are these patterns?
Lifetime Inference Rules
1. The compiler first assigns a different lifetime to each reference input parameter.
2. If there is exactly one input reference parameter, its lifetime is assigned to all output references.
3. If there are multiple input references, but one of them is &self or &mut self, then the output references get the same lifetime as self.
If, after applying these rules, there are still references without a lifetime specifier, we get a compile error.
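The rules can be illustrated with a few hypothetical signatures (trimmed, trimmed_explicit, and the Greeting type are all invented for this sketch). The first two functions are equivalent: one as we write it, one as the compiler effectively sees it.

```rust
// As written: rule 1 gives s its own lifetime, rule 2 assigns it to the output.
fn trimmed(s: &str) -> &str {
    s.trim()
}

// The same signature with the inferred lifetimes spelled out.
fn trimmed_explicit<'a>(s: &'a str) -> &'a str {
    s.trim()
}

struct Greeting {
    text: String,
}

impl Greeting {
    // Two input references, but one is &self: rule 3 ties the output to self.
    fn text_ignoring(&self, _other: &str) -> &str {
        &self.text
    }
}

fn main() {
    let g = Greeting { text: String::from("hi") };
    println!("{} {} {}", trimmed("  a "), trimmed_explicit(" b "), g.text_ignoring("x"));
}
```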
We don’t get errors here, because applying rules 1 and 2 results in all references having annotated lifetimes
We get an error here, because even after applying all three rules, we still don’t have a lifetime annotation for the output:
• Rule 1 applies; Rules 2 and 3 do not.
• No lifetime annotation for the output after applying the rules.
• Compile error.
Static Lifetime
• A special lifetime that is simply the duration of the program.
• String literals have a static lifetime.
• Makes sense, they’re not on the heap but embedded in the executable
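A minimal sketch of the static lifetime written out explicitly; the function name greeting is hypothetical.

```rust
// A string literal lives in the executable, so a reference to it
// is valid for the whole run of the program.
fn greeting() -> &'static str {
    "Hello"
}

fn main() {
    let msg: &'static str = greeting();
    println!("{}", msg);
}
```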
• You might get error messages suggesting you use the static lifetime.
• Be careful doing so. Does your reference really need to live for the duration of the program? Probably not.
• It’s a lazy solution, much like adding dozens of global variables to avoid using pointers or references.
Fantastic Rust Reference:
https://doc.rust-lang.org/book/second-edition/