# Smart Pointers Pointers do that, point to address in memory. 0 Overhead `&` References are a kind of pointer `Smart Pointers` are data structe that act like a pointer + metadata and methods. They may `own` data they point to Ex: `String` `Vec` -> THey own memory and allow its manipulation. Basically a `struct` that implements `Deref` and `Drop` traits. - `Deref` : Allows to behave like a reference - `Drop` : Customize what happens when variable goes out of scope === # Using `Box` to Point to Data on the Heap Basically a pointer to heap allocation of the type. Why use them? - Unknown size at compile time, but need to use it in stack (requires exact time when using it)? - Move ownership of large amount of data (MAKE SURE ITS NOT COPIED) - Own any type that implements a trait, rather than a specific type ## Storing data in Heap with `Box` ```rust fn main() { let b = Box::new(5); println!("b = {}", b); } ``` Box allow access to the type directly, like it was in the stack it will be slower tho. When it goes out of scope both the `Box` and the `Type` are deallocated form stack and heap respectively. ## Recursive Types with Boxes Type that hold a value of the same type -> Recursive Type These can't be known at compile time (when does the recursion stop?) ### Cons List Data Structure that comes from Lisp's `cons` function. - `cons` -> Constructs a new pair from 2 arguments, the pair contains the pair forming a list. - Colloquially `to cons x into y` -> Construct a container instance by puttin `x` at the start of the new container followed by `y` An item in a `cons list` contains the item and the next one (`Nil` if it is non-existant, but not invalid) ```rust enum List { Cons(T, List), Nil, } fn main() { let list: List = Cons(1, Cons(2, Cons(3,Nil))); } ``` This will not compile: `List` is an infinite type, can't be constructed in stack How does Rust check the size of a variable? - Compute stack size of an type first, ex: ```rust // Correct Struct enum Message { Quit, // enum size Move { x: i32, y: i32 }, // enum size + sizeof(i32 * 2) Write(String), // enum size + sizeof(String) -> Only the struct with the pointer, not the actual string in heap ChangeColor(i32, i32, i32), // enum size + sizeof(i32 * 3) } // Incorrect Struct enum List { Cons(T, List), // sizeof(T) + sizeof(List(sizeof(T) + sizeof(List(... Nil, } // It will infinetely go through List to compute size! ``` #### Allowing Recursion with Box `Box` is a pointer to heap, we have fixed stack size, we don't care about heap size. ```rust enum List { Cons(i32, Box), // sizeof(i32) + sizeof(Box) Nil } ``` Box size is known and does not need to check further, data pointing is later redirected to heap. === # Smart Pointers as References - `Deref` How the `*` operator functions. ## Following Pointer to value ```rust let x = 5; let y = &x; assert_eq!(5, x); assert_eq!(5, *y); ``` Basic usage of dereferencing, Y points to X so we need to dereference to obtain X from Y. ## Using `Box` like a Reference ```rust let x = 5; let y = Box::new(x); assert_eq!(5, x); assert_eq!(5, *y); ``` Works the same, basically, it implements and allows usage of the dereference operator. ## Defining our own Smart Pointer ```rust struct MyBox(T); impl MyBox { fn new(x: T) -> MyBox { MyBox(x) } } ``` Basically a tuple that holds a Type, but has the `new` method. To make it usable with dereferencing we need to implement `Deref` ```rust use std::ops::Deref; impl Deref for MyBox { type Target = T; fn deref(&self) -> &Self::Target { &self.0 } } ``` `deref` returns a reference to the actual value, thus calling `*box.deref()` actually dereferences it. When we call `*box`, it is automatically substituting for the previous expression, because that is a thing that the `Deref` trait implements alongside. ## Implicit `Deref` Coercion` with Functions and Methods `Deref Coercion`converts type into a referencet to another type. Basically, an implementation of `.deref()` that returns another type like `&String` to `&str`. It is similar to an implicit cast but only for references. ```rust fn hello(name: &str) { println!("Hello, {}!", name); } fn main() { let m = MyBox::new(String::from("Rust")); hello(&m); } ``` `hello(&str)` should be called with a `&str`, but - `MyBox` derefs to `&String` - `&String` derefs to `&str` We can call directly with a `&MyBox` as it will automatically dereference to the type. Without the call to `hello()` would look like `hello(&(*m)[..]);` - `(*m)` -> Dereference MyBox to String - `& [..]` -> Take a container slice of the whole string, converted to string slice as it is the type of the `String` container ## `Deref Coercion` and Mutability Can implement the traits `DerefMut` to deref a mutable reference to another mutable reference: - `&T -> &U` if `T: Deref` - `&mut T -> &U` if `T: Deref` - `&mut T -> &mut U` if `T: DerefMut` Mutable references can only be created if the Type implements `DerefMut` and is coercing an already mutable reference! === # Running code on Cleanup with `Drop` Trait Customize what happens when a variable goes out of scope (CleanUp or Destructor methods but called automatically). ```rust struct CustomSmartPointer { data: String, } impl Drop for CustomSmartPointer { fn drop(&mut self) { println!("Dropping CustomSmartPointer with data `{}`!", self.data); } } fn main() { let c = CustomSmartPointer { data: String::from("my stuff"), }; let d = CustomSmartPointer { data: String::from("other stuff"), }; println!("CustomSmartPointers created."); } ``` When running this, the drop methods will be called when values go out of scope, printing their data. ``` CustomSmartPointers created. Dropping CustomSmartPointer with data `other stuff`! Dropping CustomSmartPointer with data `my stuff`! ``` ## Dropping a Value Early - `std::mem::drop` Ex: You have a lock that manages when access to variables is permitted. You might want to clear the lock as soon as we are done using that variable aso that it becomes accessible to others. You can't call `.drop()` method directly, as it is used as a `Destructor` that would also be called after the variable goes out of scope. To do that call `std::mem::drop(variable)` or use variable specific scopes, whatever feels cleaner to you. === # `Rc` Reference Counted Smart Pointer Allocate data in heap for multiple users to Read, without needing to know who will be the last reader - Only for SingleThread tho. ## Sharing Data with `Rc` Ex: List `a` contains 2 values, List `b` and `c` have other values first and then continue with values from `a` Both need access to `a` ```rust enum List { Cons(i32, Box), Nil, } use crate::List::{Cons, Nil}; fn main() { let a = Cons(5, Box::new(Cons(10, Box::new(Nil)))); let b = Cons(3, Box::new(a)); let c = Cons(4, Box::new(a)); } ``` Here, ownership of `a` is moved to `b`, thus `c` is asking for ownership from a non-owner. That is because `Cons` owns the data it holds. To change that into references, it would be a mess: ```rust // Add lifetime so that List and References have the same lifetime enum List<'a> { Cons(i32, Box<&'a List<'a>>), Nil, } use List::*; // Now Box can't own data, so it owns a reference of some data // Have to declare a variable Nil to own the actual data // Then declare the first node to own `10` and ref to `Nil` // Then a owns `5` and references `list_a`,... fn main() { let nil_v = Nil; let list_a = Cons(10, Box::new(&nil_v)); let a = Cons(5, Box::new(&list_a)); let b = Cons(3, Box::new(&a)); let c = Cons(4, Box::new(&a)); } ``` But with `Rc` ```rust enum List { Cons(i32, Rc), Nil, } use List::*; use std::rc::Rc; fn main() { let a = Rc::new(Cons(5, Rc::new(Cons(10, Rc::new(Nil))))); let b = Cons(3, Rc::clone(&a)); let c = Cons(4, Rc::clone(&a)); } ``` List now holds a `Reference Counted Smart Pointer` to a List. Then each other list will have a clone of the reference - `clone` method does not clone all data, it only increments the reference count by 1. When that clone is dropped, it drops the count by 1, if count gets to 0, the values is dropped. ```rust fn main() { let a = Rc::new(Cons(5, Rc::new(Cons(10, Rc::new(Nil))))); println!("count after creating a = {}", Rc::strong_count(&a)); let b = Cons(3, Rc::clone(&a)); println!("count after creating b = {}", Rc::strong_count(&a)); { let c = Cons(4, Rc::clone(&a)); println!("count after creating c = {}", Rc::strong_count(&a)); } println!("count after c goes out of scope = {}", Rc::strong_count(&a)); } // Output count after creating a = 1 count after creating b = 2 count after creating c = 3 count after c goes out of scope = 2 ``` To know exactly number of references use `Rc::strong_count(&self)` `Rc::weak_count(&self)` explained later... === # `RefCell` - Interior Mutability Pattern `Interior Mutability` is a design pattern, mutate data even if there are immutable references. Normally would require `unsafe` code (Chapter 19). This pattern makes it safe and in compliance with the borrowing rules at runtime. ## Enforcing Borrowing Rules at Runtime `RefCell` Represent SINGLE ownership of the data. It is basically a `Box` but the borrowing rules are checked in runtime instead of compile time? Circumvent some operations that are not allowed at compile time, but safe at runtime. It is actually quite impossible to check for all possible states and errors a complex program may go through, given the amount of inputs and states it goes through, so some seemingly safe might be unsafe and some conservatively unsafe will be safe. `RefCell` is SingleThreaded only (Chapter 16 for multithreading). ### Box Rc or RefCell *Box* - Single Owner - Allows immutable and mutable borrows at compile time *Rc* - Multiple Owners - Only immutable borrows at compile time *RefCell* - Single Owner - Allows immutable and mutable borrows, checked in RUNTIME - Allows mutation of value even if it is immutable ## Interior Mutability: Mutable Borrow to Immutable Value ```rust let x = 5; // Immutable value let y = &mut x; // Illegal: Mutable borrow of immutable value ``` `RefCell` does not circumvent this, it will `panic!` if the rules are broken at runtime. Why and How is it useful? ### Ex: Mock Objects `test double` = Use a type in place of another. `Mock objects` are types that record what happens in a test. The exmaple implements the library that tracks the amount of messages a user has sent, limited by their quota. I will not delve deep in to implementation because it seems complicated but will try on things that seems hard to understand at first glance): ```rust pub trait Messenger { fn send(&self, msg: &str); } // LimitTracker Holds reference of a user(messenger), with a lifetime at least equal to the current tracker using it pub struct LimitTracker<'a, T: Messenger> { messenger: &'a T, value: usize, max: usize, } impl<'a, T> LimitTracker<'a, T> where T: Messenger, // Implementation specific for Types that implement Messenger trait { // New function for ease of use pub fn new(messenger: &T, max: usize) -> LimitTracker { LimitTracker { messenger, value: 0, max, } } // Checks that the user has not passed their quota and sends a warning pub fn set_value(&mut self, value: usize) { self.value = value; let percentage_of_max = self.value as f64 / self.max as f64; if percentage_of_max >= 1.0 { self.messenger.send("Error: You are over your quota!"); } else if percentage_of_max >= 0.9 { self.messenger .send("Urgent warning: You've used up over 90% of your quota!"); } else if percentage_of_max >= 0.75 { self.messenger .send("Warning: You've used up over 75% of your quota!"); } } } ``` Then the tests which will fail to compile: ```rust [cfg(test)] mod tests { use super::*; // Define test struct that holds the messages, not send them struct MockMessenger { sent_messages: Vec, } impl MockMessenger { fn new() -> MockMessenger { MockMessenger { sent_messages: vec![], } } } // Implement the Messenger trait // This does not compile, because &self is immutable and we are doing mutable operations to `sent_messages: Vec` impl Messenger for MockMessenger { fn send(&self, message: &str) { self.sent_messages.push(String::from(message)); } } #[test] fn it_sends_an_over_75_percent_warning_message() { let mock_messenger = MockMessenger::new(); let mut limit_tracker = LimitTracker::new(&mock_messenger, 100); limit_tracker.set_value(80); assert_eq!(mock_messenger.sent_messages.len(), 1); } } ``` At compile time this is illegal, we are mutating the `Vec` which does not have a mutable borrow. #### Allowing mutability the wrong way If we wanted to allow all of this, we would have to change almost every function declaration to use mutable borrows, which besides making it more verbose it would then limit other usages as a signle mutable borrow is not allowed if another variable holds any type of borrow. ```rust pub trait Messenger { fn send(&mut self, msg: &str); } pub struct LimitTracker<'a, T: Messenger> { messenger: &'a mut T, value: usize, max: usize, } impl<'a, T> LimitTracker<'a, T> where T: Messenger, { pub fn new(messenger: &mut T, max: usize) -> LimitTracker { LimitTracker { messenger, value: 0, max, } } pub fn set_value(&mut self, value: usize) { self.value = value; let percentage_of_max = self.value as f64 / self.max as f64; if percentage_of_max >= 1.0 { self.messenger.send("Error: You are over your quota!"); } else if percentage_of_max >= 0.9 { self.messenger .send("Urgent warning: You've used up over 90% of your quota!"); } else if percentage_of_max >= 0.75 { self.messenger .send("Warning: You've used up over 75% of your quota!"); } } } #[cfg(test)] mod tests { use super::*; struct MockMessenger { sent_messages: Vec, } impl MockMessenger { fn new() -> MockMessenger { MockMessenger { sent_messages: vec![], } } } impl Messenger for MockMessenger { fn send(&mut self, message: &str) { self.sent_messages.push(String::from(message)); } } #[test] fn it_sends_an_over_75_percent_warning_message() { let mut mock_messenger = MockMessenger::new(); let mut limit_tracker = LimitTracker::new(&mut mock_messenger, 100); limit_tracker.set_value(80); assert_eq!(mock_messenger.sent_messages.len(), 1); } } ``` #### Using `RefCell` ```rust ``` This does work, `RefCell` checks at runtime if the borrows are legal. ### How does `RefCell` Track Borrows? We use the `borrow` and `borrow_mut` methods, which return `Ref` and `RefMut` respectively which implement the `Deref` trait, aka being treated as regular references. The functions increase count of immutable borrows, which decreases when the `Ref/RefMut` go out of scope with `Drop` trait. If a mutable borrow is used, borrows in the same scope will be illegal and `panic!`. ### Multiple Owners of mutable data `Rc + RefCell` `Rc>` Can have multiple owners and each can get mutable borrow access of it. It will still be bound by the Runtime check thus being safe. ```rust // Update List example #[derive(Debug)] enum List { Cons(Rc>, Rc), Nil, } use List::*; use std::cell::RefCell; use std::rc::Rc; fn main() { let value = Rc::new(RefCell::new(5)); let a = Rc::new(Cons(Rc::clone(&value), Rc::new(Nil))); let b = Cons(Rc::new(RefCell::new(3)), Rc::clone(&a)); let c = Cons(Rc::new(RefCell::new(4)), Rc::clone(&a)); *value.borrow_mut() += 10; println!("a after = {:?}", a); println!("b after = {:?}", b); println!("c after = {:?}", c); } ``` In the previous `List` example: a,b,c get immutable borrows of `Rc`, no one has actually borrowed the `RefCell`. Then we can still access the `RefCell` mutably, and be correct. === # Memory Leaks in Reference Cycles Making inifinite references like used with `Rc` or `RefCell`, can create memory leaks if they reference each other. ## Creating Reference cycles Updating the List example ```rust #[derive(Debug)] enum List { Cons(i32, RefCell>), Nil, } impl List { fn tail(&self) -> Option<&RefCell>> { match self { Cons(_, item) => Some(item), Nil => None, } } } ``` We want to be able to change which List is pointing `Rc` to, so `RefCell>` We can make, list `a` and `b` point to `a`, then make `a` point to `b` using the `tail()` method -> Reference Cycle. The following main will run, but the commented print would overflow as it will cyclycally go through `a` and `b` to print all the items on the list! ```rust fn main() { let a = Rc::new(Cons(5, RefCell::new(Rc::new(Nil)))); println!("a initial rc count = {}", Rc::strong_count(&a)); println!("a next item = {:?}", a.tail()); let b = Rc::new(Cons(10, RefCell::new(Rc::clone(&a)))); println!("a rc count after b creation = {}", Rc::strong_count(&a)); println!("b initial rc count = {}", Rc::strong_count(&b)); println!("b next item = {:?}", b.tail()); if let Some(link) = a.tail() { *link.borrow_mut() = Rc::clone(&b); } println!("b rc count after changing a = {}", Rc::strong_count(&b)); println!("a rc count after changing a = {}", Rc::strong_count(&a)); // Uncomment the next line to see that we have a cycle; // it will overflow the stack //println!("a next item = {:?}", a.tail()); } ``` At the end of the program, `b` drops 1 reference, but a references it to, so it goes from 2 references to 1 Same goes to `a` which will be dropped by 1, but was 2 because `b` referenced it but did not drop it. Solution: Make some references non-owning an dother have it. Owners will drop entirely the value instead of counts. That is invalid in our example, we need ownership of the references for each one. ## Preventing Reference Cycles -> `Weak` Using `Rc::clone()` increases `strong_count` of references. The value will only be cleaned up if `strong_count == 0`. `Rc::downgrade() -> Weak` : Increases `weak_count` by 1, which does not need to be 0 to be cleaned up. Strogn reference share ownership, weak referencees don't. To access value from `Weak` -> `Weak::upgrade()` -> Checks if the value is still valid and retursn `Option>`. ### Example: Tree Data Structure ```rust use std::rc::{Rc, Weak}; #[derive(Debug)] struct Node { value: i32, parent: RefCell>, children: RefCell>>, } fn nodes() { let leaf = Rc::new(Node { value: 3, parent: RefCell::new(Weak::new()), children: RefCell::new(vec![]), }); let branch = Rc::new(Node { value: 5, parent: RefCell::new(Weak::new()), children: RefCell::new(vec![Rc::clone(&leaf)]), }); *leaf.parent.borrow_mut() = Rc::downgrade(&branch); println!("Leaf parentL {:#?}", leaf.parent.borrow().upgrade()); } ``` `Node` owns the reference to each of the childrens(`Rc`), but can change who owns them (RefCell, which allows mutability of the vector). We also want the node, to know who is its parent, but not own them, thus `Weak` because it does not own, and the parent might change so `RefCell`. In the main, we create each node without a parent, then we `borrow_mut` the leaf's parent and change it to the downgraded reference of `branch` variable. ### Visualizing Counts ```rust fn nodes() { let leaf = Rc::new(Node { value: 3, parent: RefCell::new(Weak::new()), children: RefCell::new(vec![]), }); println!( "leaf strong = {}, weak = {}", Rc::strong_count(&leaf), Rc::weak_count(&leaf), ); // Leaf strong owns the node as itself { let branch = Rc::new(Node { value: 5, parent: RefCell::new(Weak::new()), children: RefCell::new(vec![Rc::clone(&leaf)]), // Branch strong owns the leaf node as child }); // Branch strong owns branch node as itself *leaf.parent.borrow_mut() = Rc::downgrade(&branch); // Leaf weak owns the branch node as parent println!( "branch strong = {}, weak = {}", Rc::strong_count(&branch), Rc::weak_count(&branch), ); // Branch owned 1 time strongly by itself, 1 time weakly by leaf println!( "leaf strong = {}, weak = {}", Rc::strong_count(&leaf), Rc::weak_count(&leaf), ); // Leaf owned 2 times strongly by itself and branch, 0 times weakly } // Branch is dropped as only itself had strong ownership (Reference ot 0), 1 strong own of leaf is dropped, leaf still owns weakly branch as parent println!("leaf parent = {:?}", leaf.parent.borrow().upgrade()); // Call upgrade to check borrow parent, as it returns None, the weak ownership is now dropped to branch println!( "leaf strong = {}, weak = {}", Rc::strong_count(&leaf), Rc::weak_count(&leaf), ); // Leaf has 1 strong ownership of itself, 0 weak references by anyone. } ```