Shredder: Garbage Collection as a Library for Rust
I'm excited to announce shredder
, a library for Rust that provides a garbage collected smart pointer, Gc
. This smart pointer is useful in the same sorts of situations as Rc
, but can handle reference cycles. It's available on crates.io and Github. Here's how you could use it:
use std::cell::RefCell;
use shredder::{
Gc, Scan,
run_with_gc_cleanup,
number_of_active_handles,
number_of_tracked_allocations,
};
#[derive(Scan)]
struct Node {
data: String,
directed_edges: Vec<Gc<RefCell<Node>>>,
}
fn main() {
// Using `run_with_gc_cleanup` is good practice
// (it helps ensure destructors are run)
run_with_gc_cleanup(|| {
let a = Gc::new(RefCell::new(Node {
data: "A".to_string(),
directed_edges: Vec::new(),
}));
let b = Gc::new(RefCell::new(Node {
data: "B".to_string(),
directed_edges: Vec::new(),
}));
// Usually would need `get` for `Gc` data,
// but `RefCell` is a special case
a.borrow_mut().directed_edges.push(b.clone());
b.borrow_mut().directed_edges.push(a);
// We now have cyclical data!
});
// Everything was cleaned up!
assert_eq!(number_of_tracked_allocations(), 0);
assert_eq!(number_of_active_handles(), 0);
}
In this example we had #[derive(Scan)]
on our struct. By implementing this trait, we let shredder
know how it can find what Gc
handles this data contains. This is an essential part of the tracing garbage collection that shredder
implements. After we've done that, using the smart pointer is basically just plug and play!
The only ergonomic issue to note is the need for guard objects. In shredder
, you always need a guard object when accessing data inside a Gc
. Usually you need to use .get
to get a GcGuard
. (Guarded access is key to implementing shredder
's guarantees in a performant way.) Now, notice how we didn't have to do this in the above example. In general, Gc
is only useful with interior mutability, so Gc<RefCell<?>>
, Gc<Mutex<?>>
and Gc<RwLock<?>>
have special methods that let you skip this step (see documentation for more details).
People have experimented with garbage collection in Rust before, but I wasn't happy with existing libraries. I do not believe there is an existing Rust library providing an easy to use garbage collection solution that can robustly work in concurrent programs, as well as handle non-'static
data. shredder
does that and more. Let me break down what I consider the best features of shredder
. (If you're interested in a deep dive into how it all works, I hope to provide a follow post on the design.)
1 Safety First) It should be impossible to cause unsafe behavior in safe code using shredder
. The only unsafe parts of the API are writing manual implementations of the core traits: Scan
, GcSafe
and Finalize
. I've done my best to ensure they have complete documentation explaining how to implement them safely.
2 Ergonomics Second) I believe a lot of the existing solutions are fairly hard to grok, requiring complex macros or weird setups with lifetimes. In shredder
, Gc
is basically just a fancy Arc
, and I hope that makes the library easier to use. (And in the future I even want to make the guard object optional, assuming I can make it safe and performant!)
3 Ready for Fearless Concurrency) shredder
was always designed to deal with multithreaded programs. You can send a Gc
around just as if it was an Arc
, and I think that's really cool! Keeping with this, I've designed the collector so that even if a thread holds onto a ton of GcGuard
s, collection can still proceed as normal. The only operation that should implicitly block on collection in shredder
is calling .get
(or one of the helper methods wrapping .get
).
4 Seamless Destruction) By default, Gc::new
only takes 'static
data, and when it is found to be unreachable, it is dropped and its destructor is called. (There are safety checks in place to prevent you from exploiting cyclical data in the destructor.) This saves users the pain of having to deal with Finalize
in the common case. The run_with_gc_cleanup
method is also provided, and helps ensure destructors are run at the end of your program.
5 Clean Finalization) Sometimes, however, you need non-'static
data. Unfortunately, since we may scan data after lifetimes have expired, you cannot use raw references. Instead you must replace &'a T
with R<'a, T>
and &'a mut T
with RMut<'a, T>
. Outside of Scan
and Finalize
implementations, these can be used as if they were regular references. Speaking of Finalize
, if you implement it, and use Gc::new_with_finalizer
, then you can use unsafe code to define code to be run right before your data is deallocated.
6 Concurrent Collection and Destruction) Finally, as a nice side effect of shredder
s design, automatic collection always happens in a background thread, and hopefully doesn't implement your regular processing too badly. Same goes for running destructors, which might actually be a performance win if your destructors are expensive.
shredder
isn't perfect. It needs a lot more performance optimization, and I think the background collection can become even more concurrent in the future. It also has a non-trivial memory overhead. And of course it is very new and immature.
However, I think that's okay for now. I designed shredder
to be another tool in the Rust programmer's toolbox. Gc
is a very extreme escape hatch that lets you escape usual ownership rules. It's something that should be used carefully in small doses, where giving up the superior performance of RAII memory management is worth it. (Perhaps for user provided data with unpredictable loops, or interestingly shaped graphs.) I'm excited to see what people can do with it!
It's available on crates.io and Github. (If you've read this far, maybe you're interested in contributing? I've put some issues on the Github, and that might be a good place to start!)