Introduction

I'm happy to announce a new release of shredder! In case you don't know, shredder is a garbage collection library for Rust. It's safe, easy to use, and concurrency friendly. Here is a small example of it in action:

use std::cell::RefCell;

use shredder::{
    number_of_active_handles, 
    number_of_tracked_allocations, 
    run_with_gc_cleanup, 
    Gc, Scan,
};

#[derive(Scan)]
struct Node {
    data: String,
    directed_edges: Vec<Gc<RefCell<Node>>>,
}

fn main() {
    // Using `run_with_gc_cleanup` is good practice, since it helps ensure destructors are run 
    // (but this is not required for correctness)
    run_with_gc_cleanup(|| {
        let a = Gc::new(RefCell::new(Node {
            data: "A".to_string(),
            directed_edges: Vec::new(),
        }));

        let b = Gc::new(RefCell::new(Node {
            data: "B".to_string(),
            directed_edges: Vec::new(),
        }));

        // Usually would need `get` for `Gc` data, but `RefCell` is a special case
        a.borrow_mut().directed_edges.push(b.clone());
        b.borrow_mut().directed_edges.push(a);
        // We now have cyclical data!
    });
    // Everything was cleaned up!
    assert_eq!(number_of_tracked_allocations(), 0);
    assert_eq!(number_of_active_handles(), 0);
}

And you can read more on Github or in my previous blog post!

While most of the feature work for this was done a while ago, I'm bad at actually writing notes and releasing things, so this announcement/crates.io release is a bit delayed.

New Features

Trait System Refactor (@Others)

The main goal of this release was to enable some sort of Deref support. After thinking on this for a while, I came to the conclusion this was impossible with the original trait design. In 0.1, there were two critical traits: GcSafe and Scan. GcSafe captured every invariant required for data to be stored in a Gc. But in 0.2, with multiple smart pointers, that design falls apart.

That means that we needed to break down into more marker traits. Now in 0.2 we have:

  • GcSafe = captures being Send + 'static, or enough like Send + 'static to be safely scanned and managed in the background with exclusive access
  • GcDrop = can safely run the destructor of this data in the background (allowable to access this data in a Drop implementation run by the collector)
  • GcDeref = Sync + can be safely stored in a DerefGc (see below)
  • Scan = captures the scanning logic, as previously

These are of course imprecise definitions of unsafe traits--if you're curious, the exact rules for implementing these can be found in the crate documentation.

DerefGc (@Others)

Now for the most exciting new feature! The new smart pointer DerefGc.

First, a discussion of the issues with deref in the context of shredder. shredder lets you store !Sync data in Gc, so it's not possible to simply implement Deref for Gc (since data is scanned in the background). Further complicating this is the issue of Drop and interior mutability. shredder being able to run Drop implementations in the background is only possible due to the get guard we use to prevent access of data inside a dead Gc. This guard is also needed to prevent interior mutability being used to modify the object graph while the collector is running.

All that said, it is still possible to implement a DerefGc type. However we need to set the following restrictions:

  1. The data stored in DerefGc must be Sync
  2. We can't naively run the Drop implementation of the interior data, or access the DerefGc in any Drop implementation run by the collector
  3. The data inside the DerefGc must be immutable through &T (no interior mutability), except if that interior mutable data is nested in a Gc

To assure these conditions, we utilize the new trait system. DerefGc is !GcDrop, assuring (2). Then, for any T in DerefGc<T>, T must be GcDeref, a new trait capturing condition (1) and (3).

While these restrictions (especially 3) are strict, it still leaves us with as useful a Deref type as I believe is possible with shredder's current restrictions. The nice thing is that accessing data inside a DerefGc can never block (unlike calling get on a regular Gc)

AtomicGc (@Others)

Along with enabling DerefGc, implementing the new trait system also gave me the tools to implement an atomic version of Gc. This is still pretty experimental, but it is pretty similar to crossbeams Atomic type (but with completely different performance characteristics). This type is !GcDrop for similar reasons to DerefGc.

This feature is pretty raw, and I'm pushing it for experimentation. It needs a bit more performance optimization to be truly useful.

Finalize (@Others)

With these new smart pointer types, just using Drop is no longer good enough for shredder. While Gc itself will continue to default to Droping things, many new types are !GcDrop. Thus, like all gc crates before us, we are introducing a Finalize trait.

This is quite similar to Drop, but unsafe to implement, and with many restrictions to prevent unsoundness. (see crate documentation) This is what you'll need for cleanup logic for DerefGc<T>. Gc also has support if you'd prefer not to use Drop.

Derive Improvements (@Others)

The Scan derive has been updated to support enums, and implementing the new traits described above. GcDrop is opt-out, and GcDeref is opt-in. (This is for backwards compatibility reasons.) Please read the crate documentation for details.

Also in this release is a new Finalize derive, which eases implementation of Finalize for structs with finalizable fields.

Both derives also now support enums.

Scan for fXX, bool and BTreeMap and BTreeSet (@alekratz, @DrGabble)

In the realm of "this shows someone else has run this code." Some new types got Scan implementations in this release! There are tons more standard library types that need traits implemented, and these are easy contributions if you want to get started with the codebase.

Unsized Types (@alekratz)

In the most sizable contribution from another human so far, @alekratz contributed support for storing unsized types in Gcs!  Since CoerceUnsized is nightly only (with no obvious route to stabilization?), this feature is a lot more useful if you are on the dark side, where you can now store whatever unsized type you want in a Gc:

let data: Gc<dyn Debug> = Gc::new(0u32);

That being said, we also have janky stable support using the ToScan trait. This comes with a few asterisks. In order to get a Gc<dyn D> on stable you basically need to control the definition of D and your data needs to be 'static. Until we have better ways of working with unsized data on stable, or someone contributes a cleaner hack, that is.

Downcast (@alekratz, @Others)

Along with being able to store trait objects, you also can now downcast them! While it's easy to forget the Any trait, for some applications a level of dynamic reflection is a must. With Gc now supporting unsized types and downcasting, there is definitely some cool interpreter to be built here!

Collector Refactoring (@Others)

As part of onboarding other contributors to the codebase, the core collector code was refactored and better documented. This should helpfully make it easier to read and work on this slightly thorny code. (Which is ultimately just a simple mark and sweep collector, with some Rust specific tweaks.)

Also as part of this work, the concurrent HashMaps we were using behind the scenes were removed. They were replaced with a nifty chunked-linked-list plus arc-swap combo. Huge shoutout to the arc-swap crate, which is a beautiful  piece of engineering, and also indirectly powers the DerefGc mentioned above.

Soundness Fixes (@Others)

Also while working in the collector, Inoticed a subtle soundness hole in the collector. The issue was related to how garbage is destructed in the background. Here is a diagram:

The essential problem is that destructors for garbage could hold handles to other garbage data (which would be otherwise unreachable), and send them to other threads. The solution is to mark all garbage as "dead" before running any destructors. (Which is slightly less efficient, but sound.) This allows "zombie" Gcs to exist, but since dead Gcs can't be turned into references, there is no soundness hole.

The Future (0.3.0?)

In the future, I hope to spend some more time optimizing the crate. While both memory usage and CPU usage have been vastly reduced since initial versions, I have more ideas to try and optimize, and I don't want to leave performance on the table. ( I wanted to save this work until 0.2 has baked in the wild .) The two major optimizations are: (1) switching from tracking handles individually to simple handle counts, and (2) shortening the "stop the world" phase of collection with a more greedy approach.

I also feel shredder is reaching the point where a major limiting factor is lack of use cases. This supports my initial hypothesis that garbage collection is not needed in 99% of Rust programs. I do think there is room for me to experiment in places it could be useful. (For example, it'd be awesome to have a concurrent hash-map backed by AtomicGc rather than crossbeam's epoch based reclamation, if only just to compare performance!) And of course if you have a use-case well served by garbage collection, please get in touch!

In terms of feature work, I'd quite like the crate to support no-std use-cases where alloc is available. While there is no fundamental reason why this should be impossible, the pervasive use of rayon, parking-lot and threading internally poses an issue. I think solving this properly requires writing a shim layer to abstract away the concurrent aspects from the collector implementation.

If you're interested in helping with this project (either to work on something mentioned in this post, or a smaller issue), feel free to have a look at the Github issues, where I've documented what known issues and features need some love!