Rust's fearless concurrency in practice

Unfortunately, using threads is not a free and easy win.

Concurrency issues are feared by a lot of developers. Because of their unpredictable behavior, they are extremely hard to spot and debug. They can go undetected for a long time, and then, one day, simply because your system is handling more requests per second or because you upgraded your CPU, your application starts to behave strangely. The cause is almost always a concurrency bug hidden in your codebase.

This post contains excerpts from my course Black Hat Rust where you'll learn Rust, offensive security and cryptography.

One of the most fabulous things about Rust is that, thanks to its ownership system, the compiler guarantees that our programs (as long as we stay in safe Rust) are free of data races.

For example, when we try to modify a vector at (roughly) the same time in two different threads:

ch_02/snippets/thread_error/src/main.rs

use std::thread;

fn main() {
    let mut my_vec: Vec<i64> = Vec::new();

    thread::spawn(|| {
        add_to_vec(&mut my_vec);
    });

    my_vec.push(34)
}

fn add_to_vec(vec: &mut Vec<i64>) {
    vec.push(42);
}

The compiler throws the following error:

error[E0373]: closure may outlive the current function, but it borrows `my_vec`, which is owned by the current function
 --> src/main.rs:7:19
  |
7 |     thread::spawn(|| {
  |                   ^^ may outlive borrowed value `my_vec`
8 |         add_to_vec(&mut my_vec);
  |                         ------ `my_vec` is borrowed here
  |
note: function requires argument type to outlive `'static`
 --> src/main.rs:7:5
  |
7 | /     thread::spawn(|| {
8 | |         add_to_vec(&mut my_vec);
9 | |     });
  | |______^
help: to force the closure to take ownership of `my_vec` (and any other referenced variables), use the `move` keyword
  |
7 |     thread::spawn(move || {
  |                   ^^^^^^^

error[E0499]: cannot borrow `my_vec` as mutable more than once at a time
  --> src/main.rs:11:5
   |
7  |       thread::spawn(|| {
   |       -             -- first mutable borrow occurs here
   |  _____|
   | |
8  | |         add_to_vec(&mut my_vec);
   | |                         ------ first borrow occurs due to use of `my_vec` in closure
9  | |     });
   | |______- argument requires that `my_vec` is borrowed for `'static`
10 |
11 |       my_vec.push(34)
   |       ^^^^^^ second mutable borrow occurs here

error: aborting due to 2 previous errors

Some errors have detailed explanations: E0373, E0499.
For more information about an error, try `rustc --explain E0373`.
error: could not compile `thread_error`

To learn more, run the command again with --verbose.

The error is explicit and even suggests a fix. Let's try it:

use std::thread;

fn main() {
    let mut my_vec: Vec<i64> = Vec::new();

    thread::spawn(move || { // <- notice the move keyword here
        add_to_vec(&mut my_vec);
    });

    my_vec.push(34)
}

fn add_to_vec(vec: &mut Vec<i64>) {
    vec.push(42);
}

But it also produces an error:

error[E0382]: borrow of moved value: `my_vec`
  --> src/main.rs:11:5
   |
4  |     let mut my_vec: Vec<i64> = Vec::new();
   |         ---------- move occurs because `my_vec` has type `Vec<i64>`, which does not implement the `Copy` trait
5  |
6  |     thread::spawn(move || { // <- notice the move keyword here
   |                   ------- value moved into closure here
7  |     // thread::spawn(|| {
8  |         add_to_vec(&mut my_vec);
   |                         ------ variable moved due to use in closure
...
11 |     my_vec.push(34)
   |     ^^^^^^ value borrowed here after move

error: aborting due to previous error

For more information about this error, try `rustc --explain E0382`.
error: could not compile `thread_error`

To learn more, run the command again with --verbose.

However hard we try, the compiler won't let us compile code with data races.
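
So how do we get both threads to push into the same vector? One common way (a minimal sketch, certainly not the only option) is to wrap the vector in an Arc<Mutex<...>> from the standard library: the Arc shares ownership between the threads, and the Mutex forces every access to go through a lock. The function name shared_push is made up for this example.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Same helper as before, but it now receives the lock instead of a `&mut Vec`.
fn add_to_vec(vec: &Mutex<Vec<i64>>) {
    vec.lock().unwrap().push(42);
}

// Hypothetical helper: pushes from two threads and returns the sorted content.
fn shared_push() -> Vec<i64> {
    let my_vec: Arc<Mutex<Vec<i64>>> = Arc::new(Mutex::new(Vec::new()));

    // Each thread gets its own handle to the same vector.
    let worker_vec = Arc::clone(&my_vec);
    let handle = thread::spawn(move || {
        add_to_vec(&worker_vec);
    });

    my_vec.lock().unwrap().push(34);

    // Wait for the spawned thread before reading the result.
    handle.join().unwrap();

    let mut result = my_vec.lock().unwrap().clone();
    // Which thread gets the lock first is not deterministic, so we sort.
    result.sort();
    result
}

fn main() {
    println!("{:?}", shared_push());
}
```

Both values end up in the vector, but the order in which they were pushed depends on which thread took the lock first, which is why the sketch sorts before returning.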

The three causes of data races

  • Two or more pointers access the same data at the same time.
  • At least one of the pointers is being used to write to the data.
  • There's no mechanism being used to synchronize access to the data.
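
All three conditions have to be present at once: remove any one of them and the data race disappears. As an illustration (a sketch, with a function name invented for this example), here several threads write to the same counter at the same time, but an atomic type from the standard library provides the synchronization, so the third condition no longer holds.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

// Hypothetical helper: `threads` threads each increment a shared counter
// `increments` times.
fn count_with_atomics(threads: u64, increments: u64) -> u64 {
    let counter = AtomicU64::new(0);

    // Scoped threads may borrow local variables: they are all joined
    // before `thread::scope` returns.
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..increments {
                    // Atomic read-modify-write: no increment can be lost.
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            });
        }
    });

    counter.load(Ordering::Relaxed)
}

fn main() {
    println!("{}", count_with_atomics(4, 1000)); // prints 4000
}
```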

The three rules of ownership

  • Each value in Rust has a variable that's called its owner.
  • There can only be one owner at a time.
  • When the owner goes out of scope, the value will be dropped.
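
A tiny sketch of these three rules in action (the function names are invented for this example):

```rust
// Takes ownership of `words`: the vector is dropped when this function returns.
fn total_len(words: Vec<String>) -> usize {
    words.iter().map(|w| w.len()).sum()
}

fn ownership_demo() -> usize {
    let words = vec![String::from("fearless"), String::from("concurrency")];

    // Ownership moves from `words` to `moved`: there is only one owner at a time.
    let moved = words;

    // Uncommenting the next line fails to compile:
    // println!("{:?}", words); // error[E0382]: borrow of moved value: `words`

    // `moved` is moved again, into the function, which becomes the new owner.
    total_len(moved)
}

fn main() {
    println!("{}", ownership_demo()); // prints 19
}
```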

The two rules of references

  • At any given time, you can have either one mutable reference or any number of immutable references.
  • References must always be valid.
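
And the same kind of sketch for references (again, the names are made up for this example): any number of immutable references is fine, but a mutable one has to be alone.

```rust
fn sum(values: &[i64]) -> i64 {
    values.iter().sum()
}

fn push_answer(values: &mut Vec<i64>) {
    values.push(42);
}

fn borrow_demo() -> i64 {
    let mut v = vec![34];

    // Any number of immutable references can coexist.
    let a = &v;
    let b = &v;
    let before = sum(a) + sum(b); // 34 + 34

    // `a` and `b` are not used after this point, so a mutable borrow is allowed.
    push_answer(&mut v);

    // Using `a` here would keep the immutable borrow alive across the
    // mutable one and fail to compile:
    // println!("{:?}", a); // error[E0502]

    before + sum(&v) // 68 + (34 + 42)
}

fn main() {
    println!("{}", borrow_demo()); // prints 144
}
```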

These rules are extremely important and are the foundations of Rust's memory and concurrency safety.

If you need more details about ownership, take some time to read the dedicated chapter online.

Other concurrency problems

Data races are not the only concurrency bugs: there are also deadlocks and race conditions, and the compiler does not protect us from those.
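
A deadlock, for instance, compiles without a single warning: if one thread locks a and then b while another locks b and then a, each can end up waiting for the other forever. A usual convention to avoid this is to always take the locks in the same order. Here is a minimal sketch of that convention, with two hypothetical account balances:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical example: two transfers in opposite directions between two accounts.
fn transfer_both_ways() -> (i64, i64) {
    let a = Arc::new(Mutex::new(100_i64));
    let b = Arc::new(Mutex::new(100_i64));

    // Thread 1 moves 10 from `a` to `b`, locking `a` first, then `b`.
    let (a1, b1) = (Arc::clone(&a), Arc::clone(&b));
    let t1 = thread::spawn(move || {
        let mut from = a1.lock().unwrap();
        let mut to = b1.lock().unwrap();
        *from -= 10;
        *to += 10;
    });

    // Thread 2 moves 20 from `b` to `a`, but still locks `a` first, then `b`.
    // Locking `b` first here is what could deadlock with thread 1.
    let (a2, b2) = (Arc::clone(&a), Arc::clone(&b));
    let t2 = thread::spawn(move || {
        let mut to = a2.lock().unwrap();
        let mut from = b2.lock().unwrap();
        *from -= 20;
        *to += 20;
    });

    t1.join().unwrap();
    t2.join().unwrap();

    let balances = (*a.lock().unwrap(), *b.lock().unwrap());
    balances
}

fn main() {
    println!("{:?}", transfer_both_ways()); // prints (110, 90)
}
```

Nothing in the type system enforces the lock order: it is a discipline we have to follow ourselves.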

Adding multithreading to our scanner

Now that we have seen what multithreading is in theory, let's see how to do it in idiomatic Rust.

Usually, multithreading is dreaded by developers because of the high probability of introducing the bugs we have just seen.

But in Rust this is another story. Other than for launching long-running background jobs or workers, it's rare to directly use the thread API from the standard library.

Instead, we use rayon, a data-parallelism library for Rust.

Why a data-parallelism library? Because thread synchronization is hard. It's better to design our programs in a functional way that doesn't require threads to be synchronized.

ch_02/tricoder/src/main.rs

// ...
use rayon::prelude::*;


fn main() -> Result<()> {
    // ..
    // we use a custom threadpool to improve speed
    let pool = rayon::ThreadPoolBuilder::new()
        .num_threads(256)
        .build()
        .unwrap();

    // pool.install is required to use our custom threadpool, instead of rayon's default one
    pool.install(|| {
        let scan_result: Vec<Subdomain> = subdomains::enumerate(&http_client, target)
            .unwrap()
            .into_par_iter()
            .map(ports::scan_ports)
            .collect();

        for subdomain in scan_result {
            println!("{}:", &subdomain.domain);
            for port in &subdomain.open_ports {
                println!("    {}", port.port);
            }

            println!();
        }
    });
    // ...
}

Aaaand... That's all. Really. We replaced into_iter() with into_par_iter() (which means "into parallel iterator"; what is an iterator? More on that in chapter 3), and now our scanner scans the different subdomains in parallel, on the threads of our pool.
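
To get an intuition of what a parallel map does, here is a std-only sketch of the idea. It is not how rayon is implemented (rayon uses a work-stealing thread pool), and parallel_map is a name invented for this example. The input is split into chunks, each chunk is mapped on its own thread, and the results are gathered in order. No thread ever writes to data that another thread reads, so there is nothing to synchronize.

```rust
use std::thread;

// Hypothetical helper: applies `f` to every item, using up to `workers` threads.
fn parallel_map<T, R, F>(items: &[T], workers: usize, f: F) -> Vec<R>
where
    T: Sync,
    R: Send,
    F: Fn(&T) -> R + Sync,
{
    let workers = workers.max(1);
    // Ceiling division, and never 0 because `chunks(0)` panics.
    let chunk_size = ((items.len() + workers - 1) / workers).max(1);
    let f = &f;

    thread::scope(|s| {
        // One scoped thread per chunk; each thread only reads its own chunk.
        let handles: Vec<_> = items
            .chunks(chunk_size)
            .map(|chunk| s.spawn(move || chunk.iter().map(f).collect::<Vec<R>>()))
            .collect();

        // Joining in spawn order keeps the results in the input order.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().unwrap())
            .collect()
    })
}

fn main() {
    let squares = parallel_map(&[1u64, 2, 3, 4, 5], 2, |x| x * x);
    println!("{:?}", squares); // prints [1, 4, 9, 16, 25]
}
```

In the scanner, the role of f is played by ports::scan_ports and the role of items by the list of subdomains.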

