Using compare-and-swap (CAS) in Concurrency

What is Compare-and-Swap (CAS)?

Compare-and-swap (CAS) is an atomic instruction that compares the contents of a memory location to a given value and, if they are equal, modifies the contents of that memory location to a new value. If the contents are not equal, the memory location remains unchanged.

The CAS operation is typically implemented in hardware and is supported by modern processors. It provides a way to atomically update a shared variable without the need for locks, making it a fundamental building block for lock-free and wait-free algorithms.

The CAS operation takes three arguments:

The memory location to be updated.
The expected value of the memory location.
The new value to be written to the memory location.

The CAS operation returns a boolean value indicating whether the update was successful or not.

Here’s a pseudo-code representation of the CAS operation:

function cas(memory_location, expected_value, new_value):
if *memory_location == expected_value:
*memory_location = new_value
return true
else:
return false

Using CAS in Practice

CAS can be used to implement various concurrent data structures and algorithms. Let’s explore a few examples.

Atomic Counter

One common use case for CAS is implementing an atomic counter. An atomic counter is a shared variable that can be incremented or decremented concurrently by multiple threads without the need for locks.

Here’s an example implementation of an atomic counter using CAS in Java:

import java.util.concurrent.atomic.AtomicInteger;

public class AtomicCounter {
    private AtomicInteger count = new AtomicInteger(0);

    public void increment() {
        int expected;
        do {
            expected = count.get();
        } while (!count.compareAndSet(expected, expected + 1));
    }

    public int get() {
        return count.get();
    }
}

In this example, we use the AtomicInteger class from the java.util.concurrent.atomic package. The increment() method uses a CAS loop to atomically increment the counter. It keeps retrying until the CAS operation succeeds.

Lock-Free Stack

Another example of using CAS is implementing a lock-free stack. A lock-free stack allows multiple threads to push and pop elements concurrently without the need for locks.

Here’s an example implementation of a lock-free stack using CAS in Rust:

use std::sync::atomic::{AtomicPtr, Ordering};

struct Node<T> {
    value: T,
    next: AtomicPtr<Node<T>>,
}

pub struct Stack<T> {
    head: AtomicPtr<Node<T>>,
}

impl<T> Stack<T> {
    pub fn new() -> Self {
        Stack {
            head: AtomicPtr::new(std::ptr::null_mut()),
        }
    }

    pub fn push(&self, value: T) {
        let new_node = Box::into_raw(Box::new(Node {
            value,
            next: AtomicPtr::new(std::ptr::null_mut()),
        }));

        loop {
            let current_head = self.head.load(Ordering::Relaxed);
            unsafe {
                (*new_node).next.store(current_head, Ordering::Relaxed);
            }

            if self
                .head
                .compare_exchange(
                    current_head,
                    new_node,
                    Ordering::Release,
                    Ordering::Relaxed,
                )
                .is_ok()
            {
                break;
            }
        }
    }

    pub fn pop(&self) -> Option<T> {
        loop {
            let current_head = self.head.load(Ordering::Acquire);

            if current_head.is_null() {
                return None;
            }

            let next_node = unsafe { (*current_head).next.load(Ordering::Relaxed) };

            if self
                .head
                .compare_exchange(
                    current_head,
                    next_node,
                    Ordering::Release,
                    Ordering::Relaxed,
                )
                .is_ok()
            {
                unsafe {
                    let value = (*current_head).value;
                    Box::from_raw(current_head);
                    return Some(value);
                }
            }
        }
    }
}

In this Rust implementation, we define a Node struct to represent each node in the stack and a Stack struct to represent the stack itself. The push() method uses a CAS loop to atomically update the head of the stack with a new node. The pop() method uses a CAS loop to atomically update the head to the next node and returns the value of the popped node.

Advantages and Considerations

CAS has several advantages when used in concurrent programming:

Lock-Free: CAS enables lock-free and wait-free algorithms, avoiding the overhead and potential deadlocks associated with locks.
Scalability: CAS-based algorithms often scale well with increasing numbers of threads, as they allow multiple threads to compete for updating shared variables without blocking each other.
Performance: CAS can provide better performance compared to lock-based approaches in certain scenarios, especially when contention is low.

However, there are also some considerations to keep in mind when using CAS:

ABA Problem: CAS is susceptible to the ABA problem, where a value is read, modified, and then written back, but between the read and write, another thread modifies the value to a different value and then back to the original value. This can lead to incorrect behavior in certain algorithms.
Spinning: CAS-based algorithms often involve spinning in a loop until the CAS operation succeeds. If contention is high, this can lead to increased CPU usage and reduced performance.
Memory Ordering: CAS operations have specific memory ordering guarantees that need to be considered when designing concurrent algorithms. Incorrect usage of memory ordering can lead to subtle bugs and incorrect behavior.