Paper Reading 09-09: C++ and the Perils of Double-Checked Locking

An interesting paper on the perils of C++, design pattern and multi-threading when they’re mixed together:

C++ and the Perils of Double-Checked Locking

The DCLP(Double-Checked Locking Pattern) is often-used in singleton design pattern: you’d like to initialize a shared object for singleton pattern, you follow the steps:

  • check lock if the resource is already initialized
  • if no, lock the mutex
  • check again if the resource is locked inside the mutex-protected area.
  • and again if no, initialize the object

See C++ example:

1
2
3
4
5
6
7
8
9
10
11
Singleton* Singleton::instance() {
if (pInstance == 0) { // 1st check, to avoid locking every time
Lock lock;

if (pInstance == 0) { // 2nd check, a safe check to guarantee correctness
pInstance = new Singleton;
}
}

return pInstance
}

This pattern however, introduces subtle bugs when described in C++ with multi-threading.

The issue is with this statement:

1
pInstance = new Singleton;

The following steps happen:

  1. Allocate memory for the object
  2. Construct an object in the allocated memory.
  3. Assign pInstance to the allocated memory.

But C++ specification don’t enforce the steps happen in order, and compilers are therefore not constrained to reorder them for sake of optimization. As long as the observable outcome of the instructions are correct, compilers are free to place instructions in an order so that CPUs are most utilized. Consider the following case with DCLP:

  • Thread A execute the DCLP piece of code for the first time, performs the 1st check, lock the mutex, performs the 2nd check, allocates memory for Singleton object, points the pInstance to the allocated memory. But before the Singleton object is constructed, thread A is suspended or another thread is scheduled at the same time.
  • Thread B enters DCLP area, determines that pInstance is non-null, and start using the object even before it’s fully constructed, and start accessing the Singleton object.

Oops. This is a very subtle bug, and hard to detect issue when we’re trying to initialize a shared resource once.

The paper digs into details on how compiler can leverage all sorts of different optimizations to spoil you effort to correct the DCLP code, and how to actually implement it correctly with volatile keyword.

It’s a very interesting paper on algorithm, C++, and programming, It makes you stand in awe of the difficulty and intricacies of C++ and multi-threaded programming.

A Note on Linux Hugepages

Page table is where Linux stores virtual to physical page address translation, and its size can get huge when memory usage is high. One way to reduce the size of page tables, and reduce the number of page faults, is to use huge pages. I’ve been digging some information on hugepages for my own curiosity, and it looks like Linux has pretty good support for huge pages. And this blog serves as a quick note on my readings.

Read More