Reading Summary: 11/06/2020

Technology

Edsger Dijkstra: The Man Who Carried Computer Science on His Shoulders

The often untold story behind a mastermind of Computer Science: Dijkstra, whose name has been an important algorithm widely used in GPS navigation.

The blog described a wise, hard-thinker, a great mind who made unparallel contributions to both Computer Science as a mathematical and logical view, as well as Software Engineering which focuses on building software and hardware components.

He’s most famous for his private reports, named “EWD”, and continued for more than forty years, describing his views on Computer Science and Software Engineering in general, and sometimes worked as reviews for others’ work. One of the most influencing “EWD” report was “Notes on Structured Programming,” which argued programming as a serious form of skill that demands intellectual rigor.

In 1972, Dijkstra received the ACM Turing Award, he was recognized for:

contributions to programming as a high, intellectual challenge; for eloquent insistence and practical demonstration that programs should be composed correctly, not just debugged into correctness; for illuminating perception of problems at the foundations of program design.

He has great passion for his art, and his strong personality sometimes sparked controversies. One of the most famous was the discussion on critiquing “GOTO” statements as harmful. It brought widespread, heated debate, yet Dijkstra’s view finally prevailed, and his insistence made a monumental change to programming paradigm.

There are much more interesting details around his personal and academic life in the original post, too long to be summarized here. For example, his had a mini-van in Austin, which he often drove to national parks with his wife, and it was named the “Touring Machine.” If you are passionate with computers and software, have a long weekend afternoon, it’s worth a good read.

Read More

Paper Reading 09-09: C++ and the Perils of Double-Checked Locking

An interesting paper on the perils of C++, design pattern and multi-threading when they’re mixed together:

C++ and the Perils of Double-Checked Locking

The DCLP(Double-Checked Locking Pattern) is often-used in singleton design pattern: you’d like to initialize a shared object for singleton pattern, you follow the steps:

  • check lock if the resource is already initialized
  • if no, lock the mutex
  • check again if the resource is locked inside the mutex-protected area.
  • and again if no, initialize the object

See C++ example:

1
2
3
4
5
6
7
8
9
10
11
Singleton* Singleton::instance() {
if (pInstance == 0) { // 1st check, to avoid locking every time
Lock lock;

if (pInstance == 0) { // 2nd check, a safe check to guarantee correctness
pInstance = new Singleton;
}
}

return pInstance
}

This pattern however, introduces subtle bugs when described in C++ with multi-threading.

The issue is with this statement:

1
pInstance = new Singleton;

The following steps happen:

  1. Allocate memory for the object
  2. Construct an object in the allocated memory.
  3. Assign pInstance to the allocated memory.

But C++ specification don’t enforce the steps happen in order, and compilers are therefore not constrained to reorder them for sake of optimization. As long as the observable outcome of the instructions are correct, compilers are free to place instructions in an order so that CPUs are most utilized. Consider the following case with DCLP:

  • Thread A execute the DCLP piece of code for the first time, performs the 1st check, lock the mutex, performs the 2nd check, allocates memory for Singleton object, points the pInstance to the allocated memory. But before the Singleton object is constructed, thread A is suspended or another thread is scheduled at the same time.
  • Thread B enters DCLP area, determines that pInstance is non-null, and start using the object even before it’s fully constructed, and start accessing the Singleton object.

Oops. This is a very subtle bug, and hard to detect issue when we’re trying to initialize a shared resource once.

The paper digs into details on how compiler can leverage all sorts of different optimizations to spoil you effort to correct the DCLP code, and how to actually implement it correctly with volatile keyword.

It’s a very interesting paper on algorithm, C++, and programming, It makes you stand in awe of the difficulty and intricacies of C++ and multi-threaded programming.

A Note on Linux Hugepages

Page table is where Linux stores virtual to physical page address translation, and its size can get huge when memory usage is high. One way to reduce the size of page tables, and reduce the number of page faults, is to use huge pages. I’ve been digging some information on hugepages for my own curiosity, and it looks like Linux has pretty good support for huge pages. And this blog serves as a quick note on my readings.

Read More