04 05 2020

Book Review: Black Swan - The Impact of the Highly Improbable

I’ve just finished the main part (everything but the postscript essays) of The Black Swan, the famous, oft-discussed former bestseller. The author is knowledgeable, and the book is insightful and well crafted, mixing serious topics with anecdotes and vivid storytelling in his distinctive style. It was a fantastic ride.

01 25 2020

Reading

Reading Summary: Ultralearning

Ultralearning is quite an interesting book from one of my favorite bloggers, Scott Young. Famous for his “MIT Challenge”, in which he completed four years of MIT coursework in a single year entirely through self-study, he now blogs regularly on study methods, student cognition, and everything related.

This book summarizes his research on and experience with studying. He argues that there is a way to learn and improve yourself through intensive training and exercise. Like training muscles, you can adopt an extraordinary, unorthodox training plan for your brain and pick up a new skill in a short amount of time, be it a foreign language, programming, sketching, or even public speaking. He calls it “ultralearning.” For the book, he went through many references and interviewed like-minded friends who had similar experiences of acquiring or improving a skill intensively. He then distills the essential principles into a guide to a successful “ultralearning” project.

01 20 2020

PaperReading

Paper Reading: Zookeeper

Paper: https://www.usenix.org/legacy/events/atc10/tech/full_papers/Hunt.pdf

Presentation: https://www.usenix.org/conference/usenix-atc-10/zookeeper-wait-free-coordination-internet-scale-systems

11 03 2019

Reading

Book Review: What the Dormouse Said

https://play.google.com/store/books/details?pcampaignid=books_read_action&id=cTyfxP-g2IIC
https://www.amazon.com/What-Dormouse-Said-Counterculture-Personal/dp/0143036769

When logic and proportion

Have fallen sloppy dead

And the White Knight is talking backwards

And the Red Queen’s off with her head

Remember what the dormouse said

Feed your head

Feed your head

09 16 2019

Reading

Book Review: Data and Goliath

https://play.google.com/store/books/details/Bruce_Schneier_Data_and_Goliath_The_Hidden_Battles?id=MwF-BAAAQBAJ
https://www.amazon.com/dp/039335217X/

“Data and Goliath” is an excellent book a friend recommended. It summarizes the dangerous and negative ways in which data and “Big Data” technology can shape our societies. The author, Bruce Schneier, is a prominent cryptography expert who has published influential work on cryptography and on privacy issues. He’s also on the board of directors of the Electronic Frontier Foundation.

https://www.schneier.com/blog/about/
https://en.m.wikipedia.org/wiki/Bruce_Schneier
Renowned Security Expert Bruce Schneier Joins EFF Board of Directors

08 18 2019

Reading

Reading Summary 2019-08

Cassandra Time Series Bucketing

How to model time series data with Cassandra.

Simple GoRPC

The best way to understand something is to build it yourself. This tutorial covers basic network programming in Go, struct design, and the use of the reflect package.
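
As a rough illustration of where reflect comes in, the sketch below looks up a method by name and invokes it, which is roughly what an RPC server has to do when a request names the method to run. This is not the tutorial's code; the `Arith` service and the `call` helper are hypothetical, and a real server would also validate argument types, since `reflect.Value.Call` panics on a mismatch.

```go
package main

import (
	"fmt"
	"reflect"
)

// Arith is a hypothetical service used only for illustration.
type Arith struct{}

// Add returns the sum of its two arguments.
func (Arith) Add(a, b int) int { return a + b }

// call looks up an exported method by name with reflect and invokes it.
// It checks only the argument count; argument types are assumed to match.
func call(service interface{}, method string, args ...interface{}) ([]interface{}, error) {
	m := reflect.ValueOf(service).MethodByName(method)
	if !m.IsValid() {
		return nil, fmt.Errorf("unknown method %q", method)
	}
	if m.Type().NumIn() != len(args) {
		return nil, fmt.Errorf("%s expects %d arguments, got %d", method, m.Type().NumIn(), len(args))
	}
	in := make([]reflect.Value, len(args))
	for i, a := range args {
		in[i] = reflect.ValueOf(a)
	}
	out := m.Call(in)
	results := make([]interface{}, len(out))
	for i, v := range out {
		results[i] = v.Interface()
	}
	return results, nil
}

func main() {
	res, err := call(Arith{}, "Add", 2, 3)
	fmt.Println(res, err)
}
```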

Optimizing M3: How Uber Halved Our Metrics Ingestion Latency by Forking the Go Compiler

A great experience-sharing post on debugging a performance issue in a production service. Using profiling and analysis tools, the Uber team traced the issue to worker pools and goroutine stack allocation, then forked the Go compiler to prove that it was a regression in the compiler itself. A very nice read, with a clear analysis process.

Book: Programming Models for Distributed Computation

A book on topics in distributed computation, drawn from the experience of teaching a distributed systems course at Northeastern University.

Spotify Engineering Culture

A very nice engineering blog post from 2014. An excellent overview of Spotify’s culture and an introduction to how they build “agile” teams.

How We Helped Our Reporters Learn to Love Spreadsheets

NYTimes has released its in-house course for teaching journalists data science. Journalism can also benefit from some coding and data analytics skills.

05 05 2019

Reading

Reading Summary 2019-04

An Overview of Go’s Tooling

If Go is one of your favorite languages as well, this is a must-read: it introduces all the basic tooling that comes with Go’s ecosystem, which can save you a great deal of time.

HackerNews thread on TLA+:

A thread from HackerNews, discussing the importance of formal verification for distributed systems.

TLA+ and formal verification are notorious for their complexity and steep learning curve. Learning them might be one of my goals for the near future.

Are You a Software Architect?

What it takes to be a software architect, a great blog post from InfoQ.

InfluxData is Building a Fast Implementation of Apache Arrow in Go Using c2goasm and SIMD

TIL that it is possible to convert assembly generated from C/C++ into Go assembly and call it from Go code. InfluxData uses this tooling to embed AVX/SSE instructions in Go assembly, boosting the performance of Go code, sometimes by orders of magnitude.

More information on this tool, c2goasm, which comes from Minio.

Org-Mode Is One of the Most Reasonable Markup Languages to Use for Text

I think so, too. But it’ll require a community and proper tooling to see it really prosper. Hope to see that some day.

Why and How Capitalism Needs to Be Reformed

A great piece from Ray Dalio, founder of the investment firm Bridgewater Associates and a seasoned investor. In this recent long post he discusses why American capitalism is doing a poor job of distributing resources, especially educational resources, and why it needs to be reformed to stay healthy.

04 01 2019

PaperReading

Blog Reading: The log - What every software engineer should know about real-time data's unifying abstraction

Link: https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying

Kafka is a message queue, a pub-sub system, an event sourcing tool, and a stream processing infrastructure, and it is a key part of many distributed systems that rely on streaming data. Its underlying idea is to aggregate data from distributed sources into a single, unified linear log structure.
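
To picture the log abstraction, here is a toy in-memory sketch in Go: writers append records and get back an offset, and each reader consumes from whatever offset it has reached. The `Log` type and its methods are hypothetical names of mine and say nothing about how Kafka is actually implemented; persistence, partitioning, and replication are all ignored.

```go
package main

import "fmt"

// Log is a toy append-only log: records are only ever added at the end,
// and every record is identified by its offset.
type Log struct {
	records []string
}

// Append adds a record at the end of the log and returns its offset.
func (l *Log) Append(r string) int {
	l.records = append(l.records, r)
	return len(l.records) - 1
}

// ReadFrom returns a copy of all records at or after offset, in append order.
func (l *Log) ReadFrom(offset int) []string {
	if offset < 0 || offset >= len(l.records) {
		return nil
	}
	return append([]string(nil), l.records[offset:]...)
}

func main() {
	var l Log
	// Records from different sources end up in one linear sequence.
	l.Append("db: user 1 created")
	l.Append("web: page viewed")
	last := l.Append("db: user 1 updated")

	// A consumer that has already processed offset 0 resumes at offset 1.
	fmt.Println(last, l.ReadFrom(1))
}
```

Because every consumer tracks only an offset, many independent readers can share the same log without coordinating with each other.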

The post is from Kafka’s creator Jay Kreps, written while he was at LinkedIn, and it contemplates the log abstraction as a key part of any distributed system. It is not Kafka’s design paper, an implementation guide, or a tutorial, but rather an account of how the idea that led to Kafka’s birth took shape, and I found it equally interesting. The following are my notes.

The link to Kafka paper: https://www.semanticscholar.org/paper/Kafka-%3A-a-Distributed-Messaging-System-for-Log-Kreps/9f948448e7a5f0cc94cd53656410face8b31b18a

03 17 2019

Reading

Reading-Summary 2019-03

10 Breakthrough Technologies in 2019, by Bill Gates

Take a look at what Mr. Gates thinks are the greatest technology breakthroughs right now. The list might surprise you.

What happens when you click Play button on Netflix

How Netflix leverages AWS technologies to build a world-scale, highly available, fault-tolerant distributed video streaming system.

Lyft Case Study - Amazon Web Services

Lyft architecture evolution on AWS.

Compounding Knowledge

From Farnam Street – an interesting blog site I found recently.

Also on Farnam Street and its “mental models”: The Mental Model Fallacy. TL;DR: the so-called “mental models” from Farnam Street are not of much value when they come from non-practitioners. To learn business, like basketball or swimming, you need to actually practice in order to pick up the intricate knowledge that is not easily put into writing.

Parsing Gigabytes of JSON per Second

Unfortunately I didn’t have time to finish reading this paper, but it was good to learn about branchless algorithms, which keep the CPU pipeline full and achieve amazing performance.

03 10 2019

PaperReading

Paper Reading: Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center

Link to paper: https://people.eecs.berkeley.edu/~alig/papers/mesos.pdf

Presentation: https://www.usenix.org/conference/nsdi11/mesos-platform-fine-grained-resource-sharing-data-center

Mesos is cluster resource management software from UC Berkeley. Unlike many existing systems, Mesos is designed to support heterogeneous frameworks (Hadoop, MPI, etc.) in the same cluster and to share resources between them. It does so by providing a thin layer that makes resource offers to the framework schedulers and delegates the scheduling decisions to the frameworks themselves.

With this design, Mesos can achieve pretty good elasticity between frameworks, and letting frameworks choose their own resources results in better data locality.
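
The resource-offer idea can be sketched in a few lines of Go. This is a simplified model of my own, not the Mesos API: `Offer`, `Framework`, `localityFramework`, and `offerRound` are hypothetical names, and the real system also deals with allocation policy, offer revocation, and task launch.

```go
package main

import "fmt"

// Offer describes free resources on one node (hypothetical and simplified).
type Offer struct {
	Node  string
	CPUs  int
	MemMB int
}

// Framework decides for itself which offers to accept.
type Framework interface {
	Name() string
	Accept(o Offer) bool
}

// localityFramework accepts only offers on nodes that already hold its data.
type localityFramework struct {
	name      string
	dataNodes map[string]bool
}

func (f localityFramework) Name() string        { return f.name }
func (f localityFramework) Accept(o Offer) bool { return f.dataNodes[o.Node] && o.CPUs >= 1 }

// offerRound plays the thin resource layer: it presents each offer to the
// frameworks in order and records which framework, if any, accepted it.
func offerRound(offers []Offer, frameworks []Framework) map[string]string {
	placed := make(map[string]string)
	for _, o := range offers {
		for _, f := range frameworks {
			if f.Accept(o) {
				placed[o.Node] = f.Name()
				break
			}
		}
	}
	return placed
}

func main() {
	offers := []Offer{{"node-a", 4, 8192}, {"node-b", 2, 4096}}
	frameworks := []Framework{
		localityFramework{"hadoop", map[string]bool{"node-b": true}},
		localityFramework{"mpi", map[string]bool{"node-a": true}},
	}
	fmt.Println(offerRound(offers, frameworks))
}
```

The offering layer never needs to know why a framework declines an offer, which is what keeps that layer thin.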

Kevin Hu's Blog

A Hungry Fool
