Making a Toy Gradient Descent Implementation

1 Introduction

I’ve recently came across a few of Andrej Karpathy’s video tutorial series on Machine Learning and I found them immensely fun and educational. I highly appreciate his hands-on approach to teaching basic concepts of Machine Learning.

Richard Feynman once famously stated that “What I cannot create, I do not understand.” So here’s my attempt to create a toy implementation of gradient descent, to better understand the core algorithm that powers Deep Learning after learning from by Karpathy’s video tutorial of micrograd:

https://github.com/hxy9243/toygrad

Even though there’s a plethora of books, blogs, and references that explains the gradient descent algorithm, it’s a totally different experience when you get to build it yourself from the scratch. During this course I found there are quite a few knowledge gaps for myself, things that I’ve taken for granted and didn’t really fully understand.

And this blog post is my notes during this experience. Even writing this post helped my understanding in many ways.

Read More