Making a Toy Gradient Descent Implementation

1 Introduction

I recently came across a few of Andrej Karpathy’s video tutorial series on Machine Learning, and I found them immensely fun and educational. I really appreciate his hands-on approach to teaching the basic concepts of Machine Learning.

Richard Feynman once famously stated, “What I cannot create, I do not understand.” So here’s my attempt to create a toy implementation of gradient descent, to better understand the core algorithm that powers Deep Learning, after learning from Karpathy’s video tutorial on micrograd:

https://github.com/hxy9243/toygrad

Even though there’s a plethora of books, blogs, and references that explain the gradient descent algorithm, it’s a totally different experience when you get to build it yourself from scratch. Along the way I found quite a few gaps in my own knowledge: things that I’d taken for granted and didn’t fully understand.

This blog post collects my notes from that experience. Even writing this post helped my understanding in many ways.
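To make the core idea concrete, here is a minimal sketch of the gradient descent update rule in Go. It is not code from toygrad or micrograd: the function name `gradientDescent`, the example function f(x) = (x - 3)^2, and the learning rate and step count are all assumptions chosen for illustration, and the derivative is supplied by hand rather than computed by automatic differentiation.

```go
package main

import "fmt"

// gradientDescent is a hypothetical helper: it repeatedly moves x a small
// step against the gradient, which is the basic gradient descent update.
func gradientDescent(grad func(float64) float64, x0, lr float64, steps int) float64 {
	x := x0
	for i := 0; i < steps; i++ {
		x -= lr * grad(x) // step downhill, scaled by the learning rate
	}
	return x
}

func main() {
	// Assumed example: f(x) = (x - 3)^2, whose derivative is 2 * (x - 3).
	// The minimum is at x = 3.
	grad := func(x float64) float64 { return 2 * (x - 3) }

	x := gradientDescent(grad, 0, 0.1, 100)
	fmt.Printf("x = %.4f\n", x) // prints "x = 3.0000"
}
```

A library like micrograd goes further by deriving the gradient automatically from the computation graph; the update step itself stays the same.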

Training and inference example using toygrad

Read More

Hacking LangChain For Fun and Profit - I

1 Overview

Recently I’ve looked into the LangChain project, and I was surprised by how powerful and mature a project it has become in such a short span of time. It covers many essential tools for creating your own LLM-driven projects, abstracting cumbersome steps into only a few lines of code.

I like the direction the project is going in, and the development team has been proactive about bringing new ideas and the latest LLM features into the project.

The path to understanding this new project wasn’t really smooth. It has its own opinions about code organization, and it can be hard to guess how to hack on your own projects once you go beyond the tutorials. Many of the tutorials out there explain how to create a small application with LangChain but don’t cover how to intuitively comprehend its abstractions and design choices.

Hence I have taken the initiative to document my personal cognitive process throughout this journey. By doing so, I aim to clarify my own understanding while also providing assistance to y’all who are interested in hacking LangChain for fun and profit.

This blog post is dedicated to an overall understanding of all the concepts. I found it really helpful to start by understanding the concepts that directly interact with the LLM, especially the core API interfaces. Once you have a mind map of all the LangChain abstractions, it’s much more intuitive to hack on and extend your own implementation.

Read More

Paper Readings on LLM Task Performing

1 Overview

Ever since the release of ChatGPT, I’ve spent the last couple of months reading about the development of AI and NLP in general. Here are some of my personal findings, specifically on task-performing capabilities.

The field of AI has been advancing rapidly, and the results have exceeded expectations for many users and researchers. One particularly impressive development is the Large Language Model (LLM), which has demonstrated a remarkable ability to generate natural, human-like text. Another exciting example is ChatGPT from OpenAI, which has shown impressive task-performing and logical reasoning capabilities.

Looking ahead, I am optimistic that LLMs will continue to be incredibly effective at performing more complex tasks with the help of plugins, prompt engineering, and some human input and interaction. The potential applications for LLMs are vast and promising.

I’ve compiled a list of papers on extending the task-performing capabilities in this field. I’m quite excited about the longer-term potential of this capability to bring LLMs like ChatGPT to more powerful applications.

Here’s my first list of papers, which I also consider to be the more fundamental ones, along with my very quick summaries.

Read More

OpenAPI Generator For Go Web Development

The OpenAPI Generator for Go API and Go web app development works surprisingly well, but somehow I’ve found that it’s not mentioned very often. Recently I tried it in one of my projects, and in my (limited) experience with it, I was pleasantly surprised by how good it was. With some setup, it can generate Go code of decent quality, and it’s fairly easy to use once you get the hang of it. Whether you’re building a standalone web app from scratch or creating a service with REST API endpoints, openapi-generator might come in handy for you.

Using a generator can save a lot of time when kick-starting your web app project. More importantly, I found that a good, well-defined, consistent API definition is crucial to your development, testing, and, above all, communication among teams and customers. I highly recommend that for any sizable project, you spend some quality time on writing a good API spec. It’ll become essential to your development workflow. I used to highly doubt this, and now I don’t think I can live without it.

And if you manually keep documentation or API specifications in sync with your Go code, you’ll have a hard time reviewing, checking, and testing between code and specs. The best way IMHO is to automate the process, by either generating the API code from the spec, or the other way around. Many tools support one of these, and openapi-generator is one of the really nice ones, which I’m going to introduce in this blog post.

OpenAPI Generator supports many languages on the server side as well as on the client side, and it has generators for different Go frameworks. Here I’m going to use the go-server generator as an example. It uses the Gorilla framework for the server-side code.

For this blog post I’ve also made an example of code generation in my GitHub repo. I’ve generated the code, and implemented only one endpoint, /books, with example data:

https://github.com/hxy9243/go-examples/tree/main/src/openapi

Read More

Golang Channel Idioms

While learning Golang, I was fascinated with the power of Golang’s goroutines and channels. Channels are a powerful tool for tackling synchronization problems in asynchronous programs. A channel acts as a bridge between async goroutines and can describe some complicated logic expressively. Together they can be powerful weapons in building async applications. On the other hand, when misused, they can be a nightmare to debug.

Here I’ve summarized a few valuable idioms for using goroutines and channels, drawn from multiple references as well as my own experience. They can serve as a toolbox that comes in handy for similar problems, so that you don’t have to design them from scratch, which might help you avoid synchronization errors.
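As a small illustration of a channel acting as a bridge between goroutines, here is a minimal sketch of one common pattern: a producer goroutine that owns a channel, sends its results on it, and closes it when done, while the consumer simply ranges over the channel. The function names `squares` and `sum` are made up for this example and are not taken from the full post.

```go
package main

import "fmt"

// squares (hypothetical name) starts a goroutine that sends the square of
// each input on the returned channel, then closes the channel when finished.
func squares(nums []int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out) // the sender closes, so the receiver's range loop ends
		for _, n := range nums {
			out <- n * n
		}
	}()
	return out
}

// sum (hypothetical name) drains the channel until it is closed and adds up
// the received values.
func sum(in <-chan int) int {
	total := 0
	for v := range in {
		total += v
	}
	return total
}

func main() {
	fmt.Println(sum(squares([]int{1, 2, 3}))) // prints 14
}
```

Letting the sending side own and close the channel keeps the receiver free of explicit signalling, and avoids the panic caused by sending on a closed channel.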

Read More

Paper Reading: 150 Successful Machine Learning Models Deployed: 6 Lessons Learned At Booking.com

Paper link: https://www.kdd.org/kdd2019/accepted-papers/view/150-successful-machine-learning-models-6-lessons-learned-at-booking.com


Published at KDD 2019 by Booking.com, the paper describes the lessons learned from deploying Machine Learning models in their production service. It provides some intriguing insights, many of which I believe are very valuable for understanding how Machine Learning is applied in real-world scenarios.

Here are some of my takeaways.

Read More

Reading Summary: Ultralearning

Ultralearning is quite an interesting book from one of my favorite bloggers: Scott Young. Famous for his “MIT Challenge”, in which he completed four years of MIT coursework in a single year entirely through self-study, he now blogs regularly on study methods, student cognition, and everything related.

This book is a summary of his research and experience with studying. He argues that there is a way to learn and improve yourself through intensive training and exercise. Like training muscles, you can adopt an extraordinary, unorthodox training plan for your brain, and pick up a new skill in a short amount of time, be it a foreign language, programming, sketching, or even public speaking. He calls it “ultralearning.” For the book, he researched many references and interviewed like-minded friends who had similar experiences of acquiring or improving a skill intensively. He then summarizes the essential principles as a guide to a successful “ultralearning” project.

Read More