Posts by Year

2023 1
2022 2
2021 20
2020 67
2019 21

2023

LoRA

August 20 2023

I recently completed another summer internship at Meta (formerly Facebook). I was surprised to learn that one of the intern friends I met was an avid read...

2022

Hacking Word Hunt

August 21 2022

Update: The code was modified with further optimizations. In particular, instead of checking the trie per every DFS call, we update the trie pointer along...

Glow-TTS

April 11 2022

Note: This blog post was completed as part of Yale’s CPSC 482: Current Topics in Applied Machine Learning.

2021

Reflections and Expectations

December 27 2021

Last year, I wrote a blog post reflecting on the year 2020. Re-reading what I had written then was surprisingly insightful, particularly because I could see ...

Score Matching

December 26 2021

Recently, I’ve heard a lot about score-based networks. In this post, I will attempt to provide a high-level overview of what scores are and how the concept o...

Flow Models

June 21 2021

In this post, we will take a look at Flow models, which I’ve been obsessed with while reading papers like Glow-TTS and VITS. This post is heavily based on th...

From ELBO to DDPM

May 17 2021

In this short post, we will take a look at variational lower bound, also referred to as the evidence lower bound or ELBO for short. While I have referenced E...

Reboot

May 15 2021

It has been a while since I last posted on this blog. Admittedly, a lot has happened in my life: I have been discharged from the Republic of Korea Army, rece...

Linear Attention Computation in Nyströmformer

March 15 2021

In this post, we will take a look at Nyström approximation, a technique that I came across in Nyströmformer: A Nyström-based Algorithm for Approximating Self...

Relative Positional Encoding

March 01 2021

In this post, we will take a look at relative positional encoding, as introduced in Shaw et al. (2018) and refined by Huang et al. (2018). This is a topic I me...

Locality Sensitive Hashing

February 25 2021

These days, I’ve found myself absorbed in the world of memory-efficient transformer architectures. Transformer models require $O(n^2)$ runtime and memory due...

BERT’s Common Sense, or Lack Thereof

February 18 2021

A few days ago, I came across a simple yet nonetheless interesting paper, titled “NumerSense: Probing Numerical Commonsense Knowledge of Pre-Trained Language...

GPT from Scratch

February 15 2021

These days, I’m exploring the field of natural language generation, using auto-regressive models such as GPT-2. HuggingFace transformers offers a host of pre...

NLI Models as Zero-Shot Classifiers

February 10 2021

In the previous post, we took a look at how to extract keywords from a block of text using transformer models like BERT. In that blog post, you might recall ...

Keyword Extraction with BERT

February 05 2021

I’ve been interested in blog post auto-tagging and classification for some time. Recently, I was able to fine-tune RoBERTa to develop a decent multi-label, m...

NLG with GPT-2

February 01 2021

When GPT-3 was released, people were amazed by its ability to generate coherent, natural-sounding text. In fact, it wasn’t just text; it could generate JavaS...

Rejection Sampling

January 25 2021

In today’s post, we will take a break from deep learning and turn our attention to the topic of rejection sampling. We’ve discussed the topic of sampling som...

Attention is All You Need

January 20 2021

Today, we are finally going to take a look at transformers, the mother of most, if not all current state-of-the-art NLP models. Back in the day, RNNs used to...

Attention Mechanism

January 15 2021

Attention took the NLP community by storm a few years ago when it was first announced. I’ve personally heard about attention many times, but never had the ch...

Plotting Prime Numbers

January 10 2021

Today’s article was inspired by a question that came up on a Korean mathematics Facebook group I’m part of. The gist of the question could probably be transl...

(Attempt at) Knowledge Distillation

January 08 2021

For the past couple of months or so, I’ve been spending time looking into transformers and BERT. Transformers are state-of-the-art NLP models that are now re...

Fast Gradient Sign Method

January 05 2021

In today’s post, we will take a look at adversarial attacks. Adversarial attacks have become an active field of research in the deep learning community, for ...

Reflections and Expectations

January 01 2021

2020 was unlike any other. The COVID pandemic fundamentally transformed our ways of life. Masks became a norm; classes were taught on Zoom; social distancing...

2020

Better seq2seq

December 27 2020

In the previous post, we took a look at how to implement a basic sequence-to-sequence model in PyTorch. Today, we will be implementing a small improvement to...

Introduction to seq2seq models

December 20 2020

For a very long time, I’ve been fascinated by sequence-to-sequence models. Give the model a photo as input, it spits out a caption to go along with it; give ...

Neural Style Transfer

December 10 2020

In today’s post, we will take a look at neural style transfer, or NST for short. NST is something that I first came across about a year ago when reading Fran...

GAN in PyTorch

November 30 2020

In this blog post, we will be revisiting GANs, or generative adversarial networks. This isn’t the first time we’ve seen GANs on this blog: we’ve implemented GAN...

Monte Carlo Coin Toss

November 23 2020

While mindlessly browsing through Math Stack Exchange, I stumbled across an interesting classic:

InceptionNet in PyTorch

November 14 2020

In today’s post, we’ll take a look at the Inception model, otherwise known as GoogLeNet. I actually wrote the code for this notebook in October 😱 but wa...

VGG PyTorch Implementation

November 01 2020

In today’s post, we will be taking a quick look at the VGG model and how to implement one using PyTorch. This is going to be a short post since the VGG archi...

PyTorch RNN from Scratch

October 25 2020

In this post, we’ll take a look at RNNs, or recurrent neural networks, and attempt to implement parts of them from scratch in PyTorch. Yes, it’s not entirel...

PyTorch, From Data to Modeling

October 20 2020

These past few weeks, I’ve been powering through PyTorch notebooks and tutorials, mostly because I enjoyed the PyTorch API so much and found so many of its us...

PyTorch Tensor Basics

October 10 2020

This is a very quick post in which I familiarize myself with basic tensor operations in PyTorch while also documenting and clarifying details that initially ...

Data Viz Basics with Python

September 28 2020

This post is based on this article on Medium, titled “Matplotlib + Seaborn + Pandas: An Ideal Amalgamation for Statistical Data Visualization.” This article ...

BLEU from scratch

September 21 2020

Recently, I joined the Language, Information, and Learning at Yale lab, led by Professor Dragomir Radev. Although I’m still in what I would consider to be th...

A PyTorch Primer

September 14 2020

I’ve always been a fan of TensorFlow, specifically tf.keras, for its simplicity and ease of use in implementing algorithms and building models. Today, I deci...

Beta, Bayes, and Multi-armed Bandits

August 28 2020

Recently, I fortuitously came across an interesting blog post on the multi-armed bandit problem, or MAB for short. I say fortuitous because the contents of t...

Django and Summer Internship

August 15 2020

For the past month and a half, I’ve been working as a backend developer for ReRent, a Yale SOM-based hospitality startup. Working alongside motivated, inspir...

Gaussian Mixture Models

August 01 2020

We’ve discussed Gaussians a few times on this blog. In particular, recently we explored Gaussian process regression, which is personally a post I really enjo...

Gamma and Zeta

July 28 2020

Maintaining momentum in writing and self-learning has admittedly been difficult these past few weeks since I’ve started my internship. Normally, I would writ...

Text Preprocessing with Blog Post Data

July 22 2020

In today’s post, we will finally start modeling the auto-tagger model that I wanted to build for my blog. As you may have noticed, every blog post is class...

Docker Blitz

July 17 2020

Docker was one of those things that I always wanted to learn, but never got into. Part of the reason was that it seemed distant and even somewhat unnecessary...

Word2vec from Scratch

July 13 2020

In a previous post, we discussed how we can use tf-idf vectorization to encode documents into vectors. While probing more into this topic and getting a taste ...

Complex Fibonacci

July 10 2020

A few days ago, a video popped up in my YouTube suggestions. We all know how disturbingly powerful the YouTube recommendation algorithm is: more than 90 perc...

Introduction to tf-idf

July 05 2020

Although I’ve been able to automate some portion of the blog workflow, there’s always been a challenging part that I wanted to further automate myself using ...

Gaussian Process Regression

July 02 2020

In this post, we will explore the Gaussian Process in the context of regression. This is a topic I meant to study for a long time, yet was never able to due ...

Traveling Salesman Problem with Genetic Algorithms

June 28 2020

The traveling salesman problem (TSP) is a famous problem in computer science. The problem might be summarized as follows: imagine you are a salesperson who n...

Revisiting Basel with Fourier

June 25 2020

In the last post, we revisited the Riemann Zeta function, which we had briefly introduced in another previous post on Euler’s take on the famous Basel proble...

Riemann Zeta and Prime Numbers

June 23 2020

The other day, I came across an interesting article by Chris Henson on the relationship between the Riemann Zeta function and prime numbers. After encounteri...

On BFS and DFS

June 20 2020

In this post, we will be taking a look at two very simple yet popular search algorithms, namely breadth-first search and depth-first search. To give you...

Newton-Raphson, Secant, and More

June 16 2020

Recently, I ran into an interesting video on YouTube on numerical methods (at this point, I can’t help but wonder if YouTube can read my mind, but now I digre...

The Gibbs Sampler

June 12 2020

In this post, we will explore Gibbs sampling, a Markov chain Monte Carlo algorithm used for sampling from probability distributions, somewhat similar to the ...

Scikit-learn Sprint

June 09 2020

A reflection on my first open source contribution sprint

Introduction to PySpark

June 05 2020

I’ve stumbled across the word “Apache Spark” on the internet so many times, yet I never had the chance to really get to know what it was. For one thing, it s...

Dissecting LSTMs

June 02 2020

In this post, we will revisit the topic of recurrent neural networks, or RNNs. Although we have used RNNs before in a previous post on character-based text p...

Scikit-learn Pipelines with Titanic

May 30 2020

In today’s post, we will explore ways to build machine learning pipelines with Scikit-learn. A pipeline might sound like a big word, but it’s just a way of c...

Natural Gradient and Fisher

May 27 2020

In a previous post, we took a look at Fisher’s information matrix. Today, we will be taking a break from the R frenzy and continue our exploration of this to...

Blog Workflow Cleanup

May 23 2020

These past few days, I’ve been writing posts on R while reading Hadley Wickham’s R for Data Science. R is no Python, but I’m definitely starting to see what ...

R Tutorial (4)

May 22 2020

In this post, we will continue our journey down the R road to take a deeper dive into data frames. R is great for data analysis and wrangling when it comes to...

SQL Basics with Pandas

May 19 2020

Recently, I was compelled by my own curiosity to study SQL, a language I have heard about quite a lot but never had a chance to study. At first, SQL sounded ...

R Tutorial (3)

May 10 2020

A few days ago, I saw a friend who posted an Instagram story looking for partners to study R with. I jumped at the opportunity without hesitation—based on my...

Learning C

May 05 2020

So I’ve been spending some time this past week or so picking up a new language: C. C is considered by many to be one of the most basic and fundamental of all...

Understanding the Leibniz Rule

May 01 2020

Before I begin, I must say that this video by Brian Storey at Olin College is the most intuitive explanation of the Leibniz rule I have seen so far. Granted,...

R Tutorial (2)

April 25 2020

In this post, we will continue our journey with the R programming language. In the last post, we explored some basic plotting functions and how to use them t...

R Tutorial (1)

April 16 2020

It’s been a while since we last took a look at the R programming language. While I don’t see R becoming my main programming language (I’ll always be a Python...

Fisher Score and Information

April 11 2020

Fisher’s information is an interesting concept that connects many of the dots that we have explored so far: maximum likelihood estimation, gradient, Jacobian...

On Expectations and Integrals

April 05 2020

Expectation is a core concept in statistics, and it is no surprise that any student interested in probability and statistics may have seen some expression li...

Stirling Approximation

April 01 2020

It’s about time that we go back to the old themes again. When I first started this blog, I briefly dabbled in real analysis via Euler, with a particular focu...

Principal Component Analysis

March 22 2020

Principal component analysis is one of those techniques that I’ve always heard about somewhere, but didn’t have a chance to really dive into. PCA would come ...

Fourier Series

March 19 2020

The Taylor series is used in countless areas of mathematics and the sciences. It is a handy little tool in the mathematician’s arsenal that allows us to decompose any...

The Math Behind GANs

March 15 2020

Generative Adversarial Networks refer to a family of generative models that seek to discover the underlying distribution behind a certain data generating pro...

MLE and KL Divergence

March 09 2020

These days, I’ve been spending some time trying to read published research papers on neural networks to gain a more solid understanding of the math behind de...

Contributing Open Source

March 04 2020

Programming is difficult but fun. Or maybe it’s the other way around. Either way, any developer would know that external libraries are something that makes p...

First Date with Flask

February 28 2020

These past few days, I’ve been taking a hiatus from the spree of neural networks and machine learning to explore an entirely separate realm of technology: we...

My First GAN

February 25 2020

Generative models are fascinating. It is no wonder that GANs, or Generative Adversarial Networks, are considered by many to be where the future lies for deep learni...

A Step Up with Variational Autoencoders

February 22 2020

In a previous post, we took a look at autoencoders, a type of neural network that receives some data as input, encodes them into a latent representation, and...

So What are Autoencoders?

February 18 2020

In today’s post, we will take yet another look at an interesting application of a neural network: autoencoders. There are many types of autoencoders, but the...

A Simple Autocomplete Model

February 10 2020

You might remember back in the old days when autocomplete was just terrible. The suggestions provided by autocomplete would be useless if not downright stupi...

A Brief Introduction to Recurrent Neural Networks

February 08 2020

Neural networks are powerful models that can be used to identify complex hidden patterns in data. There are many types of neural networks, two of which we ha...

Building Neural Network From Scratch

February 05 2020

Welcome back to another episode of the “From Scratch” series on this blog, where we explore various machine learning algorithms by hand-coding them from scratch....

Convolutional Neural Network with Keras

February 01 2020

Recently, a friend recommended a book to me, Deep Learning with Python by Francois Chollet. As an eager learner just starting to fiddle with the Keras API, I de...

Writing with Typora

January 26 2020

Disclaimer: I was not sponsored by the developers of Typora to write this post, although that would have been great.

Convex Combinations and MAP

January 25 2020

In a previous post, we briefly explored the notion of maximum a posteriori and how it relates to maximum likelihood estimation. Specifically, we derived a ge...

The Exponential Family

January 22 2020

Normal, binomial, exponential, gamma, beta, Poisson… These are just some of the many probability distributions that show up on just about any statistics text...

Bayesian Linear Regression

January 20 2020

In today’s post, we will take a look at Bayesian linear regression. Both Bayes and linear regression should be familiar names, as we have dealt with these tw...

Naive Bayes Model From Scratch

January 17 2020

Welcome to part three of the “from scratch” series where we implement machine learning models from the ground up. The model we will implement today, called t...

First Neural Network with Keras

January 15 2020

Lately, I have been on a DataCamp spree after unlocking a two-month free unlimited trial through Microsoft’s Visual Studio Dev Essentials program. If you hav...

A Short R Tutorial

January 09 2020

This is an experimental Jupyter notebook written using IRkernel. The purpose of this notebook is threefold: first, to document my progress with self-learnin...

Conda Virtual Environments with Jupyter

January 07 2020

As a novice who started learning Python just three months ago, I was clueless about what virtual environments were. All I knew was that Anaconda was pur...

An Introduction to Markov Chain Monte Carlo

January 02 2020

Finally, here is the post that was promised ages ago: an introduction to Markov Chain Monte Carlo, or MCMC for short. It took a while for me to understand ...

Jake Tae
