I learned very early the difference between knowing the name of something and knowing something.
— Richard Feynman

A History of Large Language Models
01 October 2025
I trace an academic history of some of the core ideas behind large language models, such as distributed representations, transducers, attention, the transformer, and generative pre-training.
29 April 2018
A common explanation for the reparameterization trick with variational autoencoders is that we cannot backpropagate through a stochastic node. I provide a more formal justification.
15 April 2018
Backpropagation is an algorithm that computes the gradient of a neural network, but it may not be obvious why the algorithm uses a backward pass. The answer allows us to reconstruct backprop from first principles.
From Convolution to Neural Network
24 February 2017
Most explanations of CNNs assume the reader understands the convolution operation and how it relates to image processing. I explore convolutions in detail and explain how they are implemented as layers in a neural network.