Normalizing Flows

Normalizing flows have both a latent space and tractable, explicit probability distributions (compare with Autoregressive Modelling, which has tractable distributions but no latent space). This means we can compute the exact likelihood of a given sample. This approach to modelling a flexible distribution is called a normalizing flow because the transformation of a probability distribution through a sequence of mappings is somewhat analogous to the flow of a fluid. From (Bishop & Bishop 2024) ...
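As a rough illustration of why the likelihood is tractable, here is a minimal sketch (not from the post) of the change-of-variables computation for a single invertible affine map on a standard-normal base distribution; the map and its parameters are purely illustrative.

```python
import numpy as np

# Sketch of the change-of-variables rule behind flow likelihoods:
#   log p(x) = log p_z(f^{-1}(x)) + log |d f^{-1}(x) / dx|
# for the toy invertible map x = a * z + b with a standard-normal base p_z.

def flow_log_likelihood(x, a=2.0, b=1.0):
    z = (x - b) / a                                   # inverse mapping f^{-1}
    log_base = -0.5 * (z**2 + np.log(2 * np.pi))      # log N(z; 0, 1)
    log_det = -np.log(abs(a))                         # log |dz/dx|
    return log_base + log_det

print(flow_log_likelihood(np.array([0.0, 1.0, 3.0])))
```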

June 2, 2025 · Reading Time: 10 minutes ·  By Xuanqiang Angelo Huang

Autoregressive Modelling

On Autoregressivity The main idea of autoregressivity is to use previous predictions to predict the next state. The Autoregressive property Autoregressive models model a joint distribution of random variables by assuming a chain-rule-like decomposition: $$ p(x) = \prod_{i=1}^{n} p(x_i | x_{1:i-1}) $$ If we assume independence between the variables, we only need about $2T$ parameters to model the distribution, but this assumption is too strong. If we instead use a tabular approach, we run into a combinatorial explosion: there are about $2^{T - 1}$ possible states (if we assume the random variables are binary and we create a table for each intermediate variable). ...
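To make the decomposition concrete, here is a tiny sketch (illustrative only, not from the post) that multiplies hand-picked conditionals along a binary sequence; for brevity each conditional depends only on the previous symbol.

```python
# Chain-rule decomposition p(x) = prod_i p(x_i | x_{1:i-1}) on a toy
# binary sequence, with the conditional truncated to the previous symbol.

p_first = {0: 0.7, 1: 0.3}                 # p(x_1)
p_next = {0: {0: 0.9, 1: 0.1},             # p(x_i | x_{i-1} = 0)
          1: {0: 0.4, 1: 0.6}}             # p(x_i | x_{i-1} = 1)

def joint_probability(sequence):
    prob = p_first[sequence[0]]
    for prev, cur in zip(sequence, sequence[1:]):
        prob *= p_next[prev][cur]          # multiply one conditional per step
    return prob

print(joint_probability([0, 0, 1]))        # 0.7 * 0.9 * 0.1 = 0.063
```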

June 1, 2025 · Reading Time: 2 minutes ·  By Xuanqiang Angelo Huang

Neural Networks

Introduction: a neuron I am lazy, so I’m skipping the introduction for this set of notes. Look at Andrew Ng’s Coursera course for this part (here are the notes). The historical paper is (Rosenblatt 1958). One can view a perceptron as a Log Linear Model whose softmax temperature goes to 0 (so that it becomes an argmax). It is trained with stochastic gradient descent with a batch size of 1 (this is called the perceptron update rule, see The Perceptron Model). ...
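A small sketch of the temperature argument (an illustration, not from the notes): as the softmax temperature approaches 0, the output distribution collapses onto the largest logit, i.e. an argmax.

```python
import numpy as np

# Softmax with temperature: lower temperature concentrates the mass on argmax.

def softmax(logits, temperature):
    z = np.asarray(logits) / temperature
    z = z - z.max()                        # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

logits = [1.0, 2.0, 0.5]
for t in [1.0, 0.1, 0.01]:
    print(t, softmax(logits, t).round(3))  # mass concentrates on index 1
```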

June 1, 2025 · Reading Time: 8 minutes ·  By Xuanqiang Angelo Huang

The Perceptron Model

The perceptron is a fundamental binary linear classifier introduced by (Rosenblatt 1958). It maps an input vector $\mathbf{x} \in \mathbb{R}^n$ to an output $y \in \{0,1\}$ using a weighted sum followed by a threshold function. Introduction to the Perceptron A mathematical model Given an input vector $\mathbf{x} = (x_1, x_2, \dots, x_n)$ and a weight vector $\mathbf{w} = (w_1, w_2, \dots, w_n)$, the perceptron computes: $$ z = \mathbf{w}^\top \mathbf{x} + b = \sum_{i=1}^{n} w_i x_i + b $$ $$ y = f(z) = \begin{cases} 1, & \text{if } z \geq 0 \\ 0, & \text{otherwise} \end{cases} $$ Learning Rule Given a labeled dataset $\{ (\mathbf{x}^{(i)}, y^{(i)}) \}_{i=1}^{m}$, the perceptron uses the following weight update rule for misclassified samples ($y^{(i)} \neq f(\mathbf{w}^\top \mathbf{x}^{(i)} + b)$): ...
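A minimal sketch of the forward pass and the mistake-driven update rule described above (illustrative code, not the post's own):

```python
import numpy as np

# Perceptron forward pass plus the classic update applied only on mistakes:
#   w <- w + eta * (y - y_hat) * x,   b <- b + eta * (y - y_hat)

def train_perceptron(X, y, eta=1.0, epochs=10):
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for x_i, y_i in zip(X, y):
            y_hat = 1 if (w @ x_i + b) >= 0 else 0   # threshold on w.x + b
            if y_hat != y_i:                          # update only when wrong
                w += eta * (y_i - y_hat) * x_i
                b += eta * (y_i - y_hat)
    return w, b

# Toy linearly separable data (OR-like labels).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 1])
print(train_perceptron(X, y))
```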

May 31, 2025 · Reading Time: 3 minutes ·  By Xuanqiang Angelo Huang

Advanced 3D Representations

3D representations In this section, we present some of the most common 3D representations used in computer graphics and computer vision. Each representation has its own advantages and disadvantages, and the choice of representation often depends on the specific application. Voxels With voxels we discretize 3D space into a 3D grid. It is an intuitive way to represent the data, but it has limited resolution and needs $\mathcal{O}(n^{3})$ memory. Points and Volumetric primitives We can discretize surfaces into 3D points. Yet this does not model connectivity, and the points might vary from frame to frame in a video. ...
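A quick back-of-the-envelope sketch of that $\mathcal{O}(n^{3})$ growth (illustrative byte counts, not from the post):

```python
# Cubic memory growth of a dense voxel grid versus a fixed-size point cloud.

BYTES_PER_VOXEL = 1            # e.g. one occupancy byte per cell
BYTES_PER_POINT = 12           # e.g. three float32 coordinates per point

for n in [32, 64, 128, 256, 512]:
    voxel_mib = n**3 * BYTES_PER_VOXEL / 2**20
    print(f"voxel grid {n}^3: {voxel_mib:8.1f} MiB")

points_mib = 100_000 * BYTES_PER_POINT / 2**20
print(f"100k-point cloud: {points_mib:.1f} MiB")
```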

May 30, 2025 · Reading Time: 12 minutes ·  By Xuanqiang Angelo Huang

Recurrent Neural Networks

Recurrent Neural Networks allow us to model arbitrarily long sequence dependencies, at least in theory (this is also why they seem, in theory, a very nice choice for time series). This is very handy and has many interesting theoretical implications. But here we are also interested in practical applicability, so we need to analyze the common architectures used to implement these models, their main limitations and drawbacks, their nice properties, and some applications. ...
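A vanilla RNN cell in a few lines (an illustrative sketch, not the post's code): the same weights are reused at every step and the hidden state is fed back in, which is how dependencies of arbitrary length can, in principle, be carried along the sequence.

```python
import numpy as np

# Vanilla RNN recurrence: h_t = tanh(W_h @ h_{t-1} + W_x @ x_t).

rng = np.random.default_rng(0)
W_h = rng.normal(scale=0.1, size=(4, 4))    # hidden-to-hidden weights
W_x = rng.normal(scale=0.1, size=(4, 3))    # input-to-hidden weights

def rnn_forward(xs):
    h = np.zeros(4)
    for x in xs:                            # same weights reused at every step
        h = np.tanh(W_h @ h + W_x @ x)
    return h                                # summary of the whole sequence

print(rnn_forward(rng.normal(size=(5, 3))))
```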

May 30, 2025 · Reading Time: 6 minutes ·  By Xuanqiang Angelo Huang

Backpropagation

Backpropagation is perhaps the most important algorithm of the 21st century. It is used everywhere in machine learning and is also connected to computing marginal distributions. This is why all machine learning scientists and data scientists should understand this algorithm very well. An important observation is that this algorithm is linear: the backward pass has the same time complexity as the forward pass. Derivatives are unexpectedly cheap to calculate, and this took a long time to discover. See colah’s blog; Karpathy has a nice resource for this topic too, and the Stanford lecture on backpropagation is another resource. ...
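A toy reverse-mode autodiff sketch (illustrative, not the post's code): each node records a local backward rule during the forward pass, and the backward sweep visits the same graph once, which is why its cost matches the forward pass.

```python
# Tiny reverse-mode autodiff on a computation graph of + and *.

class Value:
    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None      # local rule, filled in by each op

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = backward
        return out

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def backward():
            self.grad += out.grad
            other.grad += out.grad
        out._backward = backward
        return out

    def backprop(self):
        # Topological order of the graph, then apply each local rule once.
        order, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    visit(p)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            v._backward()

x, y = Value(2.0), Value(3.0)
z = x * y + x            # z = x*y + x
z.backprop()
print(x.grad, y.grad)    # dz/dx = y + 1 = 4.0, dz/dy = x = 2.0
```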

May 29, 2025 · Reading Time: 8 minutes ·  By Xuanqiang Angelo Huang

Transformers

Transformers, introduced for language translation in NLP by (Vaswani et al. 2017), are one of the cornerstones of modern deep learning. For this reason, it is quite important to understand how they work. Introduction to Transformers Transformers are called this way because they transform the input data space into another space with the same dimensionality. The goal of the transformation is that the new space will have a richer internal representation that is better suited to solving downstream tasks. (Bishop & Bishop 2024) ...
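A minimal single-head scaled dot-product self-attention sketch (illustrative, not the post's code), showing that the output lives in a space of the same dimensionality as the input:

```python
import numpy as np

# Scaled dot-product self-attention: input (n_tokens, d_model) in,
# same shape out, but with a richer, context-mixed representation.

def self_attention(X, W_q, W_k, W_v):
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    scores = Q @ K.T / np.sqrt(K.shape[-1])             # (n, n) token affinities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)       # row-wise softmax
    return weights @ V                                    # (n, d_model)

rng = np.random.default_rng(0)
d = 8
X = rng.normal(size=(5, d))                               # 5 tokens, model dim 8
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))
print(self_attention(X, W_q, W_k, W_v).shape)             # (5, 8): same as input
```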

May 29, 2025 · Reading Time: 10 minutes ·  By Xuanqiang Angelo Huang

Convolutional Neural Network

Introduction to Convolutional NN Design Goals We want to be invariant to some transformations while at the same time remaining specific to certain features. Convolutional Neural Networks (CNNs) are a class of deep neural networks that are particularly effective for image processing tasks. They are designed to automatically and adaptively learn spatial hierarchies of features from images. Compared to standard Fully connected Neural Networks, they reuse weights, which makes their number of parameters much smaller. ...
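A back-of-the-envelope comparison of the weight-sharing claim (illustrative numbers only, not from the post):

```python
# Parameter count of a fully connected layer versus a convolutional layer
# on a 32x32x3 image, mapping to 16 output channels with a 3x3 kernel.

H, W, C_in, C_out, K = 32, 32, 3, 16, 3

fully_connected = (H * W * C_in) * (H * W * C_out)      # a weight per input/output pair
convolutional = (K * K * C_in) * C_out                  # one small shared kernel per channel

print(f"fully connected: {fully_connected:,} weights")  # about 50 million
print(f"convolutional:   {convolutional:,} weights")    # 432
```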

May 28, 2025 · Reading Time: 13 minutes ·  By Xuanqiang Angelo Huang

Egocentric Vision

Egocentric vision is a sub-field of computer vision that studies visual understanding from a self-centered (first-person) point of view, the one typical of animals. One historical example is from MIT in 1997, where researchers had to carry around very heavy cameras; now we have glasses. Other examples of egocentric vision are cars with cameras that see their surroundings, or robots equipped with cameras mimicking human vision. The difference between egocentric vision and standard vision techniques is the high variability and instability of the video, and the notion of movement and interaction inside the image. Standard computer vision, by contrast, is disembodied and has a controlled field of view. ...

May 28, 2025 · Reading Time: 7 minutes ·  By Xuanqiang Angelo Huang