Kalman Filters

Kalman Filters are defined as follows: We start with a variable $X_{0} \sim \mathcal{N}(\mu, \Sigma)$, then we have a motion model and a sensor model: $$ \begin{cases} X_{t + 1} = FX_{t} + \varepsilon_{t} & F \in \mathbb{R}^{d\times d}, \varepsilon_{t} \sim \mathcal{N}(0, \Sigma_{x})\\ Y_{t} = HX_{t} + \eta_{t} & H \in \mathbb{R}^{m \times d}, \eta_{t} \sim \mathcal{N}(0, \Sigma_{y}) \end{cases} $$ Inference is just doing things with the Gaussians. One can interpret the $Y$ to be the observations and $X$ to be the underlying beliefs about a certain state....

2 min · Xuanqiang 'Angelo' Huang

Markov Chains

Introduzione alle catene di Markov La proprietà di Markov Una sequenza di variabili aleatorie $X_{1}, X_{2}, X_{3}, \dots$ gode della proprietà di Markov se vale: $$ P(X_{n}| X_{n - 1}, X_{n - 2}, \dots, X_{1}) = P(X_{n}|X_{n-1}) $$ Ossia posso scordarmi tutta la storia precedente, mi interessa solamente lo stato precedente per sapere la probabilità attuale. Da un punto di vista filosofico/fisico, ha senso perché mi sta dicendo che posso predire lo stato successivo se ho una conoscenza (completa, (lo dico io completo, originariamente non esiste)) del presente....

7 min · Xuanqiang 'Angelo' Huang

Kernel Methods

As we will briefly see, Kernels will have an important role in many machine learning applications. In this note we will get to know what are Kernels and why are they useful. Intuitively they measure the similarity between two input points. So if they are close the kernel should be big, else it should be small. We briefly state the requirements of a Kernel, then we will argue with a simple example why they are useful....

9 min · Xuanqiang 'Angelo' Huang

Active Learning

Active Learning concerns methods to decide how to sample the most useful information in a specific domain; how can you select the best sample for an unknown model? In this setting, we are interested in the concept of usefulness of information. One of our main goals is to reduce uncertainty, thus, Entropy-based (mutual information) methods are often used. For example, we can use active learning to choose what samples needs to be labelled in order to have highest accuracy on the trained model, when labelling is costly....

9 min · Xuanqiang 'Angelo' Huang

Maximum Entropy Principle

The maximum entropy principle is one of the most important guiding motives in artificial artificial intelligence. Its roots emerge from a long tradition of probabilistic inference that goes back to Laplace and Occam’s Razor, i.e. the principle of parsimony. Let’s start with a simple example taken from Andreas Kraus’s Lecture notes in the ETH course of Probabilistic Artificial Intelligence: Consider a criminal trial with three suspects, A, B, and C. The collected evidence shows that suspect C can not have committed the crime, however it does not yield any information about sus- pects A and B....

2 min · Xuanqiang 'Angelo' Huang