Kernel Methods

As we will see shortly, kernels play an important role in many machine learning applications. In this note we look at what kernels are and why they are useful. Intuitively, a kernel measures the similarity between two input points: if the points are close, the kernel value should be large; otherwise it should be small. We briefly state the requirements of a kernel, then argue with a simple example why kernels are useful....

13 min · Xuanqiang 'Angelo' Huang
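
To make the similarity intuition from the kernel note concrete, here is a minimal sketch using the Gaussian (RBF) kernel. The choice of this particular kernel, the function name `rbf_kernel`, and the `gamma` parameter are assumptions made for illustration; the excerpt itself does not name a specific kernel.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2).

    Assumed example kernel; gamma is a hypothetical bandwidth parameter.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Close points give a value near 1, distant points a value near 0.
near = rbf_kernel([0.0, 0.0], [0.1, 0.0])
far = rbf_kernel([0.0, 0.0], [3.0, 0.0])
identical = rbf_kernel([1.0, 2.0], [1.0, 2.0])
```

With this kernel, `identical` is the largest possible value and `near` is larger than `far`, which matches the "close means big, far means small" reading of a kernel as a similarity measure.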

Clustering

Gaussian Mixture Models This note takes inspiration from chapter 9.2 of (Bishop 2006). We assume that the reader is already familiar with Gaussian mixture models, so we only restate the model here. We discuss the problem of estimating the best possible parameters (so this is a density estimation problem) when the data is generated by a mixture of Gaussians. Recall that the multivariate Gaussian density in $d$ dimensions has this form: $$ \mathcal{N}(x \mid \mu, \Sigma) = \frac{1}{(2\pi)^{d/2}} \frac{1}{\lvert \Sigma \rvert^{1/2} } \exp \left( -\frac{1}{2} (x - \mu)^{T} \Sigma^{-1}(x - \mu) \right) $$ Problem statement 🟩 Given a set of data points $x_{1}, \dots, x_{n}$ in $\mathbb{R}^{d}$ sampled from a mixture of $k$ Gaussians, each with mixing weight $\pi_{k}$, the objective is to estimate the best $\pi_{k}$ for each Gaussian together with the corresponding mean and covariance matrix....

8 min · Xuanqiang 'Angelo' Huang
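
As a rough illustration of the mixture model restated in the clustering note, the sketch below evaluates a mixture density in the one-dimensional case ($d = 1$), where the covariance matrix reduces to a scalar variance. The function names and the example weights, means, and variances are hypothetical; estimating these parameters is the topic of the note itself and is not shown here.

```python
import math


def gaussian_pdf(x, mu, var):
    """Univariate Gaussian density N(x | mu, var), the d = 1 case."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture_pdf(x, weights, means, variances):
    """Mixture density: sum over components of pi_k * N(x | mu_k, var_k)."""
    return sum(
        w * gaussian_pdf(x, m, v)
        for w, m, v in zip(weights, means, variances)
    )


# Hypothetical two-component mixture; the weights must sum to 1.
weights = [0.3, 0.7]
means = [-2.0, 2.0]
variances = [1.0, 0.5]
density_at_zero = mixture_pdf(0.0, weights, means, variances)
```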

Probably Approximately Correct Learning

PAC Learning is one of the best-known frameworks in learning theory. Learning theory is concerned with answering questions such as: What is learnable? (This is somewhat akin to the role of the Turing machine in computability theory.) How well can you learn something? PAC is a framework that allows us to answer these questions formally. There is also a Bayesian version of PAC, which is an active area of research. Some definitions Empirical Risk Minimizer and Errors The empirical risk minimizer is defined as $$ \arg \min_{\hat{c} \in \mathcal{H}} \hat{R}_{n}(\hat{c}) $$ where $\hat{R}_{n}(\hat{c})$ is the empirical error of the hypothesis $\hat{c}$....

10 min · Xuanqiang 'Angelo' Huang
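
The empirical risk minimizer from the PAC note can be sketched for a small finite hypothesis class. Everything concrete here is an assumption for illustration: the class of threshold classifiers, the 0-1 loss used as empirical error, the toy sample, and all function names.

```python
def make_threshold_classifier(t):
    """Hypothetical hypothesis: predict 1 when x >= t, else 0."""
    def classify(x):
        return 1 if x >= t else 0
    return classify


def empirical_risk(h, data):
    """Fraction of labelled points (x, y) that h misclassifies (0-1 loss)."""
    return sum(1 for x, y in data if h(x) != y) / len(data)


def empirical_risk_minimizer(hypotheses, data):
    """Return a hypothesis in the class with the smallest empirical risk."""
    return min(hypotheses, key=lambda h: empirical_risk(h, data))


# Toy finite hypothesis class and labelled sample (both made up).
thresholds = [0.0, 0.25, 0.5, 0.75, 1.0]
hypothesis_class = [make_threshold_classifier(t) for t in thresholds]
sample = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1)]

best = empirical_risk_minimizer(hypothesis_class, sample)
best_risk = empirical_risk(best, sample)
```

Searching by enumeration only works because the class is finite and tiny; it is meant to mirror the $\arg\min$ over $\mathcal{H}$, not to suggest a practical algorithm.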

Active Learning

Active Learning concerns methods for deciding how to sample the most useful information in a specific domain: how can you select the best sample for an unknown model? Gathering data is very costly, so we would like a principled way to choose the best data point for a human to label in order to obtain the best model. In this setting, we are interested in the concept of usefulness of information. One of our main goals is to reduce uncertainty; thus, entropy-based (mutual information) methods are often used....

13 min · Xuanqiang 'Angelo' Huang
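
One simple way to read the entropy-based idea in the active learning note is uncertainty sampling: query the unlabelled point whose predicted class distribution has the highest entropy. This is a simplified stand-in chosen for illustration, not necessarily the mutual-information criterion the note develops; the pool of predicted probabilities and the function names are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def most_uncertain(pool_probs):
    """Index of the pool item whose predicted distribution has maximum entropy."""
    return max(range(len(pool_probs)), key=lambda i: entropy(pool_probs[i]))


# Hypothetical model predictions for three unlabelled points (two classes).
pool = [[0.95, 0.05], [0.55, 0.45], [0.80, 0.20]]
query_index = most_uncertain(pool)
```

Here the point with predictions closest to uniform is selected for labelling, since a label for it is expected to remove the most uncertainty under this criterion.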

Logistic Regression

These notes are very basic. For slightly more advanced material, see Bayesian Linear Regression, Linear Regression methods. Introduction to logistic regression Justification of the method This is one of the classic models, created by Minsky a few decades ago. In this case we directly compute the value of $P(Y|X)$ during inference, so this is a discriminative model. Introduction to the problem Suppose that $Y$ is a Boolean variable, the $X_{i}$ are continuous variables, and the $X_{i}$ are independent of one another....

4 min · Xuanqiang 'Angelo' Huang
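
As a small sketch of what "directly computing $P(Y|X)$" looks like in logistic regression, the code below applies the standard sigmoid to a linear score. The weights, bias, and function names are hypothetical, and fitting them from data is not shown.

```python
import math


def sigmoid(z):
    """Logistic function, mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, weights, bias):
    """P(Y = 1 | X = x) under a logistic regression model with given parameters."""
    score = bias + sum(w * xi for w, xi in zip(weights, x))
    return sigmoid(score)


# Hypothetical parameters and a single input with two continuous features.
weights = [0.5, -0.25]
bias = 0.0
p_positive = predict_proba([1.0, 2.0], weights, bias)
p_negative = 1.0 - p_positive  # P(Y = 0 | X = x)
```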