Compiler Limitations

On Compilers: Adding compilation flags to gcc does not always make the code faster; it just enables a specific set of optimization methods. It is also worth turning on platform-specific flags, which enable optimizations targeted at that architecture. Remember that compilers are…

Software Architecture of the OS

Depending on who is using it, the OS can be many things: just the interface if you are a programmer, or a set of services if you are a user (although most services are abstracted away, and the user may not even be aware of them). But if you are an OS programmer, you care about understanding the…

February 28, 2025 · Reading Time: 7 minutes · By Xuanqiang Angelo Huang

Probably Approximately Correct Learning

PAC Learning is one of the most famous frameworks in learning theory. Learning theory is concerned with answering questions like: What is learnable? (This is somewhat akin to the role of La macchina di Turing, the Turing machine, in computability theory.) How well can you learn something? PAC is a framework that allows us to…

February 22, 2025 · Reading Time: 18 minutes · By Xuanqiang Angelo Huang

Clustering

Gaussian Mixture Models: This note takes inspiration from chapter 9.2 of (Bishop 2006). We assume that the reader already knows quite well what a Gaussian Mixture Model is, so we will just restate the models here. We will discuss the problem of estimating the best possible…

February 6, 2025 · Reading Time: 12 minutes · By Xuanqiang Angelo Huang

Dirichlet Processes

The DP (Dirichlet Process) is part of a family of models called non-parametric models. Non-parametric models are learning models with a potentially infinite number of parameters. One of the classical applications is unsupervised techniques like clustering. Intuitively,…

February 6, 2025 · Reading Time: 10 minutes · By Xuanqiang Angelo Huang

Support Vector Machines

This is quite a good resource about this part of Support Vector Machines (step-by-step derivation). Chapter 7 of (Bishop 2006) is also a good resource. The main idea of this supervised method is to separate the classes with a large gap. We have a hyperplane; when this plane is…

February 6, 2025 · Reading Time: 15 minutes · By Xuanqiang Angelo Huang

Active Learning

Active Learning concerns methods for deciding how to sample the most useful information in a specific domain: how can you select the best sample for an unknown model? Gathering data is very costly, so we would like a principled way to choose the best data point to…

Bayesian Information Criterion

This note is one of the few notes that were generated with the help of ChatGPT. The Bayesian Information Criterion (BIC) is a model selection criterion that helps compare different statistical models while penalizing model complexity. It is…

February 2, 2025 · Reading Time: 4 minutes · By Xuanqiang Angelo Huang

Beta and Dirichlet Distributions

The beta distribution is a powerful tool for modeling probabilities and proportions between 0 and 1. Here's a structured intuition to grasp its essence. Core Concept: The beta distribution, defined on [0, 1], is parameterized by two shape parameters:…

February 1, 2025 · Reading Time: 6 minutes · By Xuanqiang Angelo Huang

Bloom Filters

How Bloom Filters Work: A Bloom filter is a space-efficient probabilistic data structure used to test whether an element is possibly in a set or definitely not in a set. It allows for false positives but never false negatives. One example application is the membership query…
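
The "possibly in the set / definitely not in the set" behaviour described in the Bloom Filters excerpt can be sketched in a few lines of Python. This is only a minimal illustration, not the implementation from the note: the class name `BloomFilter`, the parameters `num_bits` and `num_hashes`, and the choice of deriving bit positions from a single SHA-256 digest via double hashing are all assumptions made for this example.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter sketch: a bit array plus k hash positions per item."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits          # size of the bit array (assumed default)
        self.num_hashes = num_hashes      # number of positions set per item
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive k positions from one SHA-256 digest using double hashing:
        # position_i = (h1 + i * h2) mod num_bits.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so it is never zero
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        # Set every bit selected by the hash positions.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means "definitely not in the set";
        # True means "possibly in the set" (false positives can occur).
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


bf = BloomFilter()
bf.add("alice")
print(bf.might_contain("alice"))  # an added item is always reported as present
```

Because bits are only ever set and never cleared, an item that was added can never be reported as absent, which is why false negatives are impossible in this sketch. A query for an item that was never added may still return `True` if other items happened to set the same bits.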