Object Detection

Introduction # Semantic segmentation # We want to find regions that correspond to different categories, and partition the image into informational zones accordingly. Object detection # We want to find the smallest box that contains the object. This is done with the…

Reading Time: 2 minutes · By Xuanqiang Angelo Huang

Ad-hoc Teamwork

Ad-Hoc Teamwork (AHT) in Reinforcement Learning # Problem Setting & Motivation # Ad-hoc teamwork concerns agents that must cooperate effectively with previously unknown teammates without any prior coordination, communication protocol, or shared learning history. This is…

Reading Time: 9 minutes · By Xuanqiang Angelo Huang

Distributional Reinforcement Learning

Distributional Reinforcement Learning # Motivation: Why Bother With the Whole Distribution? # Standard value-based RL collapses the random return into a single scalar via expectation: Q(s, a) = E[Z(s, a)]. The distributional perspective (Bellemare, Dabney, Munos,…

Reading Time: 15 minutes · By Xuanqiang Angelo Huang

Proximal Policy Optimization

This document is DEPRECATED; please see RL Function Approximation. This document attempts to briefly present the algorithm and some experiments found online about it. The following repo seems to be a good resource: here. Usually, PPO is explained as an actor-critic framework…

Reading Time: 1 minute · By Xuanqiang Angelo Huang

RL Losses

SDPO # See (Hübotter et al. 2026) GRPO # https://hlfshell.ai/posts/grpo/ GRPO (Group Relative Policy Optimization) comes from the DeepSeekMath paper. Its whole reason for existing is to get rid of the value/critic network that PPO needs. Instead of learning a separate model to…

Reading Time: 5 minutes · By Xuanqiang Angelo Huang

Communication Games

Introduction to Communication Games # We start by defining the fundamental problem: strategic information transmission between agents where information is asymmetric. The Communication Problem # Information Asymmetry: One player (Sender) knows something the other (Receiver)…

Reading Time: 11 minutes · By Xuanqiang Angelo Huang

Information Bottleneck

These notes cover the core concepts of the Information Bottleneck method, widely used in machine learning and theoretical neuroscience. We start by defining the fundamental tension in learning and representation. Learning Design Goals # Compression: The representation should be…

Reading Time: 5 minutes · By Xuanqiang Angelo Huang

Kolmogorov complexity

Much of what I write here is drawn from (Li & Vitányi 2019). Chaitin, Kolmogorov, and Solomonoff developed the topic independently and at around the same time, in the 1960s! Solomonoff arrived at it through the problem of induction at the age of 38, whereas Kolmogorov…

Reading Time: 11 minutes · By Xuanqiang Angelo Huang

Sobolev Spaces

Sobolev Spaces # Motivation & Setup # PDE theory and the calculus of variations require function spaces in which (i) differentiation makes sense for non-smooth functions, (ii) the space is complete under an L^p-flavored norm, and (iii) one can embed into L^q or Hölder…

Reading Time: 22 minutes · By Xuanqiang Angelo Huang

Topology Crash Course

A Crash Course in Topology # This is a tour of the landmarks beyond Topological Spaces and Metric Spaces. Order of climb: first see how topologies are generated, then ascend the axiom ladder (separation, countability, compactness), then meet the invariants (algebraic topology,…

Reading Time: 17 minutes · By Xuanqiang Angelo Huang