Xuanqiang Angelo Huang's Blog

Object Detection

Introduction. Semantic segmentation: We want to find regions that correspond to different categories, and in this way divide the image into zones of information. Object detection: We want to find the smallest box that contains the object. This is done with the…

Ad-hoc Teamwork

Ad-Hoc Teamwork (AHT) in Reinforcement Learning. Problem Setting & Motivation: Ad-hoc teamwork concerns agents that must cooperate effectively with previously unknown teammates without any prior coordination, communication protocol, or shared learning history. This is…

Distributional Reinforcement Learning

Distributional Reinforcement Learning. Motivation: Why Bother With the Whole Distribution? Standard value-based RL collapses the random return into a single scalar via expectation: $Q(s, a) = \mathbb{E}[Z(s, a)]$. The distributional perspective (Bellemare, Dabney, Munos,…

Proximal Policy Optimization

This document is DEPRECATED; please see RL Function Approximation. This document attempts to briefly present the algorithm and some experiments found online about it. The following repo seems to be a good resource: here. Usually, PPO is explained as an actor-critic framework…

RL Losses

SDPO: See (Hübotter et al. 2026). GRPO: https://hlfshell.ai/posts/grpo/ GRPO (Group Relative Policy Optimization) comes from the DeepSeekMath paper. Its whole reason for existing is to get rid of the value/critic network that PPO needs. Instead of learning a separate model to…

Communication Games

Introduction to Communication Games. We start by defining the fundamental problem: strategic information transmission between agents where information is asymmetric. The Communication Problem. Information Asymmetry: One player (Sender) knows something the other (Receiver)…

Information Bottleneck

These notes cover the core concepts of the Information Bottleneck method, widely used in machine learning and theoretical neuroscience. We start by defining the fundamental tension in learning and representation. Learning Design Goals. Compression: The representation should be…

Kolmogorov complexity

Most of what I write here is drawn from (Li & Vitányi 2019). Chaitin, Kolmogorov, and Solomonoff developed the topic independently and at around the same time, in the 1960s! Solomonoff arrived at it through the problem of induction at the age of 38, whereas Kolmogorov…

Sobolev Spaces

Sobolev Spaces. Motivation & Setup: PDE theory and the calculus of variations require function spaces in which (i) differentiation makes sense for non-smooth functions, (ii) the space is complete under an $L^p$-flavored norm, and (iii) one can embed into $L^q$ or Hölder…

Topology Crash Course

A Crash Course in Topology. This is a tour of the landmarks beyond Topological Spaces and Metric Spaces. Order of climb: first see how topologies are generated, then ascend the axiom ladder (separation, countability, compactness), then meet the invariants (algebraic topology,…