Generative Adversarial Networks

Generative Adversarial Networks were introduced in 2014 by Ian Goodfellow (at that time the generated images were still grayscale). Image quality has since improved enormously, most recently with Diffusion Models. Yann LeCun has described this as one of the most interesting ideas in machine learning in recent years. Nowadays (2025) GANs are still used for super-resolution and other applications, but they retain some limitations (mainly training stability) and now face strong competition from other models. The resolution achieved by GANs is much higher than that of VAEs (see Autoencoders#Variational Autoencoders). They are also an easy plugin to improve the results of other models (VAE, flow, Diffusion). Even ChatGPT uses some form of adversarial learning, for example, though not in the same sense as described here. ...
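For reference, the core of the adversarial setup is the minimax game from the original 2014 paper, in which a generator $G$ and a discriminator $D$ are trained against each other (standard notation, not taken from this excerpt):

$$\min_{G} \max_{D} \; \mathbb{E}_{x \sim p_{\text{data}}}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_{z}}\left[\log\left(1 - D(G(z))\right)\right]$$

Here $p_z$ is the noise prior fed to the generator; the instability mentioned above comes from solving this saddle-point problem by alternating gradient steps.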

8 min · Xuanqiang 'Angelo' Huang

HTTP and REST

HTTP stands for HyperText Transfer Protocol. Main characteristics (3): communication happens between client and server, the connection is closed once the content has been exchanged, and there are very good caching policies (e.g., via proxies); generic, because the protocol is used to load a great many kinds of resources; stateless, meaning no information about past exchanges is kept, something we touched on in Sicurezza delle reti when discussing stateless firewalls. We can usually think of this protocol as a means of exchanging the resources discussed in Uniform Resource Identifier. ...
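A minimal sketch of one such stateless exchange, not from the note: a raw HTTP/1.0 GET over a POSIX TCP socket, where "example.com" is a placeholder host and "Connection: close" makes the server close the connection after one response.

```c
/* Minimal sketch: one stateless HTTP/1.0 request-response exchange.
   Assumes POSIX sockets; "example.com" is a placeholder host. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <netdb.h>
#include <sys/socket.h>

int main(void) {
    struct addrinfo hints = {0}, *res;
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo("example.com", "80", &hints, &res) != 0) return 1;

    int fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
    if (fd < 0 || connect(fd, res->ai_addr, res->ai_addrlen) < 0) return 1;

    /* "Connection: close": the server closes after one response,
       matching the close-after-exchange behavior described above. */
    const char *req = "GET / HTTP/1.0\r\n"
                      "Host: example.com\r\n"
                      "Connection: close\r\n\r\n";
    write(fd, req, strlen(req));

    char buf[4096];
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        fwrite(buf, 1, (size_t)n, stdout); /* status line, headers, body */

    close(fd);
    freeaddrinfo(res);
    return 0;
}
```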

6 min · Xuanqiang 'Angelo' Huang

Language Models

In order to understand language models we need to understand structured prediction. If you are familiar with Sentiment Analysis, where a given input text must be classified in a binary manner, here instead the output space usually scales exponentially. The output has some structure: for example, it could be a tree, a set of words, etc. This field usually sits at the intersection of statistics and computer science. ...
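To make the exponential scaling concrete (standard notation, not from this excerpt): a language model factorizes the probability of a sequence autoregressively,

$$p(w_1, \dots, w_T) = \prod_{t=1}^{T} p(w_t \mid w_1, \dots, w_{t-1})$$

and with vocabulary $V$ the number of possible length-$T$ outputs is $|V|^T$, unlike the two outputs of a binary classifier.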

2 min · Xuanqiang 'Angelo' Huang

Log Linear Models

Log Linear Models can be considered one of the most basic models used in natural language processing. The main idea is to model the correlations in our data, i.e., how the posterior $p(y \mid x)$ varies, where $x$ is the feature vector of a single data point and $y$ is the label of interest. This is a form of generalization because contextualized events $(x, y)$ with similar descriptions tend to have similar probabilities. These models are so common that they have been discovered independently in many fields (and thus go by different names): some of the most famous are Gibbs distributions, undirected graphical models, Markov Random Fields or Conditional Random Fields, exponential models, and (regularized) maximum entropy models. Special cases include logistic regression and Boltzmann machines. ...
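The standard parameterization (notation assumed, not taken from this excerpt) expresses the posterior through a feature function $\mathbf{f}(x, y)$ and a weight vector $\boldsymbol{\theta}$:

$$p(y \mid x) = \frac{\exp\left(\boldsymbol{\theta}^{\top} \mathbf{f}(x, y)\right)}{\sum_{y' \in \mathcal{Y}} \exp\left(\boldsymbol{\theta}^{\top} \mathbf{f}(x, y')\right)}$$

Taking the logarithm makes the model linear in $\boldsymbol{\theta}$, which is where the name comes from.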

5 min · Xuanqiang 'Angelo' Huang

Paging and Segmentation

Operating system memory. See Memoria virtuale for how pages are replaced. In this section we discuss how many processes manage to run at the same time even though they share the same physical memory space. We therefore talk about address spaces, the resolution of these logical addresses, segmentation, and paging (and much more!). The MMU checks whether a memory access is valid or not (translating between logical and physical addresses). ...
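A minimal sketch of the logical-to-physical translation the MMU performs, not from the note: assuming 4 KiB pages (12 offset bits) and a toy one-level page table whose frame numbers are made up for illustration.

```c
/* Sketch of MMU-style address translation: split a logical address
   into page number and offset, then look the page up in a page table.
   Assumes 4 KiB pages; the page table contents are made-up toy values. */
#include <stdio.h>
#include <stdint.h>

#define OFFSET_BITS 12
#define PAGE_SIZE   (1u << OFFSET_BITS)

int main(void) {
    /* page_table[p] = physical frame holding logical page p */
    uint32_t page_table[] = {5, 2, 7, 0};

    uint32_t logical  = 0x1A3C;                    /* example logical address */
    uint32_t page     = logical >> OFFSET_BITS;    /* upper bits: page number */
    uint32_t offset   = logical & (PAGE_SIZE - 1); /* lower bits: offset      */
    uint32_t physical = (page_table[page] << OFFSET_BITS) | offset;

    /* prints: page=1 offset=0xA3C -> physical=0x2A3C */
    printf("page=%u offset=0x%X -> physical=0x%X\n", page, offset, physical);
    return 0;
}
```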

8 min · Xuanqiang 'Angelo' Huang

Part of Speech Tagging

What is a part of speech? A part of speech (POS) is a category of words that display similar syntactic behavior, i.e., they play similar roles within the grammatical structure of sentences. It has been known since the Latin era that some categories of words behave similarly (conjugation patterns for verbs, for example). The intuitive take is that knowing a word's part of speech helps in understanding the meaning of the sentence: "duck" read as a noun and "duck" read as a verb give two very different readings. ...

5 min · Xuanqiang 'Angelo' Huang

Recurrent Neural Networks

Recurrent Neural Networks allow us to model arbitrarily long sequence dependencies, at least in theory. This is very handy, and has many interesting theoretical implications. But here we are also interested in practical applicability, so we need to analyze the common architectures used to implement these models, their main limitations and drawbacks, their nice properties, and some applications. These networks can be seen as chaotic systems (non-linear dynamical systems); see Introduction to Chaos Theory. ...
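The recurrence behind both claims, in the standard Elman form (notation assumed, not from this excerpt): the hidden state at time $t$ depends on the entire history through repeated application of

$$\mathbf{h}_t = \sigma\left(W_{hh}\,\mathbf{h}_{t-1} + W_{xh}\,\mathbf{x}_t + \mathbf{b}\right)$$

which is exactly an iterated non-linear map, the usual starting point for the dynamical-systems view mentioned above.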

4 min · Xuanqiang 'Angelo' Huang

Semirings

Semirings allow us to generalize many common operations. One of the most powerful usages is the algebraic view of dynamic programming. Definition of a semiring: a semiring is a 5-tuple $R = (A, \oplus, \otimes, \bar{0}, \bar{1})$ such that $(A, \oplus, \bar{0})$ is a commutative monoid, $(A, \otimes, \bar{1})$ is a monoid, $\otimes$ distributes over $\oplus$, and $\bar{0}$ is an annihilator for $\otimes$. Monoid: let $(K, \oplus)$ be a set and an operation; then: ...
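A minimal sketch of the algebraic view of dynamic programming, mine rather than the note's: the same fold over two-hop paths computes a shortest-path cost under the tropical $(\min, +)$ semiring and a sum of path weights under $(+, \times)$. The graph weights are made up for illustration.

```c
/* One dynamic program, two semirings: only oplus/otimes change. */
#include <stdio.h>
#include <float.h>

typedef struct {
    double (*plus)(double, double);  /* oplus  */
    double (*times)(double, double); /* otimes */
    double zero;                     /* identity of oplus  */
    double one;                      /* identity of otimes */
} semiring;

static double min2(double a, double b) { return a < b ? a : b; }
static double addf(double a, double b) { return a + b; }
static double mulf(double a, double b) { return a * b; }

#define N 3

/* Fold the weights of all length-2 paths 0 -> k -> 2 with oplus. */
static double two_hop(const semiring *s, double w[N][N]) {
    double acc = s->zero;
    for (int k = 0; k < N; k++)
        acc = s->plus(acc, s->times(w[0][k], w[k][2]));
    return acc;
}

int main(void) {
    semiring tropical = { min2, addf, DBL_MAX, 0.0 }; /* (min, +)   */
    semiring counting = { addf, mulf, 0.0,     1.0 }; /* (+, times) */

    double w[N][N] = {
        {0.0, 1.0, 4.0},
        {9.0, 0.0, 2.0},
        {9.0, 9.0, 0.0},
    };
    printf("shortest 0->2 in two hops: %g\n", two_hop(&tropical, w)); /* 3 */
    printf("sum of path weights:       %g\n", two_hop(&counting, w)); /* 2 */
    return 0;
}
```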

3 min · Xuanqiang 'Angelo' Huang

Sentiment Analysis

Sentiment analysis is one of the oldest tasks in natural language processing. In this note we introduce some examples and terminology, some key problems in the field, and a simple model that can be understood knowing just Backpropagation, Log Linear Models, and the Softmax Function. We say: polarity is the orientation of the sentiment; subjectivity is whether it expresses personal feelings. See demo. Some applications: businesses use sentiment analysis to understand whether users are happy with their product. It is linked to revenue: if the reviews are good, you usually make more money. But companies can't read every review, so they want automatic methods. ...
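A minimal version of the model hinted at here (an assumed shape combining the referenced pieces, not the note's exact model): score each polarity class with a log-linear model over text features $\mathbf{f}(x)$ and normalize with the softmax,

$$p(y \mid x) = \frac{\exp\left(\boldsymbol{\theta}_y^{\top} \mathbf{f}(x)\right)}{\sum_{y' \in \{\text{pos}, \text{neg}\}} \exp\left(\boldsymbol{\theta}_{y'}^{\top} \mathbf{f}(x)\right)}$$

with the weights $\boldsymbol{\theta}_y$ trained by backpropagation on labeled reviews.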

2 min · Xuanqiang 'Angelo' Huang

Sparse Matrix Vector Multiplication

Algorithms for Sparse Matrix-Vector Multiplication. Compressed Sparse Row 🟨 is an optimized way to store the rows of sparse matrices. Sparse MVM using CSR:

```c
void smvm(int m, const double* values, const int* col_idx,
          const int* row_start, double* x, double* y) {
    int i, j;
    double d;
    /* Loop over m rows */
    for (i = 0; i < m; i++) {
        d = y[i]; /* Scalar replacement since reused */
        /* Loop over non-zero elements in row i */
        for (j = row_start[i]; j < row_start[i + 1]; j++) {
            d += values[j] * x[col_idx[j]];
        }
        y[i] = d;
    }
}
```

Let's analyze this code: we have spatial locality with respect to row_start, col_idx, and values; we have temporal locality with respect to y (but poor temporal locality with respect to $x$); and good storage efficiency for the sparse matrix. However, it is 2x slower than dense matrix multiplication when the matrix is actually dense. Block CSR: we cannot do block optimizations for the cache with this storage method. ...
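A small usage sketch, not from the note: the CSR arrays below encode a hypothetical 3x3 matrix with non-zeros 10, 20, 30, 40, and y must be initialized before the call since smvm accumulates into it.

```c
/* Usage sketch for smvm: the 3x3 sparse matrix
       [10  0  0]
       [ 0 20 30]
       [ 0  0 40]
   in CSR form, multiplied into y (initialized to zero here). */
#include <stdio.h>

void smvm(int m, const double* values, const int* col_idx,
          const int* row_start, double* x, double* y);

int main(void) {
    double values[]    = {10, 20, 30, 40}; /* non-zeros, row by row   */
    int    col_idx[]   = {0, 1, 2, 2};     /* column of each non-zero */
    int    row_start[] = {0, 1, 3, 4};     /* row i spans [row_start[i], row_start[i+1]) */
    double x[] = {1, 1, 1};
    double y[] = {0, 0, 0};

    smvm(3, values, col_idx, row_start, x, y);
    printf("y = [%g, %g, %g]\n", y[0], y[1], y[2]); /* [10, 50, 40] */
    return 0;
}
```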

3 min · Xuanqiang 'Angelo' Huang