Log Linear Models

Log Linear Models can be considered the most basic models used in natural language processing. The main idea is to model the correlations in our data, or how the posterior $p(y \mid x)$ varies, where $x$ holds the features of a single data point and $y$ is the label of interest. This is a form of generalization, because contextualized events $(x, y)$ with similar descriptions tend to have similar probabilities. These kinds of models are so common that they have been discovered in many fields (and have thus taken on different names): some of the most famous are Gibbs distributions, undirected graphical models, Markov Random Fields or Conditional Random Fields, exponential models, and (regularized) maximum entropy models. Special cases include logistic regression and Boltzmann machines. ...
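
As a sketch of the usual form (the feature function $f$ and the weight vector $\theta$ are notation assumed here, not taken from the excerpt), a log-linear model writes the posterior as:

$$
p(y \mid x; \theta) = \frac{\exp\big(\theta \cdot f(x, y)\big)}{\sum_{y'} \exp\big(\theta \cdot f(x, y')\big)}
$$

The unnormalized log-probability $\theta \cdot f(x, y)$ is linear in $\theta$, which is where the name comes from; the denominator only normalizes over the candidate labels $y'$.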

5 min · Xuanqiang 'Angelo' Huang

Paging and Segmentation

Operating system memory. See Memoria virtuale for how pages are replaced. In this section we discuss how many processes can run at the same time even though they share the same physical memory. We therefore cover address spaces, the resolution of logical addresses, segmentation, and paging (and much more!). MMU: checks whether a memory access is valid or not (translation between logical and physical addresses). ...
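
The translation step performed by the MMU can be sketched as follows. This is a minimal illustration under loud assumptions: a single-level page table, 4 KiB pages, and a hypothetical `PageTableEntry` layout; none of these names come from the note itself, and real hardware uses more elaborate structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BITS   12u                    /* assumption: 4 KiB pages */
#define PAGE_SIZE   (1u << PAGE_BITS)
#define OFFSET_MASK (PAGE_SIZE - 1u)

/* Hypothetical page table entry, for illustration only. */
typedef struct {
    uint32_t frame;  /* physical frame number */
    bool     valid;  /* is the page currently mapped? */
} PageTableEntry;

/* Translate a logical address into a physical one.
 * Returns false when the access is invalid (page outside the table
 * or not mapped); real hardware would raise a fault instead. */
bool translate(const PageTableEntry *table, size_t n_pages,
               uint32_t logical, uint32_t *physical)
{
    assert(table != NULL && physical != NULL);

    uint32_t page   = logical >> PAGE_BITS;   /* high bits: page number */
    uint32_t offset = logical & OFFSET_MASK;  /* low bits: offset in page */

    if (page >= n_pages || !table[page].valid) {
        return false;
    }
    *physical = (table[page].frame << PAGE_BITS) | offset;
    return true;
}
```

Each process would get its own table, which is how identical logical addresses in two processes can end up in different physical frames.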

8 min · Xuanqiang 'Angelo' Huang

Part of Speech Tagging

What is a part of speech? A part of speech (POS) is a category of words that display similar syntactic behavior, i.e., they play similar roles within the grammatical structure of sentences. It has been known since the Latin era that some categories of words behave similarly (verbs and their conjugations, for example). The intuitive take is that knowing the part of speech of each word can help in understanding the meaning of the sentence. ...

5 min · Xuanqiang 'Angelo' Huang

Recurrent Neural Networks

Recurrent Neural Networks allow us to model arbitrarily long sequence dependencies, at least in theory. This is very handy, and has many interesting theoretical implications. But here we are also interested in practical applicability, so we need to analyze the common architectures used to implement these models, their main limitations and drawbacks, their nice properties, and some applications. These networks can be seen as chaotic systems (non-linear dynamical systems), see Introduction to Chaos Theory. ...

4 min · Xuanqiang 'Angelo' Huang

Semirings

Semirings allow us to generalize many common operations. One of the most powerful uses is the algebraic view of dynamic programming. Definition of a semiring: a semiring is a 5-tuple $R = (A, \oplus, \otimes, \bar{0}, \bar{1})$ such that $(A, \oplus, \bar{0})$ is a commutative monoid, $(A, \otimes, \bar{1})$ is a monoid, $\otimes$ distributes over $\oplus$, and $\bar{0}$ is an annihilator for $\otimes$. Monoid: let $(K, \oplus)$ be a set and an operation, then: ...
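
To make the 5-tuple concrete, here is a minimal sketch in C. The `Semiring` struct and the function names are hypothetical, chosen only for illustration. The same accumulation loop computes a sum of products under the real semiring and a minimum of sums under the tropical semiring $(\mathbb{R} \cup \{+\infty\}, \min, +, +\infty, 0)$, which is the kind of reuse the algebraic view of dynamic programming relies on.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical representation of (A, plus, times, zero, one) over doubles. */
typedef struct {
    double (*plus)(double, double);   /* the "oplus" operation */
    double (*times)(double, double);  /* the "otimes" operation */
    double zero;                      /* identity of plus, annihilator of times */
    double one;                       /* identity of times */
} Semiring;

static double add(double a, double b)  { return a + b; }
static double mul(double a, double b)  { return a * b; }
static double min2(double a, double b) { return a < b ? a : b; }

/* Real semiring: (R, +, *, 0, 1). */
Semiring real_semiring(void)
{
    Semiring s = { add, mul, 0.0, 1.0 };
    return s;
}

/* Tropical semiring: (R with +inf, min, +, +inf, 0). */
Semiring tropical_semiring(void)
{
    Semiring s = { min2, add, INFINITY, 0.0 };
    return s;
}

/* Generalized dot product: plus-accumulate times(a[i], b[i]) over i. */
double semiring_dot(const Semiring *s, const double *a, const double *b, size_t n)
{
    assert(s != NULL);
    double acc = s->zero;  /* the empty sum is the identity of plus */
    for (size_t i = 0; i < n; i++) {
        acc = s->plus(acc, s->times(a[i], b[i]));
    }
    return acc;
}
```

Swapping the semiring changes what the loop computes without touching the loop itself.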

3 min · Xuanqiang 'Angelo' Huang

Sentiment Analysis

Sentiment analysis is one of the oldest tasks in natural language processing. In this note we will introduce some examples and terminology, some key problems in the field, and a simple model that we can understand by just knowing Backpropagation, Log Linear Models and the Softmax Function. We say: Polarity: the orientation of the sentiment. Subjectivity: whether it expresses personal feelings. See demo. Some applications: businesses use sentiment analysis to understand whether users are happy with their product. It’s linked to revenue: if the reviews are good, you usually make more money. But companies can’t read every review, so they want automatic methods. ...

2 min · Xuanqiang 'Angelo' Huang

Sparse Matrix Vector Multiplication

Algorithms for Sparse Matrix-Vector Multiplication. Compressed Sparse Row 🟨– This is an optimized way to store rows for sparse matrices. Sparse MVM using CSR:

    void smvm(int m, const double* values, const int* col_idx,
              const int* row_start, double* x, double* y)
    {
        int i, j;
        double d;

        /* Loop over m rows */
        for (i = 0; i < m; i++) {
            d = y[i]; /* Scalar replacement since reused */

            /* Loop over non-zero elements in row i */
            for (j = row_start[i]; j < row_start[i + 1]; j++) {
                d += values[j] * x[col_idx[j]];
            }
            y[i] = d;
        }
    }

Let’s analyze this code. Spatial locality: with respect to row_start, col_idx and values we have spatial locality. Temporal locality: with respect to y we have temporal locality (poor temporal locality with respect to $x$). Good storage efficiency for the sparse matrix. But it is 2x slower than the dense matrix multiplication when the matrix is dense. Block CSR: but we cannot do block optimizations for the cache with this storage method. ...
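
To make the CSR layout itself concrete, the sketch below fills the three arrays that `smvm` consumes (`values`, `col_idx`, `row_start`) from a row-major dense matrix. The helper `dense_to_csr` is a hypothetical name introduced here for illustration; it is not part of the original note.

```c
#include <assert.h>

/* Fill CSR arrays from a row-major dense m x n matrix.
 * values and col_idx must have room for every non-zero entry;
 * row_start needs m + 1 entries.
 * Returns the number of non-zeros stored. */
int dense_to_csr(int m, int n, const double *dense,
                 double *values, int *col_idx, int *row_start)
{
    int nnz = 0;
    assert(m >= 0 && n >= 0);

    for (int i = 0; i < m; i++) {
        row_start[i] = nnz;  /* row i begins here in values/col_idx */
        for (int j = 0; j < n; j++) {
            double v = dense[i * n + j];
            if (v != 0.0) {
                values[nnz] = v;   /* the non-zero value */
                col_idx[nnz] = j;  /* its column index */
                nnz++;
            }
        }
    }
    row_start[m] = nnz;  /* sentinel: one past the end of the last row */
    return nnz;
}
```

For the 2 x 3 matrix with rows `[1 0 2]` and `[0 0 3]`, this gives `values = {1, 2, 3}`, `col_idx = {0, 2, 2}` and `row_start = {0, 2, 3}`. The entries of row `i` sit contiguously between `row_start[i]` and `row_start[i + 1]`, which is what gives the inner loop its spatial locality.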

3 min · Xuanqiang 'Angelo' Huang

The Exponential Family

This is the generalization of the family of functions to which the Softmax Function belongs. Many functions are part of this family: most of the distributions used in science belong to the exponential family, e.g. the Gaussian, Bernoulli, Categorical, Gamma, Beta, and Poisson distributions. The useful thing is the generalization power of this set of functions: if you prove something about this family, you prove it for every distribution that is part of it. This family of functions is also closely linked to Generalized Linear Models (GLMs). ...
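
As a sketch of what membership means, using the standard notation $h$, $T$, $\eta$, $A$ (assumed here, not taken from the excerpt), a distribution is in the exponential family when its density can be written as:

$$
p(x \mid \eta) = h(x) \exp\big(\eta^\top T(x) - A(\eta)\big)
$$

For instance, the Bernoulli distribution $p^x (1 - p)^{1 - x}$ fits this form with $h(x) = 1$, $T(x) = x$, $\eta = \log \frac{p}{1 - p}$ and $A(\eta) = \log(1 + e^{\eta})$.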

6 min · Xuanqiang 'Angelo' Huang

Transliteration systems

This note is still a TODO. Transliteration is learning a function that maps strings in one character set to strings in another character set. The basic example is in multilingual applications, where the same string needs to be written in different languages. The goal is to develop a probabilistic model that can map strings from an input vocabulary $\Sigma$ to an output vocabulary $\Omega$. We will extend the concepts presented in Automi e Regexp for finite state automata to a weighted version. You will also need knowledge from Descrizione linguaggio for the definitions of alphabets, strings, and Kleene star operations. ...

4 min · Xuanqiang 'Angelo' Huang

Cloud Storage

Object Stores. Characteristics of Cloud Systems. Object storage design principles 🟨++ We don’t want the hierarchy that is common in filesystems, so we simplify it and rely on these four principles: black-box objects; a flat and global key-value model (trivial model, easy to access, without the need to traverse a file hierarchy); flexible metadata; commodity hardware (the battery idea of Tesla until 2017). Object storage usages 🟩 Object stores are useful for storing things that are usually read-intensive. Some examples are ...

18 min · Xuanqiang 'Angelo' Huang