Data Cubes

Data cubes are a data format especially useful for read-heavy workloads. They were popularized in business environments where data was mainly used to produce reports (many reads). This also ties into the OLAP (Online Analytical Processing) vs OLTP (Online Transaction Processing)…

December 20, 2024 · Reading Time: 4 minutes · By Xuanqiang Angelo Huang

Introduction to Big Data

Data Science is similar to physics: it attempts to create theories of reality based on a formalism that another science provides. For physics it was mathematics, for data science it is computer science. Data has grown rapidly in recent years and has reached a…

December 20, 2024 · Reading Time: 10 minutes · By Xuanqiang Angelo Huang

Lagrange Multipliers

This is also known as Lagrange Optimization or undetermined multipliers. Some of these notes are based on Appendix E of (Bishop 2006), others were found when studying bits of rational mechanics. Also (Boyd & Vandenberghe 2004) chapter 5 should be a good resource on this…

HTML

A bit of history. It is important to understand a bit of history to see what a strange thing we have today. There are two lines of development: one is a W3C standard, the other is the living standard. It changes very rapidly, so it is hard to define! (the meaning changes both in semantics and…

December 15, 2024 · Reading Time: 7 minutes · By Xuanqiang Angelo Huang

HTTP and REST

HTTP is the acronym for HyperText Transfer Protocol. Main characteristics (3). Client-server communication: once the data has been exchanged the connection is closed, and there are very good caching policies (for example with proxies). Generic: because it is a protocol…

Inverse Transform

NOTE: this is an old set of notes of rather poor quality; it should be completely rewritten. How can we transform a uniform random variable into a random variable with a given distribution? We have $F(x) = \int_{-\infty}^{x} f(t)\,dt$. Sometimes the density is not defined, while the cumulative function…

December 1, 2024 · Reading Time: 12 minutes · By Xuanqiang Angelo Huang

Querying Denormalized Data

TODO: write the introduction to the note. JSONiq purports to be an easy query language that can run everywhere. It attempts to solve common problems in SQL, namely the lack of support for nested data structures and for JSON data types. A nice thing about…

November 26, 2024 · Reading Time: 6 minutes · By Xuanqiang Angelo Huang

Probabilistic Parsing

Language Constituents. A constituent is a word or a group of words that functions as a single unit within a hierarchical structure. The notion is motivated by a lot of evidence pointing towards a hierarchical organization of human language. Example of constituents. Let's have…

Semirings

Semirings allow us to generalize many common operations. One of their most powerful uses is the algebraic view of dynamic programming. Definition of a semiring. A semiring is a 5-tuple $R = (A, \oplus, \otimes, \bar{0}, \bar{1})$ such that $(A, \oplus, \bar{0})$ is a commutative monoid, $(A, \otimes$…

Transliteration systems

This note is still a TODO. Transliteration is the task of learning a function that maps strings in one character set to strings in another character set. The basic example is multilingual applications, where the same string needs to be written in different languages. The…