Xuanqiang Angelo Huang's Blog

Bloom Filters

How Bloom Filters Work # A Bloom filter is a space-efficient probabilistic data structure used to test whether an element is possibly in a set or definitely not in a set. It allows false positives but never false negatives. One example application is the membership query…
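
To make the "possibly in the set / definitely not in the set" behaviour concrete, here is a minimal sketch. It is not taken from the post itself: the class name `BloomFilter`, the parameters `num_bits` and `num_hashes`, and the choice of salted SHA-256 as the hash family are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter (hypothetical names, not the post's code)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the sketch simple; a real filter would pack bits.
        self.bits = bytearray(num_bits)

    def _positions(self, item):
        # Derive num_hashes positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        # Set every position that the item hashes to.
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False means "definitely not in the set";
        # True means "possibly in the set" (false positives can happen).
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter()
bf.add("apple")
bf.add("banana")
print(bf.might_contain("apple"))   # an added item is always reported as possibly present
print(bf.might_contain("cherry"))  # usually False, but a false positive is possible
```

Because `add` only ever sets bits and `might_contain` checks the same positions, an inserted element can never be reported as absent, which is the "no false negatives" guarantee.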

Uniform Resource Identifier

URI # URIs were THE real invention of Berners-Lee, as mentioned in Storia del web. The problem is having a way to uniquely identify a resource on the internet. Introduction # The resource # A resource is any structure that is exchanged between applications…

Markup

Introduction to the functions of markup # The semantics of a word is determined by my choice (a design decision about its meaning). That does not say much, so let's try to explain a bit more. We define markup as any means of making explicit a particular interpretation of a…

Massive Parallel Processing

We have a group of mappers that divide the keys among the reducers, which then work on the data grouped under those keys. The bottleneck is the assignment step: when the mappers finish and need to hand the data over to the reducers. Introduction # Common input formats # You need to…

Data Models and Validation

A data model is an abstract view over the data that hides the way it is stored physically, the same idea as in (Codd 1970). This is why we should not modify data directly, but pass through some abstraction that maintains the properties of that specific data model. Data Models # Tree…

Distributed file systems

We want to know how to handle systems that hold large amounts of data. In the previous lesson we saw how to quickly access data and build scalable systems of huge dimensions, see Cloud Storage. Object storage can store billions of files; we want to handle millions of…

The Market

Let's first consider a simple model for apartments in a college. Here we are interested in predicting the prices of the rooms and how to allocate them to students. For simplicity, we will assume that they are all identical except for the location, which can be inner or outer.…

Maximum Entropy Principle

The maximum entropy principle is one of the most important guiding ideas in artificial intelligence. Its roots emerge from a long tradition of probabilistic inference that goes back to Laplace and Occam's Razor, i.e. the principle of parsimony. Let's start with a…

Tabular Reinforcement Learning

This note extends the content of Markov Processes in this specific context. One nice expansion, which treats the field a little more from the behavioural sciences perspective, is Intrinsic Motivation and Playfulness. Standard notions # Explore-exploit dilemma # We have seen…

Counterfactual Invariance

Machine learning cannot distinguish between causal and environment features. Shortcut learning # Often we observe shortcut learning: the model learns some dataset-dependent shortcuts (e.g. the machine that was used to take the X-ray) to make inferences, but this is very brittle,…