Uniform Resource Identifier

URI # URIs were THE real invention of Berners-Lee, as mentioned in Storia del web. The problem is having a way to identify a resource uniquely on the internet. Introduction # The resource # A resource is any structure that is exchanged between applications…

Markup

Introduction to the functions of markup # The semantics of a word is determined by my choice (a design decision about its meaning). That does not say much, so let's try to say something more. We define markup as any means of making explicit a particular interpretation of a…

Massive Parallel Processing

We have a group of mappers that divide the keys among reducers, which then work on their own group of the data. The bottleneck is the assignment step: when mappers finish and need to hand the data over to the reducers. Introduction # Common input formats # You need to…

January 28, 2025 · Reading Time: 14 minutes · By Xuanqiang Angelo Huang

Data Models and Validation

A data model is an abstract view over the data that hides the way it is stored physically. This is the same idea as in (Codd 1970). This is why we should not modify data directly, but pass through some abstraction that maintains the properties of that specific data model. Data Models # Tree…

January 26, 2025 · Reading Time: 10 minutes · By Xuanqiang Angelo Huang

Distributed file systems

We want to know how to handle systems that hold a large amount of data. In the previous lesson we saw how to build scalable systems of huge dimensions with quick access, see Cloud Storage. Object storage could store billions of files; we want to handle millions of…

January 26, 2025 · Reading Time: 10 minutes · By Xuanqiang Angelo Huang

The Market

Let's first consider a simple model for apartments at a college. Here we are interested in predicting the prices of the rooms, and in how we can allocate them to students. For simplicity, we will assume that they are all identical except for the location, which could be inner or outer.…

January 24, 2025 · Reading Time: 6 minutes · By Xuanqiang Angelo Huang

Maximum Entropy Principle

The maximum entropy principle is one of the most important guiding motives in artificial intelligence. Its roots emerge from a long tradition of probabilistic inference that goes back to Laplace and Occam's Razor, i.e. the principle of parsimony. Let's start with a…

Tabular Reinforcement Learning

This note extends the content of Markov Processes in this specific context. One nice expansion, which treats the field a little more from the behavioural sciences perspective, is Intrinsic Motivation and Playfulness. Standard notions # Explore-exploit dilemma # We have seen…

Counterfactual Invariance

Machine learning cannot distinguish between causal and environment features. Shortcut learning # Often we observe shortcut learning: the model learns some dataset-dependent shortcuts (e.g. the machine that was used to take the X-ray) to make inferences, but this is very brittle,…

January 18, 2025 · Reading Time: 13 minutes · By Xuanqiang Angelo Huang

Performance at Large Scales

Some phenomena in modern systems appear only when we scale to large systems. This note gathers some observations about the most important phenomena we observe at these scales. Tail Latency Phenomenon # Tail latency refers to the high-end response time…

January 18, 2025 · Reading Time: 3 minutes · By Xuanqiang Angelo Huang