Kernel Methods

As we will briefly see, kernels play an important role in many machine learning applications. In this note we will see what kernels are and why they are useful. Intuitively, they measure the similarity between two input points. So if they are close the kernel should…

Apache Spark

This is a newer framework that is faster than MapReduce (see Massive Parallel Processing). It is written in Scala and takes a more functional approach to programming. Spark extends the earlier MapReduce framework to a generic distributed dataflow, modeled as a DAG. There…

December 27, 2024 · Reading Time: 9 minutes · By Xuanqiang Angelo Huang

Multi Variable Derivatives

Multi-variable derivative. For people who are not used to matrix derivatives (like me), it can be useful to see why, for a symmetric matrix $S$, $\frac{\partial\, u^T S u}{\partial u} = 2 S u$. First, we note that if you differentiate with respect to some matrix, the output has the same dimensions as that matrix. That notation…

December 26, 2024 · Reading Time: 8 minutes · By Xuanqiang Angelo Huang

Demand

Here we analyze how demand changes when prices and income change. Types of Goods. Here we define two main types of goods: Normal Goods: demand increases linearly with income. Inferior or Ordinary Goods: demand decreases when income is higher; one example…

December 26, 2024 · Reading Time: 2 minutes · By Xuanqiang Angelo Huang

Gaussian Processes

Gaussian processes can be viewed through a Bayesian lens of the function space: rather than sampling over individual data points, we are now sampling over entire functions. They extend the idea of Bayesian linear regression by introducing an infinite number of feature functions…

Budget and Preferences

Budget. A definition for Budget. Economists want simple models as a starting point. One of the things we will model here is how to describe which goods you can afford. Budget Set. The easy way is to define a vector space of possible goods $X$, where its…

December 25, 2024 · Reading Time: 14 minutes · By Xuanqiang Angelo Huang

Cross Validation and Model Selection

There is a big difference between the empirical score and the expected score; we said something about this at the beginning, in Introduction to Advanced Machine Learning. We will develop more methods to better understand these fundamental principles. How can we estimate the…

December 24, 2024 · Reading Time: 6 minutes · By Xuanqiang Angelo Huang

Rademacher Complexity

This note uses the definitions presented in Provably Approximately Correct Learning, so go there when you encounter a term you don't know, or search online. Rademacher Complexity. Given a hypothesis set $H$, we define a family of loss functions as: G = { g : (x, y) → L(h(…

December 21, 2024 · Reading Time: 2 minutes · By Xuanqiang Angelo Huang

Structured Query Language

A little history. It was invented in 1970 in Almaden (San Jose) by IBM (Don Chamberlin and Raymond Boyce worked on it) for the first relational database, called System R. Because of trademark issues it could not be called SEQUEL, so it was branded as SQL. SQL is a…

Markov Chains

Introduction to Markov chains. The Markov property. A sequence of random variables $X_1, X_2, X_3, \dots$ has the Markov property if: $P(X_n \mid X_{n-1}, X_{n-2}, \dots, X_1) = P(X_n \mid X_{n-1})$. That is, I can forget the entire history…