NOTE: this is an old set of notes of quite poor quality; it should be completely rewritten.

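How can we transform a uniform random variable into a random variable with a given distribution? It is true that, if $U \sim \text{Unif}(0, 1)$ and $F$ is an invertible CDF, then $X = F^{-1}(U)$ has CDF $F$.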

Sometimes the density is not defined while the cumulative distribution function is; this is why we often start from the definition of the CDF.

Suppose we have a random variable $X \sim F$, where $F(x) = P(X \leq x)$ is its cumulative distribution function. This is the same object we saw many times in other courses, for example the normal cumulative distribution function.

Generalized inverse

This is known as the universality of the uniform: a random variable with any invertible CDF $F$ can be written as $F^{-1}(U)$, with $U$ uniform on $(0, 1)$.

Definition

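We want an inverse of the CDF so that we can map uniform draws to draws of $X$, but a CDF can have flat parts or jumps, where the ordinary inverse does not exist. The generalized inverse is monotone, so the transformation $F^{-}(U)$ is still a random variable. The definition of the generalized inverse is:

$$F^{-}(u) = \inf\{x \in \mathbb{R} : F(x) \geq u\}, \qquad u \in (0, 1)$$

This makes sense because $F$ is non-decreasing and right-continuous, so the infimum is attained; the generalized inverse itself does not need to be continuous. It is useful because, given a cumulative distribution function, it lets us recreate a random variable with the original distribution, and it is quite easy to get it in this way.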
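A minimal sketch of the generalized inverse $\inf\{x : F(x) \geq u\}$, computed numerically by bisection. The function name `generalized_inverse`, the bracketing interval `lo`/`hi` and the example `mixed_cdf` are assumptions made for illustration; the sketch only assumes that the CDF is non-decreasing and that the answer lies in the bracket.

```python
def generalized_inverse(cdf, u, lo, hi, tol=1e-9):
    """Approximate inf{x in [lo, hi] : cdf(x) >= u} by bisection.

    Assumes cdf is non-decreasing, cdf(lo) < u and cdf(hi) >= u.
    lo and hi are a hypothetical bracket chosen by the caller.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) >= u:
            hi = mid   # mid satisfies the condition, the infimum is at or below mid
        else:
            lo = mid   # mid is too small, the infimum is above mid
    return hi


def mixed_cdf(x):
    """Example CDF with a jump: mass 0.5 at the point 0, the rest uniform on (0, 1)."""
    if x < 0:
        return 0.0
    return min(1.0, 0.5 + 0.5 * x)
```

Because of the jump at 0, every `u` up to 0.5 is mapped to (approximately) 0, which is exactly the behaviour the ordinary inverse cannot express.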

Probability Inverse transform

Goal: sample from a distribution with CDF $F$ when it is difficult to sample from that distribution directly (for example because the function is difficult to calculate or difficult to implement).

Requirements:

  1. I know the functional form of $F$.
  2. $F$ is continuous.

For the exam you will be given the cumulative distribution and you need to use this theorem to sample

Theorem

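Let $X$ be a random variable with continuous CDF $F$, and let $U \sim \text{Unif}(0, 1)$. Then $F^{-}(U)$ has the same distribution as $X$:

$$P(F^{-}(U) \leq x) = P(U \leq F(x)) = F(x)$$

The last equality is just the cumulative distribution function of the uniform; the catch is the first equality, which needs $F^{-}(u) \leq x \iff u \leq F(x)$.

  1. If $F$ is continuous and strictly increasing then it is invertible, and $F^{-} = F^{-1}$.
  2. If $U = F(X)$ then the c.d.f. of $U$ is easy, and it is equal to $P(U \leq u) = u$ for $u \in [0, 1]$.
  3. Property 2 is also called the probability integral transform (PIT). See here for a proof (it's cool).

The uniform on $(0, 1)$ is the only distribution with this property, that its CDF is the identity on $[0, 1]$.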

Proof 1

$$P(F^{-1}(U) \leq x) = P(F(F^{-1}(U)) \leq F(x)) = P(U \leq F(x)) = F(x)$$

In the first step we used that the cumulative distribution function is monotonically increasing, in the second that $F(F^{-1}(u)) = u$, and in the last the CDF of the uniform distribution, $P(U \leq u) = u$.

Proof 2

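A proof of point 3 (the PIT): for $X$ with continuous, invertible CDF $F$ and $u \in (0, 1)$,

$$P(F(X) \leq u) = P(X \leq F^{-1}(u)) = F(F^{-1}(u)) = u$$

so $U = F(X) \sim \text{Unif}(0, 1)$. After you have this you can just use the inverse:

$$X = F^{-1}(U)$$

So now you can sample.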
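A quick numerical sanity check of the probability integral transform, as a sketch: if $X$ has CDF $F$, then $F(X)$ should look uniform on $(0, 1)$, so its mean should be near $1/2$ and its variance near $1/12$. The exponential distribution, the rate value and the names `exp_cdf`, `pit_mean` and `pit_var` are assumptions chosen only for illustration.

```python
import math
import random


def exp_cdf(x, rate):
    """CDF of the exponential distribution with the given rate."""
    return 1.0 - math.exp(-rate * x)


rng = random.Random(0)   # fixed seed, only for reproducibility
rate = 2.0               # hypothetical rate parameter
xs = [rng.expovariate(rate) for _ in range(100_000)]
us = [exp_cdf(x, rate) for x in xs]   # PIT: these should look Unif(0, 1)

pit_mean = sum(us) / len(us)                               # should be close to 1/2
pit_var = sum((u - pit_mean) ** 2 for u in us) / len(us)   # should be close to 1/12
```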

Main Proof

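This proof is my favorite. Recall the change of variables formula for densities: if $y = g(u)$ with $g$ invertible, then

$$p_Y(y) = p_U(u) \left| \frac{du}{dy} \right|$$

Suppose now $U$ is uniform over $(0, 1)$, meaning it has density $p_U(u) = 1$ on $(0, 1)$ and 0 elsewhere, and $u = F(y)$, where $F$ is the target CDF with density $f$. Then plugging everything in we see that:

$$p_Y(y) = 1 \cdot \left| \frac{dF(y)}{dy} \right| = f(y)$$

This implies that if we use the reverse transformation $y = F^{-1}(u)$ for $u \sim \text{Unif}(0, 1)$, we can effectively sample from $f$ starting from the uniform distribution. This needs two main ingredients:

  1. The function $F$ must be invertible.
  2. The derivative $F' = f$ must exist and be well behaved, i.e. $F$ must be continuous and differentiable.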

Example of application

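We are given $X \sim \text{Exp}(\lambda)$, with CDF $F(x) = 1 - e^{-\lambda x}$ for $x \geq 0$, and $U \sim \text{Unif}(0, 1)$. Find the value of the random variable $X$ as a function of $U$.

Solution example:

  1. Take $u \in (0, 1)$, then set up the equation $u = F(x)$ and solve for $x$:

$$u = 1 - e^{-\lambda x} \implies x = -\frac{1}{\lambda} \log(1 - u)$$

We can notice that $U$ and $1 - U$ are both uniform on $(0, 1)$, so the above has the same distribution as $-\frac{1}{\lambda} \log(U)$. In this way you get the correct random variable: we can sample from the uniform and get the correct result. This is the direct transformation method. With this process we proved that

$$-\frac{1}{\lambda} \log(U) \sim \text{Exp}(\lambda)$$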
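The direct transformation method can be sketched in a few lines. The example below assumes an exponential target with CDF $F(x) = 1 - e^{-\lambda x}$; the function name `sample_exponential` and the rate value are hypothetical.

```python
import math
import random


def sample_exponential(rate, rng):
    """Draw one Exp(rate) sample by inverting F(x) = 1 - exp(-rate * x)."""
    u = rng.random()                    # u in [0, 1)
    return -math.log(1.0 - u) / rate    # 1 - u is in (0, 1], so the log is defined


rng = random.Random(0)                  # fixed seed, only for reproducibility
rate = 2.0                              # hypothetical rate parameter
samples = [sample_exponential(rate, rng) for _ in range(100_000)]
sample_mean = sum(samples) / len(samples)   # should be close to 1 / rate
```

The sketch keeps `1 - u` instead of `u` because `random()` can return exactly 0.0, where the logarithm is undefined; in distribution the two choices are the same.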

Famous CDFs

Logistic function

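This is a function very similar to the sigmoid used in Logistic Regression. We have $X \sim \text{Logistic}(\mu, \beta)$, then

$$F(x) = \frac{1}{1 + e^{-(x - \mu)/\beta}}$$

Let's make the derivation:

$$u = \frac{1}{1 + e^{-(x - \mu)/\beta}} \implies e^{-(x - \mu)/\beta} = \frac{1 - u}{u} \implies x = \mu + \beta \log\frac{u}{1 - u}$$

Where $\mu$ and $\beta$ are the location and scale parameters of the distribution.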
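A sketch of the resulting sampler, assuming the logistic CDF $F(x) = 1 / (1 + e^{-(x - \mu)/\beta})$, whose inverse is $x = \mu + \beta \log(u / (1 - u))$. The names `mu` and `beta` for location and scale, and the function name `sample_logistic`, are assumptions made here.

```python
import math
import random


def sample_logistic(mu, beta, rng):
    """Draw one Logistic(mu, beta) sample via x = mu + beta * log(u / (1 - u))."""
    u = rng.random()
    while u == 0.0:          # random() can return 0.0, where the logit is undefined
        u = rng.random()
    return mu + beta * math.log(u / (1.0 - u))
```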

Cauchy distribution

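How to simulate data from a Cauchy distribution? The standard Cauchy has CDF

$$F(x) = \frac{1}{2} + \frac{1}{\pi} \arctan(x)$$

The derivation

$$u = \frac{1}{2} + \frac{1}{\pi} \arctan(x) \implies x = \tan\left(\pi \left(u - \frac{1}{2}\right)\right)$$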

So also in this case we have a way to sample a Cauchy distribution by just manipulating a uniform distribution.
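As a sketch, the Cauchy sampler is one line once the CDF $F(x) = \frac{1}{2} + \frac{1}{\pi}\arctan(x)$ is inverted. The optional `location` and `scale` arguments and the function name `sample_cauchy` are assumptions added for illustration.

```python
import math
import random


def sample_cauchy(rng, location=0.0, scale=1.0):
    """Draw one Cauchy sample via x = location + scale * tan(pi * (u - 1/2))."""
    u = rng.random()
    return location + scale * math.tan(math.pi * (u - 0.5))
```

The Cauchy has no mean, so a sample average is not a useful check; the median, or the fraction of draws in $[-1, 1]$ (which should be about one half for the standard case), is.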

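What about the Gaussian?

The Gaussian CDF $\Phi$ has no closed form, so neither it nor its inverse can be written explicitly; both have to be evaluated numerically. R's default normal generator uses the probability inverse transform, with a numerical approximation of $\Phi^{-1}$.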
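A sketch of inversion sampling for the Gaussian in Python, where the inverse CDF is evaluated numerically by the standard library's `statistics.NormalDist.inv_cdf`. This only illustrates the idea; it is not a description of how R implements it, and the function name `sample_standard_normal` is hypothetical.

```python
import random
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)


def sample_standard_normal(rng):
    """Draw one N(0, 1) sample as Phi^{-1}(u), with Phi^{-1} evaluated numerically."""
    u = rng.random()
    while u == 0.0:          # inv_cdf requires 0 < u < 1
        u = rng.random()
    return std_normal.inv_cdf(u)
```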

Empirical C.D.F

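The definition is simple: if we have an i.i.d. sample $X_1, \dots, X_n \sim F$, then

$$\hat{F}_n(x) = \frac{1}{n} \sum_{i=1}^{n} \mathbb{1}\{X_i \leq x\}$$

It's just the fraction of the sampled data that is lower than or equal to a certain value! It is unbiased: on average the empirical CDF gives the right CDF, $E[\hat{F}_n(x)] = F(x)$.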
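A minimal sketch of the empirical CDF, i.e. the fraction of sample points that are less than or equal to $x$. The helper name `make_ecdf` and the toy sample are assumptions for illustration.

```python
import bisect


def make_ecdf(sample):
    """Return the empirical CDF x -> (number of sample points <= x) / n."""
    ordered = sorted(sample)
    n = len(ordered)

    def ecdf(x):
        # bisect_right counts how many of the sorted values are <= x
        return bisect.bisect_right(ordered, x) / n

    return ecdf


ecdf = make_ecdf([3.0, 1.0, 2.0, 2.0])   # toy sample
```

Sorting once and using binary search makes each evaluation cost $O(\log n)$ instead of a full pass over the data.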

Other famous distributions

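Chi-squared distribution (Math Stack Exchange)

If we have $U_1, \dots, U_k$ i.i.d. $\text{Unif}(0, 1)$ then it is true that

$$-2 \sum_{i=1}^{k} \log(U_i) \sim \chi^2_{2k}$$

Where $2k$ is the degrees of freedom. The bad thing is that this is always even, so the construction is only useful for even degrees of freedom (the chi-squared is itself a Gamma distribution).

$$-\beta \sum_{i=1}^{a} \log(U_i) \sim \text{Gamma}(a, \beta)$$

Which is a Gamma distribution with integer shape $a$ and scale $\beta$; the exponential is the special case $a = 1$.

$$\frac{\sum_{i=1}^{a} \log(U_i)}{\sum_{i=1}^{a+b} \log(U_i)} \sim \text{Beta}(a, b)$$

Which is a Beta distribution with integer parameters $a$ and $b$. It is used in Beta regression, for example for gut microbiome data, because it models proportions in $(0, 1)$.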
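These constructions all build on $-\log(U) \sim \text{Exp}(1)$, and can be sketched as below. The parametrisation is an assumption: chi-squared with $2k$ degrees of freedom, Gamma with integer shape `a` and scale `beta`, Beta with integer parameters `a` and `b`. All function names are hypothetical.

```python
import math
import random


def neg_log_uniform(rng):
    """Return -log(U) for U ~ Unif(0, 1), i.e. an Exp(1) draw."""
    return -math.log(1.0 - rng.random())   # 1 - U is in (0, 1], so the log is defined


def sample_chi2_even(k, rng):
    """Chi-squared with 2k degrees of freedom: 2 times a sum of k Exp(1) draws."""
    return 2.0 * sum(neg_log_uniform(rng) for _ in range(k))


def sample_gamma_int(a, beta, rng):
    """Gamma with integer shape a and scale beta: beta times a sum of a Exp(1) draws."""
    return beta * sum(neg_log_uniform(rng) for _ in range(a))


def sample_beta_int(a, b, rng):
    """Beta(a, b) with integer a, b: sum of a draws over the sum of all a + b draws."""
    top = sum(neg_log_uniform(rng) for _ in range(a))
    rest = sum(neg_log_uniform(rng) for _ in range(b))
    return top / (top + rest)
```

A quick check is to compare sample means with the theoretical ones: $2k$ for the chi-squared, $a\beta$ for the Gamma and $a / (a + b)$ for the Beta.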

Transformation of random variables (no exam)

The strange thing with these statisticians is that it is not clear what the purpose of these transformations is; at first sight they do not look very useful. I mean, I don't even know what problem we are trying to solve!

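Suppose $X \sim f_X$, and $Y = g(X)$ with $g$ an invertible, differentiable function, and suppose the support of $f_X$ is $\mathcal{X}$. Then the support of the density of $Y$ is $g(\mathcal{X})$, and for $y$ in it we have that

$$f_Y(y) = f_X(g^{-1}(y)) \left| \frac{d}{dy} g^{-1}(y) \right|$$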

I don't know exactly what this is useful for, and I don't even know why it works. Later: in the end the proof of this is very simple (apparently I am not great at proofs and tend to learn them by heart). Anyway, you can find it at https://en.wikipedia.org/wiki/Random_variable

Uniform random variable

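This allows creating uniforms that are not always on the classic interval $(0, 1)$.

Then the transformation would be, for $U \sim \text{Unif}(0, 1)$ and $a < b$:

$$Y = g(U) = a + (b - a) U, \qquad g^{-1}(y) = \frac{y - a}{b - a}$$

Then

$$f_Y(y) = f_U\left(\frac{y - a}{b - a}\right) \frac{1}{b - a} = \frac{1}{b - a}$$

and we have that $y$ is in $(a, b)$, since we need $\frac{y - a}{b - a} \in (0, 1)$ in order to have a density different from 0.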
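In code the transformation is a single affine map. This sketch assumes $Y = a + (b - a) U$ with $a < b$; the function name `sample_uniform` is hypothetical.

```python
import random


def sample_uniform(a, b, rng):
    """Map U ~ Unif(0, 1) to Y = a + (b - a) * U, which is Unif(a, b)."""
    return a + (b - a) * rng.random()
```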

Standard Gaussian

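Let $Z \sim \mathcal{N}(0, 1)$, then $X = \mu + \sigma Z \sim \mathcal{N}(\mu, \sigma^2)$ for $\sigma > 0$: the Gaussian family is closed under linear transformations! Here $g^{-1}(x) = \frac{x - \mu}{\sigma}$ and $\left| \frac{d}{dx} g^{-1}(x) \right| = \frac{1}{\sigma}$. Then:

$$f_X(x) = f_Z\left(\frac{x - \mu}{\sigma}\right) \frac{1}{\sigma} = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$

Where the first factor is the density of a standard Gaussian evaluated at $\frac{x - \mu}{\sigma}$.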

Box-Muller algorithm

(Bishop & Bishop 2024), section 14.1.2, presents this algorithm.

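This algorithm is used to produce Gaussian random variables: we have $U_1$ and $U_2$ that are independent standard uniforms, then we can take

$$Z_1 = \sqrt{-2 \log U_1} \cos(2\pi U_2), \qquad Z_2 = \sqrt{-2 \log U_1} \sin(2\pi U_2)$$

and $Z_1, Z_2$ are independent $\mathcal{N}(0, 1)$. This holds without any approximation: it is exact, because in polar coordinates a pair of independent standard Gaussians has squared radius $R^2 \sim \chi^2_2$, which is the distribution of $-2 \log U_1$, and an independent angle uniform on $(0, 2\pi)$, which is the distribution of $2\pi U_2$.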
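A minimal sketch of the basic (trigonometric) Box-Muller transform, which turns two independent uniforms into two independent standard Gaussians through a radius $\sqrt{-2 \log U_1}$ and an angle $2\pi U_2$. The function name `box_muller` is an assumption.

```python
import math
import random


def box_muller(rng):
    """Return two independent N(0, 1) draws built from two independent uniforms."""
    u1 = 1.0 - rng.random()      # in (0, 1], so log(u1) is defined
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)
```

Each call consumes two uniforms and returns two Gaussians, so no draw is wasted.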

Multivariate cases

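We can use the re-parametrization trick to generate $X \sim \mathcal{N}(\mu, \sigma^2)$ from $Z \sim \mathcal{N}(0, 1)$ by just multiplying by the standard deviation and adding the mean, $X = \mu + \sigma Z$. We are asking if we can do the same for the multivariate case: how can we get $X \sim \mathcal{N}(\mu, \Sigma)$? Let's take a bivariate Gaussian for example. We need a matrix that plays the role of $\sigma$, i.e. a square root of $\Sigma$, and the Cholesky decomposition $\Sigma = L L^\top$, with $L$ lower triangular, provides one. Then $X = \mu + L Z$, with $Z$ a vector of independent standard Gaussians, has covariance $L L^\top = \Sigma$. This is always possible because $\Sigma$ is a symmetric positive definite matrix. See Algebra lineare numerica for the Cholesky decomposition.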
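A sketch of the multivariate case in plain Python: compute a lower-triangular $L$ with $L L^\top = \Sigma$ and return $\mu + L z$ for a vector $z$ of independent standard Gaussians. The function names `cholesky` and `sample_mvn` are assumptions, and the code assumes $\Sigma$ is symmetric positive definite.

```python
import math
import random


def cholesky(sigma):
    """Lower-triangular L with L L^T = sigma (sigma symmetric positive definite)."""
    n = len(sigma)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(sigma[i][i] - s)     # diagonal entry
            else:
                L[i][j] = (sigma[i][j] - s) / L[j][j]    # entry below the diagonal
    return L


def sample_mvn(mu, L, rng):
    """Draw X = mu + L z, with z a vector of independent N(0, 1) draws."""
    z = [rng.gauss(0.0, 1.0) for _ in mu]
    return [m + sum(L[i][k] * z[k] for k in range(i + 1)) for i, m in enumerate(mu)]
```

The factor `L` depends only on $\Sigma$, so it can be computed once and reused for every draw.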

The discrete case

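We want to compute the cumulative probabilities of the values $x_0 < x_1 < x_2 < \dots$ in this way:

$$p_0 = P(X \leq x_0), \quad p_1 = P(X \leq x_1), \quad p_2 = P(X \leq x_2), \quad \dots$$

This is the same as knowing a discrete CDF. Then we take $U \sim \text{Unif}(0, 1)$ and set

$$X = x_k \quad \text{if} \quad p_{k-1} < U \leq p_k \qquad (\text{with } p_{-1} = 0)$$

and this is how we sample the variable for discrete distributions.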
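A sketch of inverse transform sampling for a discrete distribution: build the cumulative probabilities, draw a uniform, and return the first value whose cumulative probability reaches it. The function name `sample_discrete` and the example values are hypothetical.

```python
import bisect
import itertools
import random


def sample_discrete(values, probs, rng):
    """Return values[k] for the smallest k with U <= probs[0] + ... + probs[k]."""
    cumulative = list(itertools.accumulate(probs))   # the discrete CDF
    u = rng.random()
    k = bisect.bisect_left(cumulative, u)            # first index with cumulative[k] >= u
    return values[min(k, len(values) - 1)]           # guard against rounding in the last sum
```

For repeated sampling from the same distribution the cumulative list can be computed once outside the function; it is rebuilt here only to keep the sketch self-contained.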

References

[1] Bishop & Bishop “Deep Learning: Foundations and Concepts” Springer International Publishing 2024