Inverse Transform

Huang, Xuanqiang Angelo

December 1, 2024 · Reading Time: 12 minutes · By Xuanqiang Angelo Huang

Table of Contents

Generalized inverse
Main Proof
- Example of application
Famous function CDF
Transformation of random variables (no exam)
- Uniform random variable
- Standard Gaussian
Box-Muller algorithm
Multivariate cases
The discrete case

NOTE: this is an old set of notes of quite bad quality; it should be completely rewritten.

How can we transform a uniform random variable into a random variable with a given distribution? Recall that the CDF and the density are related by

F (x) = \int_{- \infty}^{x} f (t) d t

Sometimes the density is not defined while the cumulative distribution function is, which is why we often start from the definition of the latter.

Suppose we have $X \sim F_{X} (x)$, where $F_{X}$ is a cumulative distribution function, i.e. the usual CDF that we saw a lot in other courses:

F_{X} (x) = P (X \leq x) = \int_{- \infty}^{x} f_{X} (z) d z

Generalized inverse

This is known as the universality of the uniform: any random variable with an invertible CDF can be written as a transformation of a uniform random variable.

Definition

We want an inverse of the CDF so that a uniform draw can be mapped back to a value of $X$, but a CDF is not always strictly increasing, so an ordinary inverse may not exist. The transformed quantity is still a random variable. The generalized inverse is defined as:

F_{X}^{- 1} (u) = \inf \{x : F_{X} (x) \geq u\}

This makes sense because $F_{X}$ is non-decreasing and right-continuous, so the infimum is attained for every $u \in (0, 1)$; the generalized inverse itself is non-decreasing but need not be continuous. It is useful because, given a cumulative distribution function, it lets us recreate the distribution of the original random variable quite easily.
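
As a small illustration of the definition, here is a minimal sketch of the generalized inverse for a CDF tabulated on a finite grid. The function name `generalized_inverse` and the three-point distribution are hypothetical, chosen only for the example.

```python
import bisect

def generalized_inverse(xs, cdf_values, u):
    """Return inf{x in xs : F(x) >= u} for a CDF tabulated on sorted points xs."""
    # cdf_values is non-decreasing, so bisect_left finds the first index
    # whose CDF value is >= u.
    i = bisect.bisect_left(cdf_values, u)
    if i == len(xs):
        raise ValueError("u exceeds the largest tabulated CDF value")
    return xs[i]

# Hypothetical discrete distribution on {0, 1, 2} with probabilities 0.2, 0.5, 0.3
grid = [0, 1, 2]
grid_cdf = [0.2, 0.7, 1.0]

print(generalized_inverse(grid, grid_cdf, 0.2))   # 0, because F(0) = 0.2 >= 0.2
print(generalized_inverse(grid, grid_cdf, 0.21))  # 1, because F(0) < 0.21 <= F(1)
print(generalized_inverse(grid, grid_cdf, 0.95))  # 2
```

Note that the CDF here has jumps and flat parts, so it has no ordinary inverse, yet the infimum is still well defined.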

Probability Inverse transform

Goal: sample from $X$ when it is difficult to sample from its distribution directly (for example, when the density is hard to compute or awkward to implement).

Requirements:

I know the functional form of $F_{X} (x)$.
$X$ is continuous.

For the exam you will be given the cumulative distribution and you need to use this theorem to sample.

Theorem

If $U \sim Unif (0, 1)$, then applying the inverse CDF of $X$ to $U$ gives a random variable with the same distribution as $X$:

U \sim Unif (0, 1) ⟹ P (Y \leq x) = F_{X} (x)

where $Y = F_{X}^{- 1} (U)$. The right-hand side is just the definition of the cumulative distribution function of $X$. The result relies on the following facts:

1. If $F_{X}$ is continuous and strictly increasing, then it is invertible.
2. If $U \sim Unif (0, 1)$, then its c.d.f. is easy:

F_{U} (u) = \begin{cases} 0, & u \leq 0 \\ u, & 0 \leq u \leq 1 \\ 1, & u \geq 1 \end{cases}

The uniform is the only distribution with this property, namely that its CDF is the identity on $[0, 1]$.

3. $F_{X} (X) \sim Unif (0, 1)$. This is also called the probability integral transform (PIT). Proof 2 below proves it (it's cool).

Proof 1

P (F_{X}^{- 1} (U) \leq x) = P (F_{X} [F_{X}^{- 1} (U)] \leq F_{X} (x)) = P (U \leq F_{X} (x)) = F_{U} [F_{X} (x)] = F_{X} (x)

In the first step we used that the cumulative distribution function is monotonically increasing, in the second that $F_{X} [F_{X}^{- 1} (u)] = u$ for an invertible $F_{X}$, and at the end that the CDF of the uniform is the identity on $[0, 1]$.

Proof 2

This proves point 3. Let $U = F_{X} (X)$ and $u = F_{X} (x)$; then:

F_{U} (u) = P (U \leq u) = P (F_{X} (X) \leq F_{X} (x)) = P (F_{X}^{- 1} (F_{X} (X)) \leq F_{X}^{- 1} (F_{X} (x))) = P (X \leq x) = F_{X} (x) = u

So $F_{U}$ is the identity on $[0, 1]$, meaning $U$ is uniform. Once you have this, you can just apply the inverse:

F_{X}^{- 1} (U) = X

So now you can sample.
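
The claim $F_{X} (X) \sim Unif (0, 1)$ can be checked numerically. The sketch below is only a sanity check, not a proof: it assumes $X \sim Exp (1)$ as a test case, draws $X$ with the standard library's `random.expovariate`, and compares the mean and variance of $F_{X} (X)$ with those of a uniform ($1/2$ and $1/12$). The helper `exp_cdf` and the seed are choices made for this example.

```python
import math
import random

pit_rng = random.Random(0)  # fixed seed, only for reproducibility

def exp_cdf(x, lam=1.0):
    """CDF of Exp(lam): F(x) = 1 - exp(-lam * x) for x >= 0."""
    return 1.0 - math.exp(-lam * x)

# Draw X with the standard library sampler, then push it through its own CDF.
pit_xs = [pit_rng.expovariate(1.0) for _ in range(100_000)]
pit_us = [exp_cdf(x) for x in pit_xs]

# A Unif(0, 1) variable has mean 1/2 and variance 1/12 (about 0.0833).
pit_mean = sum(pit_us) / len(pit_us)
pit_var = sum((u - pit_mean) ** 2 for u in pit_us) / len(pit_us)
print(pit_mean, pit_var)
```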

Main Proof

This proof is my favorite. Recall the change of variables formula for densities:

p (y) = p (x) \left| \frac{d x}{d y} \right|

Suppose now $p (x)$ is the uniform over $[0, 1]$, meaning it has density $p (x) = 1$ for all $x \in [0, 1]$ and 0 elsewhere, and $x = h (\hat{y}) = \int_{- \infty}^{\hat{y}} p (y) d y$. Plugging everything in, we see that:

p (y) = p (x) \left| \frac{d x}{d y} \right| = 1 \cdot \left| \frac{d x}{d y} \right| = \left| \frac{\partial h (y)}{\partial y} \right| = | p (y) | = p (y)

This implies that if we apply the inverse of $h$, we can sample $y$ starting from the uniform variable $x$. This needs two main ingredients:

The function $h$ must be invertible
The function $h$ must exist and be well behaved, i.e. must be continuous

Example of application

X \sim Exp (λ = 1)

Given $F_{X} (x) = 1 - e^{- x}$, with $x \in \mathbb{R}^{+} \cup \{0\}$, find a way to generate the random variable $X$.

Solution example:

$F_{X} (X) = U$ , then set up the equation and solve

1 - e^{- X} = U ⟹ e^{- X} = 1 - U ⟹ X = - \log (1 - U)

Notice that $1 - U$ and $U$ are both uniform on $(0, 1)$, so $X = - \log (U)$ has the same distribution, and in this way you get the correct random variable. Now we can sample from the uniform and get the correct result! This is the direct transformation method. With this process we proved that $- \log U \sim Exp (λ = 1)$.
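
The derivation translates directly into a sampler. This is a minimal sketch using only the standard library; the function name `sample_exponential` and the seed are hypothetical. It uses the form $X = - \log (1 - U)$ because `random()` returns values in $[0, 1)$, so $1 - U$ is never zero.

```python
import math
import random

def sample_exponential(n, rng):
    """Sample Exp(1) via X = -log(1 - U), with U ~ Unif(0, 1)."""
    # rng.random() lies in [0, 1), so 1 - U lies in (0, 1] and the log is finite.
    return [-math.log(1.0 - rng.random()) for _ in range(n)]

exp_rng = random.Random(0)  # fixed seed, only for reproducibility
exp_samples = sample_exponential(100_000, exp_rng)
exp_mean = sum(exp_samples) / len(exp_samples)  # Exp(1) has mean 1
print(exp_mean)
```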

Famous function CDF

Logistic function

This is a function very similar to that used in Logistic Regression. We have $X \sim Logistic (μ; β)$ , then

F_{X} (x) = \frac{1}{1 + e ^{- (x - μ) / β}}

Let's make the derivation:

U = F_{X} (X) = \frac{1}{1 + e^{- (X - μ) / β}} ⟹ 1 + e^{- (X - μ) / β} = \frac{1}{U} ⟹ - (X - μ) / β = \log (\frac{1}{U} - 1)

⟹ X = - β \log (\frac{1}{U} - 1) + μ

Where $β$ and $μ$ are parameters of the distribution.
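
A minimal sketch of the resulting sampler, assuming the formula $X = μ - β \log (1 / U - 1)$ derived above. The function name and the parameter values $μ = 2$, $β = 0.5$ are made up for the example. The Logistic distribution has mean $μ$, which gives a quick sanity check.

```python
import math
import random

def sample_logistic(mu, beta, n, rng):
    """Sample Logistic(mu, beta) via X = mu - beta * log(1/U - 1)."""
    out = []
    for _ in range(n):
        u = rng.random()
        while u == 0.0:  # redraw the (extremely unlikely) value 0 to avoid 1/0
            u = rng.random()
        out.append(mu - beta * math.log(1.0 / u - 1.0))
    return out

logistic_rng = random.Random(0)  # fixed seed, only for reproducibility
logistic_samples = sample_logistic(2.0, 0.5, 100_000, logistic_rng)
logistic_mean = sum(logistic_samples) / len(logistic_samples)
print(logistic_mean)  # should be close to mu = 2
```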

Cauchy distribution

How to simulate data from a Cauchy distribution?

F_{X} (x) = \frac{1}{2} + \frac{1}{π} arctan ((x - μ) / σ)

The derivation

U = F_{X} (X) = \frac{1}{2} + \frac{1}{π} arctan ((X - μ) / σ) ⟹ tan (π (U - \frac{1}{2})) = (X - μ) / σ ⟹ X = σ \cdot tan (π (U - \frac{1}{2})) + μ

So also in this case we have a way to sample a Cauchy distribution by just manipulating a uniform distribution.
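
A minimal sketch of this sampler, with hypothetical parameter values $μ = 1$, $σ = 2$. Since the Cauchy distribution has no mean, the check uses quantiles instead: the median should be close to $μ$ and the quartiles close to $μ \pm σ$.

```python
import math
import random

def sample_cauchy(mu, sigma, n, rng):
    """Sample Cauchy(mu, sigma) via X = mu + sigma * tan(pi * (U - 1/2))."""
    return [mu + sigma * math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]

cauchy_rng = random.Random(0)  # fixed seed, only for reproducibility
cauchy_sorted = sorted(sample_cauchy(1.0, 2.0, 100_000, cauchy_rng))

# Empirical quartiles read off the sorted sample.
cauchy_q1 = cauchy_sorted[25_000]
cauchy_median = cauchy_sorted[50_000]
cauchy_q3 = cauchy_sorted[75_000]
print(cauchy_q1, cauchy_median, cauchy_q3)  # roughly mu - sigma, mu, mu + sigma
```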

What about the Gaussian?

Φ (x) = \int_{- \infty}^{x} f_{X} (z) d z = \int_{- \infty}^{x} \frac{1}{\sqrt{2 π σ^{2}}} \exp \{- \frac{(z - μ)^{2}}{2 σ^{2}}\} d z

This integral has no closed form, so neither $Φ$ nor its inverse can be written explicitly. R still uses the probability inverse transform for it, with the inverse evaluated numerically.

Empirical C.D.F

The definition is simple: if we have an i.i.d. sample $(X_{1}, \dots, X_{n})$, then

\hat{F}_{X, n} (x) = \frac{1}{n} \sum_{i = 1}^{n} 1 (X_{i} \leq x)

It's just the fraction of the sampled data that is at most a certain value! It is unbiased: on average, the empirical CDF gives the right CDF.
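
The empirical CDF is easy to compute once the sample is sorted. This is a minimal sketch; the function name `empirical_cdf` and the four-point sample are hypothetical.

```python
import bisect

def empirical_cdf(sample):
    """Return the function x -> (1/n) * #{i : X_i <= x}."""
    ordered = sorted(sample)
    n = len(ordered)

    def F_hat(x):
        # bisect_right counts how many sorted values are <= x.
        return bisect.bisect_right(ordered, x) / n

    return F_hat

F_hat = empirical_cdf([3.0, 1.0, 2.0, 2.0])
print(F_hat(0.5), F_hat(2.0), F_hat(10.0))  # prints: 0.0 0.75 1.0
```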

Other famous distributions

Chi-squared distribution (see Math Stack Exchange)

If we have i.i.d. $X_{i} \sim Exp (1)$, then it is true that

Y = 2 \sum_{i = 1}^{n} X_{i} \sim χ_{2 n}^{2}

where $2 n$ is the number of degrees of freedom. The bad thing is that the degrees of freedom are always even, so this is only useful in that case. (The chi-squared is a special case of the Gamma distribution.)

Y = β \sum_{i = 1}^{a} X_{i} \sim G (a, β) : a \in N^{*}

which is a Gamma distribution with integer shape $a$ and scale $β$; the exponential is the special case $a = 1$.

Y = \frac{\sum_{i = 1}^{a} X_{i}}{\sum_{i = 1}^{a + b} X_{i}} \sim Be (a, b)

which is a Beta distribution, used for example in Beta regression for gut microbiome data.
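
These three constructions can be sketched on top of an $Exp (1)$ sampler obtained by inverse transform. The function names and parameter values below are hypothetical; the check compares sample means with the known means $2 n$ for $χ_{2 n}^{2}$, $a β$ for a Gamma with shape $a$ and scale $β$, and $a / (a + b)$ for $Be (a, b)$.

```python
import math
import random

def exp1(rng):
    """One Exp(1) draw by inverse transform."""
    return -math.log(1.0 - rng.random())

def chi2_even(n, rng):
    """Chi-squared with 2n degrees of freedom: 2 * (sum of n Exp(1) draws)."""
    return 2.0 * sum(exp1(rng) for _ in range(n))

def gamma_int(a, beta, rng):
    """Gamma with integer shape a and scale beta: beta * (sum of a Exp(1) draws)."""
    return beta * sum(exp1(rng) for _ in range(a))

def beta_int(a, b, rng):
    """Beta(a, b) with integer a, b: ratio of partial to total sum of a + b draws."""
    draws = [exp1(rng) for _ in range(a + b)]
    return sum(draws[:a]) / sum(draws)

sum_rng = random.Random(0)  # fixed seed, only for reproducibility
m = 20_000
chi2_mean = sum(chi2_even(3, sum_rng) for _ in range(m)) / m        # expect about 6
gamma_mean = sum(gamma_int(2, 1.5, sum_rng) for _ in range(m)) / m  # expect about 3
beta_mean = sum(beta_int(2, 3, sum_rng) for _ in range(m)) / m      # expect about 0.4
print(chi2_mean, gamma_mean, beta_mean)
```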

Transformation of random variables (no exam)

The strange thing about these statisticians is that it is not clear to what end we are doing these transformations, which do not look very useful at first sight. I mean, I don't even know which problem we are trying to solve!

Suppose $X \sim f_{X} (x)$ and $Z = g (X)$ with $g$ an invertible, differentiable function, and suppose $Z \sim f_{Z} (z)$. Then, for $z$ in the image of the support of $X$ under $g$, we have that

f_{Z} (z) = \frac{d F_{Z} (z)}{d z} = f_{X} [g^{- 1} (z)] \cdot \left| \frac{d g^{- 1} (z)}{d z} \right|

I don't know exactly what this could be useful for, nor why it works. Later: in the end the proof of this is very simple (apparently I'm a bit hopeless with proofs and tend to memorize them). Anyway, you can find it at https://en.wikipedia.org/wiki/Random_variable

Uniform random variable

This lets us create uniforms that do not live on the classic interval $[0, 1]$.

U \sim Unif (0, 1) ⟹ f_{U} (u) = \begin{cases} 0, & u \notin [0, 1] \\ 1, & u \in [0, 1] \end{cases}

Then the transformation is:

X = (b - a) U + a, b > a

Then

U = \frac{X - a}{b - a} = g^{- 1} (X) ⟹ \frac{d g^{- 1} (x)}{d x} = \frac{1}{b - a} ⟹ f_{X} (x) = f_{U} [\frac{x - a}{b - a}] \cdot \frac{1}{b - a}

Since $f_{U}$ is nonzero only on $[0, 1]$, we need $a \leq x \leq b$ for the density to be different from 0, and there it equals $\frac{1}{b - a}$.
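
A minimal sketch of both the transformation $X = (b - a) U + a$ and the resulting density, with hypothetical values $a = 2$, $b = 6$. The sample mean should be close to $(a + b) / 2$.

```python
import random

def sample_uniform(a, b, n, rng):
    """Sample Unif(a, b) via X = (b - a) * U + a, with U ~ Unif(0, 1)."""
    if not b > a:
        raise ValueError("need b > a")
    return [(b - a) * rng.random() + a for _ in range(n)]

def uniform_density(x, a, b):
    """Density from the change of variables: f_U((x - a) / (b - a)) / (b - a)."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

unif_rng = random.Random(0)  # fixed seed, only for reproducibility
unif_samples = sample_uniform(2.0, 6.0, 100_000, unif_rng)
unif_mean = sum(unif_samples) / len(unif_samples)  # expect about (2 + 6) / 2 = 4
print(unif_mean, uniform_density(3.0, 2.0, 6.0), uniform_density(7.0, 2.0, 6.0))
```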

Standard Gaussian

If $X \sim N (μ, σ^{2})$, then $Z = (X - μ) / σ$ is again Gaussian, since the family is closed under linear transformations! Here $X = g^{- 1} (Z) = σ \cdot Z + μ$. Then:

f_{Z} (z) = f_{X} (σ z + μ) σ = \frac{1}{\sqrt{2 π}} \exp \{- \frac{z^{2}}{2}\}

where $f_{X}$ is the density of the Gaussian $N (μ, σ^{2})$.
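
The identity $f_{Z} (z) = f_{X} (σ z + μ) σ$ can be verified numerically at a point. The values $μ = 1.5$, $σ = 2$, $z = 0.7$ and the helper `normal_pdf` are arbitrary choices for this check.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / math.sqrt(2.0 * math.pi * sigma ** 2)

mu, sigma, z = 1.5, 2.0, 0.7
lhs = normal_pdf(sigma * z + mu, mu, sigma) * sigma  # f_X(sigma * z + mu) * sigma
rhs = normal_pdf(z, 0.0, 1.0)                        # standard Gaussian density at z
print(lhs, rhs)  # the two values agree up to floating-point error
```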

Box-Muller algorithm

(Bishop & Bishop 2024), section 14.1.2, presents this algorithm.

This algorithm is used to produce Gaussian random variables: given $U_{1}$ and $U_{2}$ independent standard uniforms, we can take

X_{1} = \sqrt{- 2 \log (U_{1})} \cos (2 π U_{2}), X_{2} = \sqrt{- 2 \log (U_{1})} \sin (2 π U_{2})

Then $X_{1}$ and $X_{2}$ are independent standard Gaussians. There is no approximation involved, the result is exact.
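
A minimal sketch of Box-Muller in its standard form, where $\sqrt{- 2 \log (U_{1})}$ is the radius and $2 π U_{2}$ the angle. The function name `box_muller` and the seed are choices made for the example. The check compares the sample mean and variance with 0 and 1.

```python
import math
import random

def box_muller(rng):
    """Return two independent N(0, 1) draws from two independent uniforms."""
    u1 = 1.0 - rng.random()  # in (0, 1], so log(u1) is finite
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)

bm_rng = random.Random(0)  # fixed seed, only for reproducibility
bm_samples = []
for _ in range(50_000):
    x1, x2 = box_muller(bm_rng)
    bm_samples.extend((x1, x2))

bm_mean = sum(bm_samples) / len(bm_samples)
bm_var = sum((x - bm_mean) ** 2 for x in bm_samples) / len(bm_samples)
print(bm_mean, bm_var)  # should be close to 0 and 1
```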

Multivariate cases

We can use the re-parametrization trick to generate $N (μ, σ^{2})$ from a standard Gaussian: just multiply by the standard deviation $σ$ and add the mean $μ$. Can we do the same in the multivariate case, that is, how can we get $N_{p} (μ, Σ)$? Take a bivariate Gaussian for example. We need the analogue of $σ$, a "square root" of the matrix $Σ$, and the Cholesky factor $L$ with $Σ = L L^{T}$ plays this role: if $Z \sim N_{p} (0, I)$ then $μ + L Z \sim N_{p} (μ, Σ)$. The decomposition is always possible because $Σ$ is a symmetric positive definite matrix. See Algebra lineare numerica for Cholesky.
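
For the bivariate case the Cholesky factor can be written by hand. The sketch below assumes a hypothetical mean $μ = (1, -2)$ and covariance $Σ = [[4, 1.2], [1.2, 1]]$, and uses `random.gauss` for the standard Gaussian draws. It computes $X = μ + L Z$ and checks the sample covariance.

```python
import math
import random

def cholesky_2x2(s11, s12, s22):
    """Lower-triangular L = [[l11, 0], [l21, l22]] with L L^T = [[s11, s12], [s12, s22]]."""
    l11 = math.sqrt(s11)
    l21 = s12 / l11
    l22 = math.sqrt(s22 - l21 * l21)
    return l11, l21, l22

def sample_bivariate_normal(mu, cov, n, rng):
    """Sample N_2(mu, cov) as mu + L Z, with Z a pair of independent N(0, 1) draws."""
    l11, l21, l22 = cholesky_2x2(cov[0][0], cov[0][1], cov[1][1])
    out = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        out.append((mu[0] + l11 * z1, mu[1] + l21 * z1 + l22 * z2))
    return out

mvn_rng = random.Random(0)  # fixed seed, only for reproducibility
mvn_mu = (1.0, -2.0)
mvn_cov = ((4.0, 1.2), (1.2, 1.0))
pairs = sample_bivariate_normal(mvn_mu, mvn_cov, 100_000, mvn_rng)

mean1 = sum(p[0] for p in pairs) / len(pairs)
mean2 = sum(p[1] for p in pairs) / len(pairs)
sample_cov12 = sum((p[0] - mean1) * (p[1] - mean2) for p in pairs) / len(pairs)
print(mean1, mean2, sample_cov12)  # should be close to 1, -2 and 1.2
```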

The discrete case

We want to compute the cumulative probabilities in this way:

p_{0} = P_{θ} (X \leq 0), p_{1} = P_{θ} (X \leq 1), and so on.

This is the same as knowing a discrete CDF. Then we take (with $p_{- 1} = 0$)

X = k if p_{k - 1} < U \leq p_{k}

and in this way we sample the variable $X$ for discrete distributions.
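
A minimal sketch of the discrete inverse transform: build the cumulative probabilities $p_{k}$ and pick the interval into which the uniform draw falls. The function name `sample_discrete` and the three-point distribution are hypothetical; the interval boundaries have probability zero, so it does not matter on which side they are included.

```python
import bisect
import itertools
import random

def sample_discrete(probs, n, rng):
    """Sample values k = 0, 1, ... with P(X = k) = probs[k]."""
    cum = list(itertools.accumulate(probs))  # cum[k] = p_k = P(X <= k)
    out = []
    for _ in range(n):
        u = rng.random()
        # First index k with cum[k] > u, i.e. cum[k - 1] <= u < cum[k].
        k = bisect.bisect_right(cum, u)
        # Guard against round-off making the last cumulative sum slightly below 1.
        out.append(min(k, len(probs) - 1))
    return out

disc_rng = random.Random(0)  # fixed seed, only for reproducibility
disc_probs = [0.2, 0.5, 0.3]
disc_samples = sample_discrete(disc_probs, 100_000, disc_rng)
disc_freqs = [disc_samples.count(k) / len(disc_samples) for k in range(len(disc_probs))]
print(disc_freqs)  # should be close to [0.2, 0.5, 0.3]
```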

References

[1] Bishop & Bishop “Deep Learning: Foundations and Concepts” Springer International Publishing 2024
