Multi-variable derivative

For those who are not used to matrix derivatives (like me), it can be useful to see why

$$ \frac{ \partial u^{T}Su }{ \partial u } = 2Su $$

First, note that when you differentiate a scalar with respect to a vector (or a matrix), the result has the same dimensions as that vector (or matrix). The notation just means differentiating with respect to every component independently and then stacking the results, so it is better understood as

$$ \frac{ \partial u^{T}Su }{ \partial u } = \begin{bmatrix} \frac{ \partial u^{T}Su }{ \partial u_{1} } \\ \vdots \\ \frac{ \partial u^{T}Su }{ \partial u_{M} } \end{bmatrix} = \begin{bmatrix} 2(Su)_{1} \\ \vdots \\ 2(Su)_{M} \end{bmatrix} = 2Su $$

So we can prove each partial derivative independently; it's just a lot of manual work! We see that $u^{T}Su$ is a quadratic form, studied in [Massimi minimi multi-variabile#Forme quadratiche](/notes/massimi-minimi-multi-variabile#forme-quadratiche), so we are just computing this:

$$ u^{T}Su = \sum_{i, j = 1}^{M} u_{i}u_{j}S_{ij} \implies \frac{ \partial u^{T}Su }{ \partial u_{1} } = 2u_{1}S_{11} + \sum_{j \neq 1}(u_{j}S_{1j} + u_{j}S_{j1}) = 2\left( u_{1}S_{11} + \sum_{j \neq 1}u_{j}S_{1j} \right) = 2(Su)_{1} $$

The second-to-last equality holds because $S$ is symmetric ($S_{1j} = S_{j1}$), and the result is exactly twice the first component of the vector $Su$. The same argument applies to every other component $u_{k}$.
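As a sanity check on the identity $\partial (u^{T}Su) / \partial u = 2Su$, here is a minimal sketch that compares the analytic gradient with a central finite-difference approximation. The matrix `S`, the vector `u`, and the helper names `quad_form` and `finite_diff_gradient` are made up for illustration; only the standard library is used.

```python
def quad_form(S, u):
    """Return the scalar u^T S u for a square matrix S (list of rows)."""
    n = len(u)
    return sum(u[i] * S[i][j] * u[j] for i in range(n) for j in range(n))


def finite_diff_gradient(f, u, eps=1e-6):
    """Approximate the gradient of f at u with central differences."""
    grad = []
    for k in range(len(u)):
        plus = list(u)
        minus = list(u)
        plus[k] += eps
        minus[k] -= eps
        grad.append((f(plus) - f(minus)) / (2 * eps))
    return grad


# Hypothetical example data: S is symmetric, as the derivation requires.
S = [[2.0, 1.0, 0.5],
     [1.0, 3.0, -1.0],
     [0.5, -1.0, 4.0]]
u = [1.0, -2.0, 0.5]

# Analytic gradient 2 S u, computed row by row.
analytic = [2 * sum(S[i][j] * u[j] for j in range(len(u)))
            for i in range(len(u))]
numeric = finite_diff_gradient(lambda x: quad_form(S, x), u)

max_error = max(abs(a - b) for a, b in zip(analytic, numeric))
print(analytic, numeric, max_error)
```

If `S` were not symmetric, the two sums in the derivation would no longer merge, and the gradient would be $(S + S^{T})u$ instead.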

Determinant

See also Wikipedia.

$$ \frac{\partial \det(\mathbf{A}(t))}{\partial t} = \det(\mathbf{A}(t)) \cdot \text{tr}\left( \mathbf{A}(t)^{-1} \cdot \frac{ \partial \mathbf{A}(t) }{ \partial t } \right) $$

In the special case where we differentiate with respect to the entries of $\mathbf{A}$ itself, we have:

$$ \frac{\partial \det(\mathbf{A})}{\partial \mathbf{A}} = \det(\mathbf{A}) \cdot (\mathbf{A}^{-1})^\top $$
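To see that the transpose really matters, the following sketch checks the formula numerically on a small non-symmetric matrix. It is restricted to the $2 \times 2$ case so that the determinant and inverse have closed forms; the matrix `A` and the helpers `det2` and `inv2` are hypothetical names introduced only for this example.

```python
def det2(A):
    """Determinant of a 2x2 matrix given as a list of rows."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


def inv2(A):
    """Inverse of a 2x2 matrix via the closed-form adjugate formula."""
    d = det2(A)
    return [[A[1][1] / d, -A[0][1] / d],
            [-A[1][0] / d, A[0][0] / d]]


A = [[3.0, 1.0],
     [2.0, 4.0]]  # hypothetical invertible, non-symmetric example

# Analytic gradient: det(A) * (A^{-1})^T, so entry (i, j) uses A_inv[j][i].
A_inv = inv2(A)
analytic = [[det2(A) * A_inv[j][i] for j in range(2)] for i in range(2)]

# Numerical gradient: perturb one entry A_ij at a time (central differences).
eps = 1e-6
numeric = [[0.0, 0.0], [0.0, 0.0]]
for i in range(2):
    for j in range(2):
        plus = [row[:] for row in A]
        minus = [row[:] for row in A]
        plus[i][j] += eps
        minus[i][j] -= eps
        numeric[i][j] = (det2(plus) - det2(minus)) / (2 * eps)

max_error = max(abs(analytic[i][j] - numeric[i][j])
                for i in range(2) for j in range(2))
print(analytic, numeric, max_error)
```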

Proof:

$$ \begin{align} \frac{\partial \det(\mathbf{A})}{\partial \mathbf{A}} &= \det(\mathbf{A}) \cdot \frac{\partial \ln \det(\mathbf{A})}{\partial \mathbf{A}} \\ &= \det(\mathbf{A}) \cdot \frac{\partial \text{tr} (\ln \mathbf{A})}{\partial \mathbf{A}} \\ &= \det(\mathbf{A}) \cdot (\mathbf{A}^{-1})^\top \end{align} $$

I don’t think I have understood this thing quite well…
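One way to make the result more concrete, without going through the matrix logarithm, is the cofactor (Laplace) expansion. The sketch below assumes $\mathbf{A}$ is invertible and introduces $C_{ij}$ for the $(i, j)$ cofactor of $\mathbf{A}$ and $\mathbf{C}$ for the matrix of cofactors, symbols that are not used elsewhere in these notes. Expanding along row $i$, none of the cofactors $C_{i1}, \dots, C_{iM}$ depends on the entries of row $i$, so the determinant is linear in $A_{ij}$:

$$ \begin{align} \det(\mathbf{A}) &= \sum_{j} A_{ij} C_{ij} \\ \implies \frac{\partial \det(\mathbf{A})}{\partial A_{ij}} &= C_{ij} \\ \mathbf{A}^{-1} = \frac{1}{\det(\mathbf{A})} \mathbf{C}^\top &\implies \mathbf{C} = \det(\mathbf{A}) \cdot (\mathbf{A}^{-1})^\top \end{align} $$

Collecting the partial derivatives $C_{ij}$ into a matrix gives $\mathbf{C}$, which is exactly the right-hand side of the formula.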

Matrix Inverse

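$$ \frac{\partial \mathbf{A}^{-1}}{\partial \mathbf{A}} = -\mathbf{A}^{-1} \otimes \mathbf{A}^{-1}. $$

Here the right-hand side is a four-index object, to be read componentwise as $\frac{\partial (\mathbf{A}^{-1})_{ij}}{\partial A_{kl}} = -(\mathbf{A}^{-1})_{ik} (\mathbf{A}^{-1})_{lj}$. To see this, differentiate $\mathbf{A} \mathbf{A}^{-1} = \mathbf{I}$ with respect to a single entry $A_{kl}$, writing $\mathbf{E}_{kl} = \frac{\partial \mathbf{A}}{\partial A_{kl}}$ for the matrix with a 1 in position $(k, l)$ and zeros elsewhere:

$$ \begin{align} \frac{\partial}{\partial A_{kl}} (\mathbf{A} \mathbf{A}^{-1}) &= \frac{\partial \mathbf{I}}{\partial A_{kl}} = 0 \\ &\implies \frac{\partial \mathbf{A}}{\partial A_{kl}} \cdot \mathbf{A}^{-1} + \mathbf{A} \cdot \frac{\partial \mathbf{A}^{-1}}{\partial A_{kl}} = 0 \\ &\implies \mathbf{E}_{kl} \cdot \mathbf{A}^{-1} + \mathbf{A} \cdot \frac{\partial \mathbf{A}^{-1}}{\partial A_{kl}} = 0 \\ &\implies \frac{\partial \mathbf{A}^{-1}}{\partial A_{kl}} = -\mathbf{A}^{-1} \cdot \mathbf{E}_{kl} \cdot \mathbf{A}^{-1}. \end{align} $$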
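A quick numerical check can make the index bookkeeping less abstract. The sketch below assumes the entry-wise reading $\partial (\mathbf{A}^{-1})_{ij} / \partial A_{kl} = -(\mathbf{A}^{-1})_{ik} (\mathbf{A}^{-1})_{lj}$ of the formula (an assumption about the index convention behind $\otimes$) and compares it with central finite differences on a hypothetical $2 \times 2$ matrix; `inv2` is a helper written only for this example.

```python
def inv2(A):
    """Inverse of a 2x2 matrix via the closed-form adjugate formula."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]


A = [[3.0, 1.0],
     [2.0, 4.0]]  # hypothetical invertible example
A_inv = inv2(A)
eps = 1e-6
max_error = 0.0

for k in range(2):
    for l in range(2):
        # Central difference of the whole inverse w.r.t. the entry A_kl.
        plus = [row[:] for row in A]
        minus = [row[:] for row in A]
        plus[k][l] += eps
        minus[k][l] -= eps
        inv_plus, inv_minus = inv2(plus), inv2(minus)
        for i in range(2):
            for j in range(2):
                numeric = (inv_plus[i][j] - inv_minus[i][j]) / (2 * eps)
                # Entry-wise formula: -(A^{-1})_{ik} (A^{-1})_{lj}.
                analytic = -A_inv[i][k] * A_inv[l][j]
                max_error = max(max_error, abs(numeric - analytic))

print(max_error)
```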

Quadratic Form

$$ \frac{\partial}{\partial \mathbf{A}} \left( \mathbf{v}^\top \mathbf{A} \mathbf{v} \right) = \mathbf{v} \mathbf{v}^\top. $$

This should be easy: it is quite similar to the case above, where we differentiated with respect to $u$, except that here the expression is linear in the entries of $\mathbf{A}$.
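Written out in components, the computation is a one-liner. The index notation below is introduced only for this sketch, and it treats all entries of $\mathbf{A}$ as independent variables (no symmetry of $\mathbf{A}$ is assumed):

$$ \mathbf{v}^\top \mathbf{A} \mathbf{v} = \sum_{i, j} v_{i} A_{ij} v_{j} \implies \frac{\partial}{\partial A_{ij}} \left( \mathbf{v}^\top \mathbf{A} \mathbf{v} \right) = v_{i} v_{j} = (\mathbf{v} \mathbf{v}^\top)_{ij} $$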

Quadratic Inverse

$$ \frac{\partial}{\partial \mathbf{A}} \left( \mathbf{v}^\top \mathbf{A}^{-1} \mathbf{v} \right) = -\mathbf{A}^{-1} \mathbf{v} \mathbf{v}^\top \mathbf{A}^{-1}. $$

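You can interpret this as a function composition: the map $\mathbf{A} \mapsto \mathbf{A}^{-1}$ followed by the quadratic form $\mathbf{B} \mapsto \mathbf{v}^\top \mathbf{B} \mathbf{v}$, so the result follows from the chain rule applied to the two previous derivatives. As written, the formula holds for symmetric $\mathbf{A}$; for a general invertible $\mathbf{A}$ the inverses on the right-hand side are transposed, giving $-\mathbf{A}^{-\top} \mathbf{v} \mathbf{v}^\top \mathbf{A}^{-\top}$.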
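The formula can also be checked numerically. The sketch below is a minimal check under the assumption that $\mathbf{A}$ is symmetric and $2 \times 2$ (so the inverse has a closed form). The data `A` and `v` and the helpers `inv2` and `quad_inv` are hypothetical. Each entry of $\mathbf{A}$ is perturbed independently, even though the evaluation point is symmetric.

```python
def inv2(A):
    """Inverse of a 2x2 matrix via the closed-form adjugate formula."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]


def quad_inv(A, v):
    """Return the scalar v^T A^{-1} v."""
    B = inv2(A)
    return sum(v[i] * B[i][j] * v[j] for i in range(2) for j in range(2))


A = [[3.0, 1.0],
     [1.0, 4.0]]  # hypothetical symmetric, invertible example
v = [1.0, -2.0]

# Analytic gradient: with w = A^{-1} v and A symmetric,
# -A^{-1} v v^T A^{-1} is the outer product -w w^T.
A_inv = inv2(A)
w = [sum(A_inv[i][j] * v[j] for j in range(2)) for i in range(2)]
analytic = [[-w[i] * w[j] for j in range(2)] for i in range(2)]

# Numerical gradient: central differences, one entry of A at a time.
eps = 1e-6
numeric = [[0.0, 0.0], [0.0, 0.0]]
for i in range(2):
    for j in range(2):
        plus = [row[:] for row in A]
        minus = [row[:] for row in A]
        plus[i][j] += eps
        minus[i][j] -= eps
        numeric[i][j] = (quad_inv(plus, v) - quad_inv(minus, v)) / (2 * eps)

max_error = max(abs(analytic[i][j] - numeric[i][j])
                for i in range(2) for j in range(2))
print(analytic, numeric, max_error)
```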