The Spectral Theorem: A Theorem-Chain Construction
Strategic Overview
The spectral theorem is not one theorem but a family. Here we build the finite-dimensional versions — both for self-adjoint operators on real inner product spaces and normal operators on complex inner product spaces — and end by sketching the path to the infinite-dimensional (bounded / unbounded / compact) generalizations.
Two genuinely different proof routes are possible. The thing to remember is that the complex case is cleaner because the Fundamental Theorem of Algebra hands you an eigenvalue for free; the real case requires an extra trick (complexification or a quadratic-factor argument) to extract an eigenvalue from a self-adjoint operator.
| Route | Key Step | Works Over |
|---|---|---|
| Complex / Schur | Triangularize, then show normality forces diagonal | $\mathbb{C}$ |
| Real / Induction | Find one real eigenvalue, peel off, induct on $W^\perp$ | $\mathbb{R}$ |
Throughout, $V$ is a finite-dimensional inner product space over $\mathbb{F} \in \{\mathbb{R}, \mathbb{C}\}$ with $\dim V = n$, and $T \in \mathcal{L}(V)$ is a linear operator.
Foundation: Inner Product Spaces
See Inner product spaces.
Inner Product & ONB
An inner product is a map $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{F}$ that is linear in one slot, conjugate-symmetric, and positive definite.
Every finite-dim inner product space admits an orthonormal basis via Gram-Schmidt. Key consequence: in an ONB $\{e_i\}$, coordinates are inner products: $x = \sum_i \langle x, e_i\rangle e_i$, and Parseval gives $\|x\|^2 = \sum_i |\langle x, e_i\rangle|^2$.
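A quick numerical sanity check (a sketch in NumPy; `np.linalg.qr` plays the role of Gram-Schmidt here): coordinates in an ONB really are inner products, and Parseval holds.

```python
import numpy as np

rng = np.random.default_rng(0)

# QR factorization = Gram-Schmidt: columns of Q form an ONB of R^4
A = rng.standard_normal((4, 4))
Q, _ = np.linalg.qr(A)

x = rng.standard_normal(4)

# Coordinates in the ONB are inner products <x, e_i>
coords = Q.T @ x
x_rebuilt = Q @ coords          # x = sum_i <x, e_i> e_i

# Parseval: ||x||^2 = sum_i |<x, e_i>|^2
norm_sq = x @ x
coord_sq = coords @ coords
```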
Riesz Representation (Finite-Dim)
Also see Counterfactual Invariance.
For every linear functional $\varphi : V \to \mathbb{F}$, there is a unique $u \in V$ such that $\varphi(x) = \langle x, u\rangle$ for all $x$.
This is the engine that gives us the adjoint. The proof is trivial in finite dimensions (just take $u = \sum_i \overline{\varphi(e_i)} e_i$), but morally the same theorem in Hilbert spaces requires completeness.
The Adjoint Operator
Definition of the Adjoint
$$\langle Tx, y\rangle = \langle x, T^* y\rangle \quad \forall x, y \in V.$$
Existence is by Riesz applied to $x \mapsto \langle Tx, y\rangle$ for each fixed $y$. In matrix terms (w.r.t. an ONB): $[T^*] = \overline{[T]^\top} = [T]^H$ (conjugate transpose).
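In coordinates the adjoint is literally the conjugate transpose. A small NumPy check (a sketch; the inner product is taken linear in the first slot, matching the Riesz formula above):

```python
import numpy as np

rng = np.random.default_rng(1)

def inner(a, b):
    # <a, b>: linear in the first slot, conjugate-linear in the second
    return np.vdot(b, a)          # vdot conjugates its FIRST argument

n = 5
T = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
T_adj = T.conj().T                # [T*] = conjugate transpose in an ONB

x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y = rng.standard_normal(n) + 1j * rng.standard_normal(n)

lhs = inner(T @ x, y)             # <Tx, y>
rhs = inner(x, T_adj @ y)         # <x, T* y>
```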
Algebraic Properties of $T^*$
- $(S + T)^* = S^* + T^*$
- $(\lambda T)^* = \overline{\lambda}\, T^*$ (conjugate-linear!)
- $(ST)^* = T^* S^*$
- $T^{**} = T$
- $I^* = I$
[!note] The “Four Fundamental Subspaces” $\ker T = (\operatorname{im} T^*)^\perp$ and $\ker T^* = (\operatorname{im} T)^\perp$. This is the algebraic skeleton of the Fredholm alternative, and the reason the SVD of a map $T : V \to W$ partitions its domain and codomain into four fundamental pieces.
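The identity $\ker T = (\operatorname{im} T^*)^\perp$ can be exhibited via the SVD (a sketch: the rows of $V^\top$ split into a row-space basis and a null-space basis):

```python
import numpy as np

rng = np.random.default_rng(2)

# Rank-2 operator on R^4, so ker T is 2-dimensional
T = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 4))

U, s, Vt = np.linalg.svd(T)
r = int(np.sum(s > 1e-10))        # numerical rank

ker_T = Vt[r:].T                  # ONB of ker T
im_Tadj = Vt[:r].T                # ONB of im T^* (the row space)

gram = ker_T.T @ im_Tadj          # every pairing should vanish
```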
Classes of Operators
The Operator Zoo
| Class | Definition | Spectrum Lives On | Matrix Form (some ONB) |
|---|---|---|---|
| Self-adjoint (Hermitian) | $T = T^*$ | $\mathbb{R}$ | Real diagonal |
| Skew-adjoint | $T = -T^*$ | $i\mathbb{R}$ | Purely imaginary diag. |
| Unitary ($\mathbb{C}$) / Orthogonal ($\mathbb{R}$) | $T^*T = I$ | Unit circle $S^1$ | Diagonal entries of modulus 1 |
| Normal | $T^*T = TT^*$ | $\mathbb{C}$ | Diagonal (in $\mathbb{C}$) |
| Positive (semi)definite | $T = T^*$ and $\langle Tx,x\rangle \geq 0$ | $\mathbb{R}_{\geq 0}$ | Non-negative diagonal |
[!note] Normality is the maximal class for which the spectral theorem holds. Self-adjoint, skew-adjoint, and unitary are all special cases of normal. The “if and only if” version of the complex spectral theorem characterizes exactly the normal operators.
Why Self-Adjoint Eigenvalues Are Real
Suppose $T = T^*$ and $Tv = \lambda v$ with $v \neq 0$. Then
$$\lambda \langle v, v\rangle = \langle Tv, v\rangle = \langle v, T^*v\rangle = \langle v, Tv\rangle = \overline{\lambda} \langle v, v\rangle.$$Hence $\lambda = \overline\lambda \in \mathbb{R}$. A closely related computation shows that normal $T$ satisfies $\|Tv\| = \|T^*v\|$ for all $v$: expand $\langle Tv,Tv\rangle = \langle T^*Tv, v\rangle = \langle TT^*v, v\rangle = \langle T^*v, T^*v\rangle$.
Eigenvectors of Normal Operators Belong to Conjugate Eigenvalues of $T^*$
Lemma. If $T$ is normal and $Tv = \lambda v$, then $T^* v = \overline\lambda v$.
Proof. $T - \lambda I$ is normal, so $\|(T-\lambda I)v\| = \|(T-\lambda I)^* v\| = \|(T^* - \overline\lambda I)v\|$. The LHS is zero, hence so is the RHS, i.e. $T^* v = \overline\lambda v$.
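The lemma is easy to check numerically on a manufactured normal matrix (a sketch; normality is built in by conjugating a complex diagonal matrix by a unitary):

```python
import numpy as np

rng = np.random.default_rng(3)

# Manufacture a normal matrix: unitary conjugate of a complex diagonal
n = 4
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Q, _ = np.linalg.qr(M)
d = rng.standard_normal(n) + 1j * rng.standard_normal(n)
N = Q @ np.diag(d) @ Q.conj().T

# (v, lam) is an eigenpair of N by construction
v, lam = Q[:, 0], d[0]

lhs = N.conj().T @ v              # N* v
rhs = np.conj(lam) * v            # conj(lam) v
```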
This is the lynchpin for the normal spectral theorem — it lets you upgrade triangular form to diagonal form. Without it Schur’s theorem is the end of the road.
Orthogonality of Eigenspaces
Lemma. If $T$ is normal, eigenvectors with distinct eigenvalues are orthogonal.
Proof. Let $Tv = \lambda v$ and $Tw = \mu w$ with $\lambda \neq \mu$; by the previous lemma, $T^* w = \overline\mu w$. Then
$$\lambda \langle v, w\rangle = \langle Tv, w\rangle = \langle v, T^*w\rangle = \langle v, \overline\mu w\rangle = \mu \langle v, w\rangle.$$Since $\lambda \neq \mu$, $\langle v,w\rangle = 0$.
Existence of Eigenvalues
Complex Case: Eigenvalues Always Exist
Over $\mathbb{C}$, the characteristic polynomial $\det(T - \lambda I)$ has a root by Fundamental Theorem of Algebra. So every operator on a complex finite-dim space has at least one eigenvalue. This is the only place characteristic polynomials are essential — and we even sidestep them in Axler-style treatments.
Real Case: Self-Adjoint Forces an Eigenvalue
Over $\mathbb{R}$ a generic operator may have no eigenvalues (e.g., 90° rotation in $\mathbb{R}^2$). But self-adjointness restores existence. Two standard proofs:
Proof 1 (Complexification). Embed $V \hookrightarrow V_\mathbb{C} := V \otimes_\mathbb{R} \mathbb{C}$ and extend $T$ $\mathbb{C}$-linearly. The extension is still self-adjoint, so it has a complex eigenvalue which, by the reality argument above, is actually real; taking the real or imaginary part of a complex eigenvector then yields a real eigenvector in $V$.
Proof 2 (Quadratic-factor argument). Factor the characteristic (or minimal) polynomial of $T$ over $\mathbb{R}$ into linear factors and irreducible quadratics $x^2 + bx + c$ with $b^2 < 4c$. No irreducible quadratic factor can annihilate a nonzero vector: by Cauchy-Schwarz,
$$\langle (T^2 + bT + cI) v, v\rangle = \|Tv\|^2 + b\langle Tv,v\rangle + c\|v\|^2 \geq \|Tv\|^2 - 2\sqrt{c}\|Tv\|\|v\| + c\|v\|^2 = \big(\|Tv\| - \sqrt{c}\|v\|\big)^2 \geq 0,$$and the bound is strict for $v \neq 0$ since $|b| < 2\sqrt c$. Hence some linear factor of the polynomial must vanish at $T$, giving a real eigenvalue.
[!tip] Variational route to existence When $T$ is self-adjoint, the largest eigenvalue can be obtained as $\lambda_1 = \max_{\|v\|=1} \langle Tv, v\rangle$ (and the largest $|\lambda|$ as $\max_{\|v\|=1} \|Tv\|$). Compactness of the unit sphere plus continuity of the Rayleigh quotient does the work. This generalizes to compact self-adjoint operators on Hilbert spaces — see Courant-Fischer Min-Max.
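A numerical illustration of the variational characterization (a sketch): random unit directions never beat the top eigenvalue of the Rayleigh quotient, and the top eigenvector attains it.

```python
import numpy as np

rng = np.random.default_rng(4)

A = rng.standard_normal((6, 6))
A = (A + A.T) / 2                       # self-adjoint

evals, evecs = np.linalg.eigh(A)        # ascending eigenvalue order
lam_max, v_max = evals[-1], evecs[:, -1]

def rayleigh(v):
    return v @ A @ v / (v @ v)

# Rayleigh quotient over many random directions
samples = rng.standard_normal((1000, 6))
max_sample = max(rayleigh(v) for v in samples)
```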
The Invariant-Subspace Reduction
Invariant Subspaces
$W \subseteq V$ is $T$-invariant if $T(W) \subseteq W$. The restriction $T|_W \in \mathcal{L}(W)$ inherits operator structure.
The Key Reduction Lemma
Lemma. If $T$ is normal (or self-adjoint) and $W$ is $T$-invariant, then $W^\perp$ is also $T$-invariant. Moreover $T|_W$ is normal (self-adjoint).
Sketch (self-adjoint case). Take $u \in W^\perp$, $w \in W$. Then $\langle Tu, w\rangle = \langle u, Tw\rangle = 0$ since $Tw \in W$ and $u \perp W$.
This lemma is the inductive engine of the real spectral theorem: peel off one eigenvector at a time, restrict to its orthogonal complement, induct on dimension.
Schur’s Triangularization Theorem
We have seen something similar in past courses, but I don’t remember the name.
Schur Decomposition
Theorem (Schur). For every $T \in \mathcal{L}(V)$ on a complex finite-dim inner product space, there exists an ONB $\{e_1, \ldots, e_n\}$ such that $[T]$ is upper triangular.
Equivalently: $T = U \Lambda U^*$ where $U$ is unitary and $\Lambda$ is upper triangular. The diagonal of $\Lambda$ is the spectrum.
Proof (induction on $n$).
- By FTA, $T$ has an eigenvalue $\lambda$ with eigenvector $e_1$ (normalize).
- Let $W = \operatorname{span}(e_1)$. Then $W^\perp$ has dim $n-1$.
- $W^\perp$ is not in general $T$-invariant, but: define $\pi : V \to W^\perp$ the orthogonal projection and consider $S := \pi \circ T|_{W^\perp}$. By induction, $S$ is upper triangular in some ONB $\{e_2, \ldots, e_n\}$ of $W^\perp$.
- Stack $\{e_1, e_2, \ldots, e_n\}$; since $Te_j = \langle Te_j, e_1\rangle e_1 + S e_j \in \operatorname{span}(e_1, \ldots, e_j)$, the matrix of $T$ is upper triangular.
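For diagonalizable $A$ the Schur basis can even be produced directly: if $A = P D P^{-1}$ and $P = QR$, then $Q^* A Q = R D R^{-1}$ is a product of upper triangulars, hence upper triangular. A NumPy sketch (assumes $A$ diagonalizable, which holds almost surely for a random matrix):

```python
import numpy as np

rng = np.random.default_rng(5)

n = 5
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# Orthonormalize the eigenvector matrix to get a Schur basis
_, P = np.linalg.eig(A)
Q, _ = np.linalg.qr(P)
Tri = Q.conj().T @ A @ Q          # = R D R^{-1}, upper triangular

strictly_lower = np.tril(Tri, k=-1)
```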
[!note] Schur ≠ Spectral Schur works for every operator, but only gives triangular form. The leap to diagonal form needs normality.
The Spectral Theorems
Complex Spectral Theorem (Normal Operators)
Theorem. $T \in \mathcal{L}(V)$ on a complex finite-dim inner product space is normal iff there exists an ONB of $V$ consisting of eigenvectors of $T$.
Proof.
- ($\Leftarrow$) Trivial: in an eigen-ONB, $[T]$ is diagonal, and diagonal matrices commute with their conjugate transpose.
- ($\Rightarrow$) By Schur, write $[T] = U^* \Lambda U$ with $\Lambda$ upper triangular in an ONB $\{e_i\}$. Normality is preserved by unitary conjugation, so $\Lambda$ itself is normal. Claim: a normal upper-triangular matrix is diagonal. Compute $(\Lambda \Lambda^*)_{11} = \sum_j |\lambda_{1j}|^2$ but $(\Lambda^* \Lambda)_{11} = |\lambda_{11}|^2$. Equating forces $\lambda_{1j} = 0$ for $j > 1$; induct on the remaining $(n-1) \times (n-1)$ block.
Real Spectral Theorem (Self-Adjoint Operators)
Theorem. $T \in \mathcal{L}(V)$ on a real finite-dim inner product space is self-adjoint iff there exists an ONB of $V$ consisting of eigenvectors of $T$ (all eigenvalues real).
Proof (the “induction route”).
- By the existence theorem, $T$ has a real eigenvalue $\lambda_1$ with unit eigenvector $e_1$.
- $W = \operatorname{span}(e_1)$ is $T$-invariant. By the reduction lemma, $W^\perp$ is $T$-invariant and $T|_{W^\perp}$ is self-adjoint.
- Induct on dimension.
The converse is again trivial: a real diagonal matrix is symmetric.
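In NumPy the real spectral theorem is exactly `np.linalg.eigh`: real eigenvalues and an orthogonal matrix of eigenvectors. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(6)

A = rng.standard_normal((5, 5))
A = A + A.T                       # real symmetric = self-adjoint over R

evals, Q = np.linalg.eigh(A)      # real eigenvalues, eigenvector ONB

D = Q.T @ A @ Q                   # diagonal in the eigen-ONB
A_rebuilt = Q @ np.diag(evals) @ Q.T
```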
Unified Statement (Functional Form)
For normal $T$ over $\mathbb{C}$ (or self-adjoint $T$ over $\mathbb{R}$),
$$T = \sum_{i=1}^k \lambda_i P_i,$$where $\lambda_1, \ldots, \lambda_k$ are the distinct eigenvalues, and $P_i$ is the orthogonal projection onto $\ker(T - \lambda_i I)$. The projections satisfy:
- $P_i^2 = P_i = P_i^*$ (orthogonal projection)
- $P_i P_j = 0$ for $i \neq j$
- $\sum_i P_i = I$ (resolution of identity)
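The projection form is easy to exhibit concretely (a sketch with a deliberately repeated eigenvalue, so one $P_i$ has rank 2):

```python
import numpy as np

rng = np.random.default_rng(7)

# Self-adjoint T with spectrum {1, 2}; eigenvalue 1 has multiplicity 2
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
T = Q @ np.diag([1.0, 1.0, 2.0]) @ Q.T

# Orthogonal projections onto the two eigenspaces
P1 = Q[:, :2] @ Q[:, :2].T       # onto ker(T - 1*I)
P2 = Q[:, 2:] @ Q[:, 2:].T       # onto ker(T - 2*I)

T_rebuilt = 1.0 * P1 + 2.0 * P2  # resolution T = sum_i lambda_i P_i
```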
This is the form that generalizes: in infinite dimensions the finite sum becomes a projection-valued measure $E$ on the spectrum, and $T = \int \lambda\, dE(\lambda)$.
Consequences and Extensions
Functional Calculus
$$f(T) := \sum_i f(\lambda_i) P_i.$$This is consistent with the polynomial calculus and gives meaning to $\sqrt T$ (for $T \succeq 0$), $e^T$, $\log T$, etc.
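A minimal functional-calculus sketch (the helper `fun_calc` is an illustrative name, not a library function); the identity is added to keep $T$ strictly positive so `np.sqrt` is numerically safe:

```python
import numpy as np

rng = np.random.default_rng(8)

# Positive definite T: Gram matrix plus identity (eigenvalues > 0)
B = rng.standard_normal((4, 4))
T = B.T @ B + np.eye(4)

def fun_calc(f, T):
    """f(T) := sum_i f(lambda_i) P_i, computed in an eigen-ONB."""
    evals, Q = np.linalg.eigh(T)
    return Q @ np.diag(f(evals)) @ Q.T

sqrt_T = fun_calc(np.sqrt, T)     # the positive square root of T
exp_T = fun_calc(np.exp, T)       # matrix exponential of a symmetric T
```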
[!tip] Used everywhere in ML & physics Matrix exponential $e^{tA}$ for solving linear ODEs; $\sqrt{\Sigma}$ for whitening in PCA; $\log\det$ regularizers in Bayesian deep learning. All are silently invoking the spectral theorem.
Polar Decomposition
Every $T \in \mathcal{L}(V)$ factors as $T = U|T|$ where $|T| := \sqrt{T^*T}$ is positive (defined via functional calculus on the self-adjoint $T^*T$) and $U$ is a partial isometry (unitary if $T$ is invertible). This is the operator analog of $z = e^{i\theta}|z|$ for $z \in \mathbb{C}$.
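Polar decomposition drops out of the SVD: from $T = U \Sigma V^\top$ one gets $T = (U V^\top)(V \Sigma V^\top)$, an orthogonal factor times $|T|$. A sketch:

```python
import numpy as np

rng = np.random.default_rng(9)

T = rng.standard_normal((4, 4))

# T = U S V^T  gives  T = (U V^T)(V S V^T) = W |T|
U, s, Vt = np.linalg.svd(T)
W = U @ Vt                        # orthogonal (unitary) factor
absT = Vt.T @ np.diag(s) @ Vt     # |T| = sqrt(T^T T), positive semidefinite
```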
Singular Value Decomposition
$$T = U \Sigma V^*,$$where $\Sigma$ has the singular values $\sigma_i = \sqrt{\lambda_i(T^*T)}$ on the diagonal. SVD is the spectral theorem in drag — it is what you get when you have no operator to diagonalize but you can diagonalize $T^*T$.
Courant–Fischer Min-Max Theorem
For self-adjoint $T$ with eigenvalues ordered $\lambda_1 \geq \cdots \geq \lambda_n$,
$$\lambda_k = \max_{\substack{S \subseteq V \\ \dim S = k}} \min_{\substack{v \in S \\ \|v\|=1}} \langle Tv, v\rangle.$$Equivalently, eigenvalues are critical values of the Rayleigh quotient $R(v) = \langle Tv,v\rangle / \|v\|^2$ on the unit sphere. This is the variational characterization that survives into infinite dimensions and underlies PCA, graph Laplacian methods, and quantum mechanics ground-state computations.
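A Monte-Carlo sanity check of min-max (a sketch): every random $k$-dimensional subspace has min-Rayleigh at most $\lambda_k$, and the top-$k$ eigenspace attains the maximum.

```python
import numpy as np

rng = np.random.default_rng(10)

n, k = 6, 3
A = rng.standard_normal((n, n))
A = (A + A.T) / 2
lam = np.linalg.eigh(A)[0][::-1]        # lambda_1 >= ... >= lambda_n

def min_rayleigh(S):
    """min of <Av, v> over unit vectors in the column span of S."""
    B, _ = np.linalg.qr(S)              # ONB of the subspace
    return np.linalg.eigh(B.T @ A @ B)[0][0]

# Every random k-dim subspace scores at most lambda_k ...
vals = [min_rayleigh(rng.standard_normal((n, k))) for _ in range(200)]

# ... and the top-k eigenspace is the maximizer
top_k = np.linalg.eigh(A)[1][:, ::-1][:, :k]
```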
Simultaneous Diagonalization
Theorem. A family $\{T_\alpha\}$ of normal operators is simultaneously diagonalizable (common eigen-ONB) iff the operators pairwise commute.
This is the linear-algebraic shadow of why commuting observables in quantum mechanics share eigenbases. Non-commutativity ⟹ uncertainty.
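A sketch with two commuting symmetric matrices built on the same eigen-ONB; because $A$'s eigenvalues are distinct, its eigenbasis is forced (up to signs), so it must also diagonalize $B$:

```python
import numpy as np

rng = np.random.default_rng(11)

# Two self-adjoint operators sharing an eigen-ONB, hence commuting
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
A = Q @ np.diag([1.0, 2.0, 3.0, 4.0]) @ Q.T   # distinct eigenvalues
B = Q @ np.diag([5.0, 5.0, 7.0, 9.0]) @ Q.T

commute = np.allclose(A @ B, B @ A)

# A's eigen-ONB must also diagonalize B
_, V = np.linalg.eigh(A)
B_diag = V.T @ B @ V
off_diag = B_diag - np.diag(np.diag(B_diag))
```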
Infinite-Dimensional Generalizations (Sketch)
Compact Self-Adjoint Operators
Hilbert-Schmidt / Spectral Theorem for Compact Self-Adjoint. A compact self-adjoint operator $T$ on a (separable) Hilbert space $H$ has an ONB of eigenvectors with real eigenvalues, and the nonzero eigenvalues form a sequence $\lambda_n \to 0$.
The finite-dim proof essentially transports: the unit sphere is no longer compact, but compactness of $T$ rescues the variational step (the sup of the Rayleigh quotient is still attained, since $T$ maps weakly convergent sequences to norm-convergent ones); peel off the top eigenvector, induct on the closed invariant subspace. The accumulation of eigenvalues at $0$ is forced by compactness.
Bounded Self-Adjoint Operators
Spectral Theorem (Bounded SA). For bounded self-adjoint $T$ on Hilbert space $H$, there exists a unique projection-valued measure $E$ on $\sigma(T) \subseteq \mathbb{R}$ such that $T = \int_{\sigma(T)} \lambda\, dE(\lambda)$.
Eigenvalues may not exist anymore (e.g., multiplication by $x$ on $L^2([0,1])$ has empty point spectrum). The continuous spectrum is absorbed into the integral via the PVM.
Unbounded Self-Adjoint Operators
Now domains matter. The right hypothesis is self-adjointness of the unbounded operator ($T = T^*$, domains included); in practice one verifies essential self-adjointness, i.e. that the closure of $T$ is self-adjoint. The spectral theorem still holds — this is what justifies quantum-mechanical observables like position and momentum even though they’re unbounded.
Connections & Synthesis
Why Normal Is The Right Class
The pattern is: diagonalizability in an ONB = commutation of the operator with its adjoint. The structural reason is that an eigen-ONB simultaneously diagonalizes $T$ and $T^*$, and any two matrices diagonal in the same basis commute. The converse — that commutation forces simultaneous diagonalization — is exactly the spectral theorem.
| Question | Required Property of $T$ |
|---|---|
| Has an eigenvalue? | Char poly has root (always over $\mathbb{C}$) |
| Triangularizable (unitarily)? | Always over $\mathbb{C}$ (Schur) |
| Diagonalizable (any basis)? | Minimal poly has distinct roots |
| Diagonalizable in ONB? | Normal (over $\mathbb{C}$) / self-adjoint (over $\mathbb{R}$) |
Connection to AI / ML Practice
The spectral theorem silently underpins:
- PCA: eigendecomposition of $\Sigma = \mathbb{E}[xx^\top]$, which is positive semi-definite.
- Graph Laplacian methods (spectral clustering, Cheeger inequality): $L = D - A$ is self-adjoint, and the eigenvectors belonging to its smallest eigenvalues encode community structure.
- Markov chain mixing: the second-largest eigenvalue of a reversible transition matrix governs mixing time. (Reversibility = self-adjointness w.r.t. the stationary measure inner product.)
- Hessian analysis in optimization / loss landscapes: eigenvalues of $\nabla^2 \mathcal{L}$ describe local curvature; negative eigenvalues = saddle directions.
- Kernel methods: a positive-definite kernel $k(x,y)$ corresponds to a self-adjoint operator on $L^2(\mu)$ — Mercer’s theorem is the compact self-adjoint spectral theorem in disguise.
[!note] Relation to multi-agent dynamics In a multi-agent setting where agents update beliefs via a doubly-stochastic communication matrix $W$, the second eigenvalue $\lambda_2(W)$ controls consensus speed. If $W$ is symmetric (undirected, equal-weight communication), the spectral theorem gives an orthogonal eigen-decomposition and clean convergence bounds. This connects to the broader pattern: whenever a multi-agent interaction has an underlying symmetric / reversible / detailed-balance structure, spectral methods give tight results; without symmetry one falls back to weaker tools (Jordan form, Perron-Frobenius, pseudo-spectra).
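A consensus sketch on a 5-agent cycle (lazy random-walk weights, a standard but here assumed choice of $W$): the disagreement with the average contracts at rate $\lambda_2(W)$ per gossip round.

```python
import numpy as np

rng = np.random.default_rng(12)

# Symmetric doubly stochastic W: lazy random walk on a 5-agent cycle
n = 5
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 0.5
    W[i, (i + 1) % n] = 0.25
    W[i, (i - 1) % n] = 0.25

# Spectral gap: second-largest |eigenvalue| controls consensus speed
lam = np.sort(np.abs(np.linalg.eigvalsh(W)))[::-1]
gap = lam[1]

x0 = rng.standard_normal(n)
mean = np.full(n, x0.mean())

x = x0.copy()
for _ in range(50):
    x = W @ x                     # one round of gossip averaging

err = np.linalg.norm(x - mean)
bound = gap**50 * np.linalg.norm(x0 - mean)   # spectral convergence bound
```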
What’s Missing From This Picture
- Non-normal operators: enormous theory of Jordan Canonical Form, generalized eigenvectors, pseudo-spectra. Most operators in practice (transition matrices for non-reversible chains, transformer attention matrices) are non-normal, and the spectral theorem doesn’t apply — one needs different machinery.
- Operator algebras: the spectral theorem is the first chapter of $C^*$-algebra theory; the Gelfand-Naimark theorem generalizes spectral resolution to commutative $C^*$-algebras, identifying them with $C(X)$ for compact Hausdorff $X$.
- Spectral theory of differential operators: Sturm-Liouville theory, Weyl’s theorem, the connection to PDEs and quantum mechanics.