The Spectral Theorem: A Theorem-Chain Construction
Strategic Overview
The spectral theorem is not one theorem but a family. Here we build the finite-dimensional versions — both for self-adjoint operators on real inner product spaces and normal operators on complex inner product spaces — and end by sketching the path to the infinite-dimensional (bounded / unbounded / compact) generalizations.
Two genuinely different proof routes are possible. The thing to remember is that the complex case is cleaner because the Fundamental Theorem of Algebra hands you an eigenvalue for free; the real case requires an extra trick (complexification or a quadratic-factor argument) to extract an eigenvalue from a self-adjoint operator.
| Route | Key Step | Works Over |
|---|---|---|
| Complex / Schur | Triangularize, then show normality forces diagonal | $\mathbb{C}$ |
| Real / Induction | Find one real eigenvalue, peel off, induct on $\dim V$ | $\mathbb{R}$ |
Throughout, $V$ is a finite-dimensional inner product space over $\mathbb{F} \in \{\mathbb{R}, \mathbb{C}\}$ with $\dim V = n$, and $T : V \to V$ is a linear operator.
Foundation: Inner Product Spaces
See Inner product spaces.
Inner Product & ONB
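An inner product is a map $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{F}$ that is linear in one slot (the first, in the convention used here), conjugate-symmetric, and positive definite.
Every finite-dim inner product space admits an orthonormal basis via Gram-Schmidt. Key consequence: in an ONB $\{e_1, \dots, e_n\}$, coordinates are inner products: $v = \sum_{i=1}^{n} \langle v, e_i \rangle\, e_i$, and Parseval gives $\|v\|^2 = \sum_{i=1}^{n} |\langle v, e_i \rangle|^2$.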
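The Gram-Schmidt construction and the two ONB identities can be checked numerically. The following is a minimal NumPy sketch, assuming the inner product is linear in the first slot; the helper names `inner` and `gram_schmidt` are illustrative, not taken from the notes.

```python
import numpy as np

def inner(v, w):
    """<v, w>, linear in the first slot, conjugate-linear in the second."""
    return np.vdot(w, v)  # np.vdot conjugates its first argument

def gram_schmidt(vectors):
    """Orthonormalize linearly independent vectors (modified Gram-Schmidt)."""
    basis = []
    for v in vectors:
        w = v.astype(complex)
        for e in basis:
            w = w - inner(w, e) * e      # remove the component along e
        basis.append(w / np.linalg.norm(w))
    return basis

rng = np.random.default_rng(0)
n = 4
raw = [rng.normal(size=n) + 1j * rng.normal(size=n) for _ in range(n)]
onb = gram_schmidt(raw)

v = rng.normal(size=n) + 1j * rng.normal(size=n)
coords = np.array([inner(v, e) for e in onb])            # coordinates <v, e_i>
reconstructed = sum(c * e for c, e in zip(coords, onb))  # sum_i <v, e_i> e_i
parseval_gap = abs(np.linalg.norm(v) ** 2 - np.sum(np.abs(coords) ** 2))
```

Here `reconstructed` should agree with `v` and `parseval_gap` should be at rounding level.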
Riesz Representation (Finite-Dim)
Also see Counterfactual Invariance.
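For every linear functional $\varphi : V \to \mathbb{F}$, there is a unique $u \in V$ such that $\varphi(v) = \langle v, u \rangle$ for all $v \in V$.
This is the engine that gives us the adjoint. The proof is trivial in finite dimensions (just take $u = \sum_i \overline{\varphi(e_i)}\, e_i$ for an ONB $\{e_i\}$), but morally the same theorem in Hilbert spaces requires completeness.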
The Adjoint Operator
Definition of the Adjoint
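The adjoint of $T$ is the unique operator $T^* : V \to V$ satisfying $$\langle Tv, w \rangle = \langle v, T^* w \rangle \quad \text{for all } v, w \in V.$$
Existence is by Riesz applied to $v \mapsto \langle Tv, w \rangle$ for each fixed $w$. In matrix terms (w.r.t. an ONB): $[T^*] = \overline{[T]}^{\mathsf{T}}$ (conjugate transpose).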
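The matrix description of the adjoint can be sanity-checked directly. This is a small NumPy sketch, assuming the standard inner product on $\mathbb{C}^n$ (linear in the first slot) and the standard basis as the ONB; the helper `inner` is an illustrative name.

```python
import numpy as np

def inner(v, w):
    """<v, w>, linear in the first slot (np.vdot conjugates its first argument)."""
    return np.vdot(w, v)

rng = np.random.default_rng(1)
n = 3
T = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
T_star = T.conj().T                      # conjugate transpose

v = rng.normal(size=n) + 1j * rng.normal(size=n)
w = rng.normal(size=n) + 1j * rng.normal(size=n)

lhs = inner(T @ v, w)                    # <Tv, w>
rhs = inner(v, T_star @ w)               # <v, T* w>

lam = 2.0 - 3.0j
scaled_adjoint = (lam * T).conj().T      # (lam T)*, expected to equal conj(lam) T*
```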
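Algebraic Properties of $T^*$
- $(\lambda T)^* = \bar{\lambda}\, T^*$ (conjugate-linear!)
[!note] The "Four Fundamental Subspaces" $\ker T^* = (\operatorname{ran} T)^\perp$ and $\operatorname{ran} T^* = (\ker T)^\perp$. This is the algebraic skeleton of the Fredholm alternative, and the reason SVD partitions the domain and codomain into four pieces.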
Classes of Operators
The Operator Zoo
| Class | Definition | Spectrum Lives On | Matrix Form (some ONB) |
|---|---|---|---|
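| Self-adjoint (Hermitian) | $T^* = T$ | $\mathbb{R}$ | Real diagonal |
| Skew-adjoint | $T^* = -T$ | $i\mathbb{R}$ | Purely imaginary diag. |
| Unitary ($\mathbb{C}$) / Orthogonal ($\mathbb{R}$) | $T^* T = T T^* = I$ | Unit circle | Diagonal entries of modulus 1 |
| Normal | $T^* T = T T^*$ | $\mathbb{C}$ | Diagonal (in $\mathbb{C}$) |
| Positive (semi)definite | $T^* = T$ and $\langle Tv, v \rangle \ge 0$ | $[0, \infty)$ | Non-negative diagonal |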
[!note] Normality is the maximal class for which the spectral theorem holds. Self-adjoint, skew-adjoint, and unitary are all special cases of normal. The "if and only if" version of the complex spectral theorem characterizes exactly the normal operators.
Why Self-Adjoint Eigenvalues Are Real
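Suppose $Tv = \lambda v$ with $T^* = T$ and $v \neq 0$. Then $$\lambda \langle v, v \rangle = \langle Tv, v \rangle = \langle v, Tv \rangle = \bar{\lambda} \langle v, v \rangle.$$ Hence $\lambda = \bar{\lambda}$, i.e. $\lambda \in \mathbb{R}$. Same argument: normal $T$ satisfies $\|Tv\| = \|T^* v\|$ (computed via $\langle T^* T v, v \rangle = \langle T T^* v, v \rangle$).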
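Both claims are easy to observe numerically. The sketch below, on randomly generated matrices (an assumption made purely for illustration), uses the general eigenvalue solver so that no symmetry is built into the computation.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
B = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

H = B + B.conj().T                        # Hermitian: H* = H
eigs_H = np.linalg.eigvals(H)             # general solver, no symmetry assumed
max_imag = np.max(np.abs(eigs_H.imag))    # should vanish up to rounding

# A normal but non-Hermitian matrix: unitary conjugate of a complex diagonal.
Q, _ = np.linalg.qr(B)                    # Q is unitary
N = Q @ np.diag(rng.normal(size=n) + 1j * rng.normal(size=n)) @ Q.conj().T
v = rng.normal(size=n) + 1j * rng.normal(size=n)
norm_Nv = np.linalg.norm(N @ v)           # ||N v||
norm_Nstar_v = np.linalg.norm(N.conj().T @ v)   # ||N* v||

# For a generic (non-normal) matrix such as B the two norms usually differ.
norm_Bv = np.linalg.norm(B @ v)
norm_Bstar_v = np.linalg.norm(B.conj().T @ v)
```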
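Eigenvectors of Normal Operators Belong to Conjugate Eigenvalues of $T^*$
Lemma. If $T$ is normal and $Tv = \lambda v$, then $T^* v = \bar{\lambda} v$.
Proof. $T - \lambda I$ is normal, so $\|(T - \lambda I) v\| = \|(T - \lambda I)^* v\| = \|(T^* - \bar{\lambda} I) v\|$. The LHS is zero.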
This is the lynchpin for the normal spectral theorem — it lets you upgrade triangular form to diagonal form. Without it Schur's theorem is the end of the road.
Orthogonality of Eigenspaces
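Lemma. If $T$ is normal, eigenvectors with distinct eigenvalues are orthogonal.
Proof. Let $Tv = \lambda v$, $Tw = \mu w$ with $\lambda \neq \mu$. Then $$\lambda \langle v, w \rangle = \langle Tv, w \rangle = \langle v, T^* w \rangle = \langle v, \bar{\mu} w \rangle = \mu \langle v, w \rangle.$$ Since $\lambda \neq \mu$, $\langle v, w \rangle = 0$.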
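The last two lemmas can be illustrated on a concrete normal matrix that is not symmetric. The sketch below picks the cyclic shift on $\mathbb{C}^3$ (an example chosen here, not one from the notes) and contrasts it with a non-normal triangular matrix.

```python
import numpy as np

# Cyclic shift on C^3: a real normal matrix that is not symmetric.
N = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])
is_normal = np.allclose(N @ N.T, N.T @ N)

lams, V = np.linalg.eig(N)                # columns of V are unit eigenvectors

# Lemma: N v = lam v  implies  N* v = conj(lam) v.
adjoint_residual = max(
    np.linalg.norm(N.conj().T @ V[:, k] - np.conj(lams[k]) * V[:, k])
    for k in range(3)
)

# Distinct eigenvalues (the cube roots of unity) give orthogonal eigenvectors.
gram = V.conj().T @ V

# Contrast: a non-normal matrix with distinct eigenvalues 1 and 2.
M = np.array([[1.0, 1.0],
              [0.0, 2.0]])
_, W = np.linalg.eig(M)
overlap = abs(np.vdot(W[:, 0], W[:, 1]))  # nonzero: eigenvectors are not orthogonal
```

For `N` the Gram matrix of the computed eigenvectors is the identity, while the eigenvectors of `M` have a clearly nonzero overlap.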
Existence of Eigenvalues
Complex Case: Eigenvalues Always Exist
Over $\mathbb{C}$, the characteristic polynomial $\det(T - \lambda I)$ has a root by the Fundamental Theorem of Algebra. So every operator on a complex finite-dim space has at least one eigenvalue. This is the only place the characteristic polynomial is needed, and Axler-style treatments sidestep even that.
Real Case: Self-Adjoint Forces an Eigenvalue
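Over $\mathbb{R}$, a generic operator may have no eigenvalues (e.g., 90° rotation in $\mathbb{R}^2$). But self-adjointness restores existence. Two standard proofs:
Proof 1 (Complexification). Embed $V \hookrightarrow V_{\mathbb{C}} = V \oplus iV$, extend $T$ $\mathbb{C}$-linearly. The extension is still self-adjoint, has a complex eigenvalue, but by the previous argument the eigenvalue is real, and one extracts a real eigenvector (the real or imaginary part of a complex eigenvector, whichever is nonzero).
Proof 2 (Axler's quadratic-factor argument). For any $v \neq 0$, the vectors $v, Tv, \dots, T^n v$ are linearly dependent, giving $p(T)v = 0$ for some nonzero real polynomial $p$. Factor $p$ over $\mathbb{R}$ into linear and irreducible quadratic factors. Show that every irreducible quadratic factor $T^2 + bT + cI$ (with $b^2 < 4c$) is invertible when $T = T^*$: $\langle (T^2 + bT + cI)w, w \rangle \ge (c - b^2/4)\|w\|^2$ is strictly positive when $w \neq 0$. Hence $p$ must have a linear factor $T - \lambda I$ that fails to be injective, giving a real eigenvalue $\lambda$.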
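Two numerical illustrations of this subsection follow: the rotation has no real eigenvalue, and for a symmetric $T$ a quadratic $T^2 + bT + cI$ with $b^2 < 4c$ is positive definite (hence invertible), with smallest eigenvalue at least $c - b^2/4$. This is a sketch on a random symmetric matrix; the particular values of `b` and `c` are arbitrary choices.

```python
import numpy as np

# A 90-degree rotation of R^2 has eigenvalues +i and -i: no real eigenvalue.
R = np.array([[0.0, -1.0],
              [1.0,  0.0]])
rotation_eigs = np.linalg.eigvals(R)

# For symmetric T and b^2 < 4c, T^2 + bT + cI is positive definite,
# with smallest eigenvalue at least c - b^2/4 > 0.
rng = np.random.default_rng(3)
n = 6
B = rng.normal(size=(n, n))
T = (B + B.T) / 2
b, c = 1.5, 1.0                            # b^2 = 2.25 < 4 = 4c
quad = T @ T + b * T + c * np.eye(n)
smallest = np.linalg.eigvalsh(quad).min()
lower_bound = c - b ** 2 / 4               # = 0.4375
```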
[!tip] Variational route to existence An eigenvalue of largest absolute value satisfies $|\lambda| = \max_{\|v\| = 1} |\langle Tv, v \rangle|$ when $T$ is self-adjoint, and any maximizer is a corresponding eigenvector. Compactness of the unit sphere + continuity of the Rayleigh quotient does the work. This generalizes to compact self-adjoint operators on Hilbert spaces; see Courant-Fischer Min-Max.
The Invariant-Subspace Reduction
Invariant Subspaces
A subspace $U \subseteq V$ is $T$-invariant if $T(U) \subseteq U$. The restriction $T|_U : U \to U$ is then itself an operator on $U$.
The Key Reduction Lemma
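Lemma. If $T$ is normal (or self-adjoint) and $U$ is $T$-invariant, then $U^\perp$ is also $T$-invariant. Moreover $T|_{U^\perp}$ is normal (self-adjoint).
Sketch (self-adjoint case). Take $u \in U$, $w \in U^\perp$. Then $\langle Tw, u \rangle = \langle w, Tu \rangle = 0$ since $Tu \in U$ and $w \in U^\perp$. Hence $Tw \in U^\perp$.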
This lemma is the inductive engine of the real spectral theorem: peel off one eigenvector at a time, restrict to its orthogonal complement, induct on dimension.
Schur's Triangularization Theorem
We have seen something similar in past courses, but I don't remember the name.
Schur Decomposition
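Theorem (Schur). For every $T$ on a complex finite-dim inner product space, there exists an ONB such that $[T]$ is upper triangular.
Equivalently: every $A \in \mathbb{C}^{n \times n}$ factors as $A = Q R Q^*$ where $Q$ is unitary and $R$ is upper triangular. The diagonal of $R$ is the spectrum (with algebraic multiplicity).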
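Proof (induction on $n = \dim V$).
- By FTA, $T$ has an eigenvalue $\lambda_1$ with eigenvector $e_1$ (normalize).
- Let $W = \operatorname{span}(e_1)^\perp$. Then $W$ has dim $n - 1$.
- $W$ is not in general $T$-invariant, but: define the orthogonal projection $P_W$ onto $W$ and consider $P_W T|_W : W \to W$. By induction, $P_W T|_W$ is upper triangular in some ONB $\{e_2, \dots, e_n\}$ of $W$.
- Stack $\{e_1, e_2, \dots, e_n\}$; the matrix of $T$ is upper triangular, because $T e_j = \langle T e_j, e_1 \rangle e_1 + P_W T e_j$ for $j \ge 2$.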
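The induction above translates almost line by line into a recursive routine. The following is a minimal sketch, not a production algorithm: `schur_onb` is a hypothetical helper, `np.linalg.eig` stands in for the Fundamental Theorem of Algebra, and the orthogonal complement is obtained from a QR completion of $e_1$ (one of several possible choices).

```python
import numpy as np

def schur_onb(A):
    """Return unitary Q with Q* A Q upper triangular, following the induction."""
    n = A.shape[0]
    if n == 1:
        return np.eye(1, dtype=complex)
    # Step 1: one eigenvector (np.linalg.eig stands in for the FTA), normalized.
    _, vecs = np.linalg.eig(A)
    e1 = vecs[:, 0] / np.linalg.norm(vecs[:, 0])
    # Step 2: ONB of W = span(e1)^perp, read off a QR completion of e1.
    full, _ = np.linalg.qr(np.column_stack([e1, np.eye(n)]))
    B = full[:, 1:]                        # columns form an ONB of W
    # Step 3: matrix of the compression P_W A|_W in that basis, then recurse.
    compressed = B.conj().T @ A @ B
    Q_w = schur_onb(compressed)
    # Step 4: stack e1 with the triangularizing basis of W.
    return np.column_stack([e1, B @ Q_w])

rng = np.random.default_rng(4)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
Q = schur_onb(A)
R = Q.conj().T @ A @ Q                     # upper triangular up to rounding
```

In practice one would call a library Schur routine instead; the recursion is only meant to mirror the proof.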
[!note] Schur ≠ Spectral Schur works for every operator, but only gives triangular form. The leap to diagonal form needs normality.
The Spectral Theorems
Complex Spectral Theorem (Normal Operators)
Theorem. $T$ on a complex finite-dim inner product space $V$ is normal iff there exists an ONB of $V$ consisting of eigenvectors of $T$.
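Proof.
- ($\Leftarrow$) Trivial: in an eigen-ONB, $[T]$ is diagonal, and diagonal matrices commute with their conjugate transpose.
- ($\Rightarrow$) By Schur, write $M = [T]_{\mathcal{B}}$ with $M$ upper triangular in an ONB $\mathcal{B} = \{e_1, \dots, e_n\}$. Normality is preserved by unitary conjugation, so $M$ itself is normal. Claim: a normal upper-triangular matrix is diagonal. Compute $\|M e_1\|^2 = |m_{11}|^2$ but $\|M^* e_1\|^2 = \sum_{j=1}^{n} |m_{1j}|^2$. Equating (normality gives $\|M e_1\| = \|M^* e_1\|$) forces $m_{1j} = 0$ for $j > 1$. Induct on the remaining rows.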
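The norm comparison in the last step can be seen concretely: for an upper-triangular $M$, the $(1,1)$ entry of $MM^* - M^*M$ equals $\sum_{j>1} |m_{1j}|^2$, so normality kills the first row off the diagonal. A small NumPy check on a random triangular matrix (an illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4
M = np.triu(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))

col_norm_sq = np.linalg.norm(M[:, 0]) ** 2            # ||M e_1||^2 = |m_11|^2
row_norm_sq = np.linalg.norm(M.conj().T[:, 0]) ** 2   # ||M* e_1||^2 = sum_j |m_1j|^2

commutator = M @ M.conj().T - M.conj().T @ M
off_diag_mass = np.sum(np.abs(M[0, 1:]) ** 2)         # sum over j > 1 of |m_1j|^2

# Dropping the strictly upper part leaves a diagonal, hence normal, matrix.
D = np.diag(np.diag(M))
diag_commutator = D @ D.conj().T - D.conj().T @ D
```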
Real Spectral Theorem (Self-Adjoint Operators)
Theorem. $T$ on a real finite-dim inner product space $V$ is self-adjoint iff there exists an ONB of $V$ consisting of eigenvectors of $T$ (all eigenvalues real).
Proof (the "induction route").
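- By the existence theorem, $T$ has a real eigenvalue $\lambda_1$ with unit eigenvector $e_1$.
- $\operatorname{span}(e_1)$ is $T$-invariant. By the reduction lemma, $\operatorname{span}(e_1)^\perp$ is $T$-invariant and $T|_{\operatorname{span}(e_1)^\perp}$ is self-adjoint.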
- Induct on dimension.
The converse is again trivial: a real diagonal matrix is symmetric.
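The induction route can be written as a recursive deflation. This is a sketch under stated assumptions: `eigen_onb` is a hypothetical helper, and `np.linalg.eigh` is used only as a stand-in for the existence theorem (it supplies one eigenpair per level); the orthogonal complement comes from a QR completion of $e_1$.

```python
import numpy as np

def eigen_onb(S):
    """Eigenvalues and an orthonormal eigenbasis of a real symmetric S, by deflation."""
    n = S.shape[0]
    if n == 1:
        return np.array([S[0, 0]]), np.eye(1)
    # Step 1: one real eigenpair. np.linalg.eigh is used only as a stand-in
    # for the existence theorem; any method producing one eigenpair would do.
    w, V = np.linalg.eigh(S)
    lam, e1 = w[-1], V[:, -1]
    # Step 2: orthonormal basis B of span(e1)^perp via a QR completion of e1.
    full, _ = np.linalg.qr(np.column_stack([e1, np.eye(n)]))
    B = full[:, 1:]
    # Reduction lemma: S maps span(e1)^perp to itself, and the restriction
    # B^T S B is again symmetric.
    S_perp = B.T @ S @ B
    # Step 3: induct on dimension.
    lams, W = eigen_onb(S_perp)
    return np.concatenate([[lam], lams]), np.column_stack([e1, B @ W])

rng = np.random.default_rng(6)
A = rng.normal(size=(5, 5))
S = (A + A.T) / 2
lams, E = eigen_onb(S)                     # S = E diag(lams) E^T with E orthogonal
```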
Unified Statement (Functional Form)
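Spectral Resolution. If $T$ is normal (resp. self-adjoint, $\mathbb{F} = \mathbb{R}$), then $$T = \sum_{k=1}^{m} \lambda_k P_k$$ where $\lambda_1, \dots, \lambda_m$ are the distinct eigenvalues, and $P_k$ is the orthogonal projection onto $E_{\lambda_k} = \ker(T - \lambda_k I)$. The projections satisfy:
- $P_k^2 = P_k = P_k^*$ (orthogonal projection)
- $P_j P_k = 0$ for $j \neq k$
- $\sum_{k=1}^{m} P_k = I$ (resolution of identity)
This is the form that generalizes: in infinite dimensions the finite sum becomes a projection-valued measure $E$ on the spectrum, and $T = \int_{\sigma(T)} \lambda \, dE(\lambda)$.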
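The spectral projections can be assembled from an eigen-ONB by grouping eigenvectors that share an eigenvalue. A minimal sketch on a symmetric matrix built with a repeated eigenvalue; grouping by rounding to 8 decimals is an ad hoc numerical choice made here.

```python
import numpy as np

rng = np.random.default_rng(7)
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))          # random orthogonal matrix
T = Q @ np.diag([2.0, 2.0, 5.0, -1.0]) @ Q.T           # eigenvalue 2 is repeated

w, V = np.linalg.eigh(T)
distinct = np.unique(np.round(w, 8))                   # group numerically equal eigenvalues

projections = []
for lam in distinct:
    cols = V[:, np.isclose(w, lam)]                    # ONB of the eigenspace E_lam
    projections.append(cols @ cols.T)                  # orthogonal projection onto E_lam

resolution = sum(lam * P for lam, P in zip(distinct, projections))  # sum_k lam_k P_k
identity = sum(projections)                                         # sum_k P_k
```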
Consequences and Extensions
Functional Calculus
Once $T = \sum_k \lambda_k P_k$, for any function $f : \sigma(T) \to \mathbb{C}$ define $$f(T) = \sum_k f(\lambda_k) P_k.$$ This is consistent with the polynomial calculus and gives meaning to $\sqrt{T}$ (for $T \ge 0$), $e^{T}$, $|T|$, etc.
[!tip] Used everywhere in ML & physics Matrix exponential $e^{tA}$ for solving linear ODEs; $\Sigma^{-1/2}$ for whitening in PCA; regularizers in Bayesian deep learning. All are silently invoking the spectral theorem.
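For Hermitian matrices the functional calculus is a few lines of NumPy: diagonalize, apply $f$ to the eigenvalues, and conjugate back. The helper name `apply_function` is an illustrative choice, and the sketch assumes $f$ is defined on all eigenvalues of the input.

```python
import numpy as np

def apply_function(T, f):
    """f(T) = sum_k f(lam_k) P_k for a Hermitian matrix T, via an eigen-ONB."""
    w, V = np.linalg.eigh(T)
    return (V * f(w)) @ V.conj().T        # V diag(f(w)) V*

rng = np.random.default_rng(8)
B = rng.normal(size=(4, 4))
S = B @ B.T / 4 + np.eye(4)               # symmetric positive definite

sqrt_S = apply_function(S, np.sqrt)
exp_S = apply_function(S, np.exp)
exp_neg_S = apply_function(-S, np.exp)
whitener = apply_function(S, lambda lam: lam ** -0.5)   # S^{-1/2}
poly_S = apply_function(S, lambda lam: lam ** 2 + 1)    # should equal S^2 + I
```

The check `poly_S == S @ S + I` is the consistency with the polynomial calculus mentioned above.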
Polar Decomposition
Every $T$ factors as $T = U|T|$ where $|T| = \sqrt{T^* T}$ is positive (defined via functional calculus on the self-adjoint $T^* T$) and $U$ is a partial isometry (unitary if $T$ is invertible). This is the operator analog of $z = e^{i\theta}|z|$ for $z \in \mathbb{C}$.
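For an invertible matrix the polar factors can be computed directly from the eigendecomposition of $T^* T$. A minimal sketch, assuming a random (hence almost surely invertible) complex matrix; the singular case, where $U$ is only a partial isometry, is not handled here.

```python
import numpy as np

rng = np.random.default_rng(9)
n = 4
T = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))   # invertible with probability 1

# |T| = sqrt(T* T), computed by functional calculus on the Hermitian matrix T* T.
w, V = np.linalg.eigh(T.conj().T @ T)
abs_T = (V * np.sqrt(np.clip(w, 0.0, None))) @ V.conj().T
# For invertible T the isometry is U = T |T|^{-1}.
U = T @ np.linalg.inv(abs_T)
```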
Singular Value Decomposition
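For $A \in \mathbb{F}^{m \times n}$ (not even square!), apply the spectral theorem to $A^* A$: $$A = U \Sigma V^*$$ where $U$ and $V$ are unitary and $\Sigma$ has the singular values $\sigma_i = \sqrt{\lambda_i(A^* A)}$ on the diagonal. SVD is the spectral theorem in drag: it is what you get when you have no operator to diagonalize but you can diagonalize $A^* A$.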
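The construction can be carried out explicitly: diagonalize $A^* A$, take square roots of the eigenvalues, and recover the left singular vectors as $u_i = A v_i / \sigma_i$. This is a sketch for a real matrix of full column rank (so every $\sigma_i > 0$); it yields the thin SVD and is less numerically robust than a library SVD.

```python
import numpy as np

rng = np.random.default_rng(10)
m, n = 5, 3
A = rng.normal(size=(m, n))               # full column rank with probability 1

# Spectral theorem on the n x n symmetric PSD matrix A^T A.
w, V = np.linalg.eigh(A.T @ A)
order = np.argsort(w)[::-1]               # sort eigenvalues in decreasing order
w, V = w[order], V[:, order]
sigma = np.sqrt(np.clip(w, 0.0, None))    # singular values
U = (A @ V) / sigma                       # u_i = A v_i / sigma_i (needs sigma_i > 0)

reconstructed = (U * sigma) @ V.T         # thin SVD: U diag(sigma) V^T
```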
Courant–Fischer Min-Max Theorem
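For self-adjoint $T$ with eigenvalues $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n$: $$\lambda_k = \max_{\dim S = k} \; \min_{v \in S,\ \|v\| = 1} \langle Tv, v \rangle = \min_{\dim S = n - k + 1} \; \max_{v \in S,\ \|v\| = 1} \langle Tv, v \rangle.$$ Equivalently, eigenvalues are critical values of the Rayleigh quotient on the unit sphere. This is the variational characterization that survives into infinite dimensions and underlies PCA, graph Laplacian methods, and quantum-mechanical ground-state computations.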
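The max-min formula can be probed numerically. On a $k$-dimensional subspace with orthonormal basis matrix $B$, the minimum of the Rayleigh quotient is the smallest eigenvalue of $B^{\mathsf T} T B$; the sketch below (with the hypothetical helper `min_rayleigh` and random test subspaces) checks that the span of the top $k$ eigenvectors attains $\lambda_k$ and that random subspaces never exceed it.

```python
import numpy as np

rng = np.random.default_rng(11)
n, k = 6, 3
B = rng.normal(size=(n, n))
T = (B + B.T) / 2
w, V = np.linalg.eigh(T)
lam = w[::-1]                              # lam[0] >= lam[1] >= ... (decreasing)
V = V[:, ::-1]

def min_rayleigh(T, basis):
    """min of <Tv, v> over unit v in the column span of an orthonormal `basis`."""
    return np.linalg.eigvalsh(basis.T @ T @ basis).min()

# The span of the top-k eigenvectors attains the max-min value lam_k.
attained = min_rayleigh(T, V[:, :k])

# Any other k-dimensional subspace does no better.
trials = []
for _ in range(200):
    basis, _ = np.linalg.qr(rng.normal(size=(n, k)))   # random k-dim subspace
    trials.append(min_rayleigh(T, basis))
best_random = max(trials)
```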
Simultaneous Diagonalization
Theorem. A family of normal operators is simultaneously diagonalizable (common eigen-ONB) iff the operators pairwise commute.
This is the linear-algebraic shadow of why commuting observables in quantum mechanics share eigenbases. Non-commutativity ⟹ uncertainty.
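One practical way to find a common eigenbasis of two commuting symmetric matrices is to diagonalize a generic linear combination. This is a sketch of that trick (a swapped-in technique, not the proof of the theorem): it relies on the combination having distinct eigenvalues, which holds for the specific diagonals and the value of `t` chosen below.

```python
import numpy as np

rng = np.random.default_rng(12)
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))          # hidden common eigen-ONB
A = Q @ np.diag([1.0, 1.0, 2.0, 3.0]) @ Q.T
B = Q @ np.diag([4.0, 5.0, 5.0, 6.0]) @ Q.T
commute = np.allclose(A @ B, B @ A)

# A + t B has distinct eigenvalues here (2.48, 2.85, 3.85, 5.22), so its
# eigenvectors are forced to be common eigenvectors of A and B.
t = 0.37
_, E = np.linalg.eigh(A + t * B)

def off_diagonal_norm(M):
    """Frobenius norm of the off-diagonal part of M."""
    return np.linalg.norm(M - np.diag(np.diag(M)))

A_in_E = E.T @ A @ E                       # diagonal up to rounding
B_in_E = E.T @ B @ E                       # diagonal up to rounding
```

Diagonalizing `A` alone would not suffice in general, because its repeated eigenvalue leaves the basis of that eigenspace undetermined.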
Infinite-Dimensional Generalizations (Sketch)
Compact Self-Adjoint Operators
Hilbert-Schmidt / Spectral Theorem for Compact Self-Adjoint. A compact self-adjoint operator on a separable Hilbert space has a countable ONB of eigenvectors with real eigenvalues $\lambda_k \to 0$.
The finite-dim proof essentially transports: replace "max over unit sphere" with "max over unit ball" (compactness saves you), peel off the top eigenvector, induct on the closed invariant subspace. The accumulation at $0$ is forced by compactness.
Bounded Self-Adjoint Operators
Spectral Theorem (Bounded SA). For bounded self-adjoint $T$ on Hilbert space $H$, there exists a unique projection-valued measure $E$ on $\sigma(T) \subseteq \mathbb{R}$ such that $T = \int_{\sigma(T)} \lambda \, dE(\lambda)$.
Eigenvalues may not exist anymore (e.g., multiplication by $x$ on $L^2[0, 1]$ has empty point spectrum). The continuous spectrum is absorbed into the integral via the PVM.
Unbounded Self-Adjoint Operators
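Now domains matter. The right notion is self-adjointness including domains ($T = T^*$ with $D(T) = D(T^*)$), or in practice essential self-adjointness (the closure equals the adjoint), rather than mere symmetry. The spectral theorem still holds; this is what justifies quantum-mechanical observables like position and momentum even though they're unbounded.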
Connections & Synthesis
Why Normal Is The Right Class
The pattern is: diagonalizability in an ONB = commutation of the operator with its adjoint. The structural reason is that an eigen-ONB simultaneously diagonalizes $T$ and $T^*$, and any two matrices diagonal in the same basis commute. The converse, that commutation forces simultaneous diagonalization, is exactly the spectral theorem.
| Question | Required Property of $T$ |
|---|---|
| Has an eigenvalue? | Char poly has a root in $\mathbb{F}$ (always over $\mathbb{C}$) |
| Triangularizable (unitarily)? | Always over $\mathbb{C}$ (Schur) |
| Diagonalizable (any basis)? | Minimal poly splits into distinct linear factors |
| Diagonalizable in ONB? | Normal (over $\mathbb{C}$) / self-adjoint (over $\mathbb{R}$) |
Connection to AI / ML Practice
The spectral theorem silently underpins:
- PCA: eigendecomposition of $\Sigma = \tfrac{1}{n} X^\top X$ (the covariance of centered data $X$), which is positive semi-definite.
- Graph Laplacian methods (spectral clustering, Cheeger inequality): $L = D - A$ is self-adjoint, eigenvectors of small eigenvalues encode community structure.
- Markov chain mixing: the second-largest eigenvalue of a reversible transition matrix governs mixing time. (Reversibility = self-adjointness w.r.t. the stationary measure inner product.)
- Hessian analysis in optimization / loss landscapes: eigenvalues of $\nabla^2 \mathcal{L}(\theta)$ describe local curvature; negative eigenvalues = saddle directions.
- Kernel methods: a positive-definite kernel $k(x, y)$ corresponds to a self-adjoint integral operator on $L^2$; Mercer's theorem is the compact self-adjoint spectral theorem in disguise.
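As one concrete instance from the list above, the graph Laplacian case fits in a few lines. The sketch uses a toy graph chosen for illustration (two triangles joined by one edge) and splits it by the sign of the eigenvector for the second-smallest eigenvalue, often called the Fiedler vector.

```python
import numpy as np

# Two triangles {0,1,2} and {3,4,5} joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
L = np.diag(A.sum(axis=1)) - A            # combinatorial Laplacian, symmetric PSD

w, V = np.linalg.eigh(L)                  # eigenvalues in increasing order
fiedler = V[:, 1]                         # eigenvector of the second-smallest eigenvalue
labels = (fiedler > 0).astype(int)        # sign pattern gives a two-way split
```

For this graph the sign pattern separates the two triangles; which side gets label 1 depends on the arbitrary sign of the eigenvector.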
[!note] Relation to multi-agent dynamics In a multi-agent setting where agents update beliefs via a doubly-stochastic communication matrix $W$, the second eigenvalue $\lambda_2(W)$ controls consensus speed. If $W$ is symmetric (undirected, equal-weight communication), the spectral theorem gives an orthogonal eigen-decomposition and clean convergence bounds. This connects to the broader pattern: whenever a multi-agent interaction has an underlying symmetric / reversible / detailed-balance structure, spectral methods give tight results; without symmetry one falls back to weaker tools (Jordan form, Perron-Frobenius, pseudo-spectra).
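The consensus bound can be checked on a small example. The sketch assumes a hypothetical 5-agent ring with symmetric doubly-stochastic weights; with an orthogonal eigen-decomposition, the disagreement after $t$ rounds is at most $\rho^t$ times the initial disagreement, where $\rho$ is the largest eigenvalue modulus other than the eigenvalue 1.

```python
import numpy as np

# Symmetric doubly stochastic weights on a 5-cycle: each agent keeps half of
# its own value and takes a quarter from each neighbour (hypothetical example).
n = 5
W = 0.5 * np.eye(n)
for i in range(n):
    W[i, (i + 1) % n] += 0.25
    W[i, (i - 1) % n] += 0.25

w = np.linalg.eigvalsh(W)                  # real and sorted, since W is symmetric
rho = max(abs(w[0]), abs(w[-2]))           # largest modulus apart from the eigenvalue 1

rng = np.random.default_rng(13)
x0 = rng.normal(size=n)
mean = x0.mean()                           # consensus value (preserved by W)
steps = 20
x = x0.copy()
for _ in range(steps):
    x = W @ x                              # one round of averaging
error = np.linalg.norm(x - mean)
bound = rho ** steps * np.linalg.norm(x0 - mean)
```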
What's Missing From This Picture
- Non-normal operators: enormous theory of Jordan Canonical Form, generalized eigenvectors, pseudo-spectra. Most operators in practice (transition matrices for non-reversible chains, transformer attention matrices) are non-normal, and the spectral theorem doesn't apply — one needs different machinery.
- Operator algebras: the spectral theorem is the first chapter of $C^*$-algebra theory; the Gelfand-Naimark theorem generalizes spectral resolution to commutative $C^*$-algebras, identifying them with $C(X)$ for compact Hausdorff $X$.
- Spectral theory of differential operators: Sturm-Liouville theory, Weyl's theorem, the connection to PDEs and quantum mechanics.