Sobolev Spaces
Motivation & Setup
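PDE theory and the calculus of variations require function spaces in which (i) differentiation makes sense for non-smooth functions, (ii) the space is complete under an $L^p$-flavored norm, and (iii) one can embed into $L^q$ or Hölder spaces. The classical $C^k$ spaces fail (ii); pure $L^p$ fails (i). Sobolev spaces are the fix: they replace pointwise differentiation with the weak derivative and complete $C^\infty$ under the natural $L^p$-Sobolev norm.
Throughout: $\Omega \subseteq \mathbb{R}^n$ open, $1 \le p \le \infty$, $k \in \mathbb{N}$. Use multi-index $\alpha = (\alpha_1, \dots, \alpha_n) \in \mathbb{N}_0^n$, $|\alpha| = \sum_i \alpha_i$, $D^\alpha = \partial_1^{\alpha_1} \cdots \partial_n^{\alpha_n}$.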
Why Classical Spaces Fail
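- $C^k(\overline{\Omega})$ with $\|\cdot\|_{C^k}$ is complete but ignores $L^p$ structure; minimizing sequences of integral energies need not converge in $C^k$.
- $C^\infty(\Omega)$ (functions of finite norm) with the $L^p$-Sobolev norm is not complete; for $p < \infty$ its completion is precisely $W^{k,p}(\Omega)$.
- The Dirichlet energy $\int_\Omega |\nabla u|^2 \, dx$ is minimized naturally over $H^1(\Omega)$, not over $C^2(\Omega)$.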
[!note] The Direct Method The Direct Method of the Calculus of Variations needs: lower semicontinuity + coercivity + a reflexive space to extract weakly convergent minimizing subsequences. Sobolev spaces $W^{k,p}$ are the canonical reflexive choice (when $1 < p < \infty$).
Weak Derivatives
Definition: Weak Derivative
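Given $u \in L^1_{\mathrm{loc}}(\Omega)$ and multi-index $\alpha$, a function $v \in L^1_{\mathrm{loc}}(\Omega)$ is the $\alpha$-th weak derivative of $u$ if
$$\int_\Omega u \, D^\alpha \varphi \, dx = (-1)^{|\alpha|} \int_\Omega v \, \varphi \, dx \qquad \text{for all } \varphi \in C_c^\infty(\Omega).$$
Notation: $v = D^\alpha u$. The definition is the distributional derivative restricted to those distributions representable by an $L^1_{\mathrm{loc}}$ function.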
Uniqueness: a.e. uniqueness follows from the fundamental lemma of the calculus of variations (du Bois-Reymond).
[!tip] Recognizing weak derivatives If $u \in C^k(\Omega)$, weak and classical derivatives up to order $k$ coincide. If $u$ has a jump, the weak derivative does not exist as a function: it picks up a Dirac mass and lives only as a distribution. Example: the Heaviside function $H = \mathbf{1}_{[0,\infty)}$ has $H' = \delta_0$.
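The defining identity can be sanity-checked numerically. The sketch below is an illustration under stated assumptions, not part of the theory: it takes $u(x) = |x|$ on $(-1,1)$ with candidate weak derivative $v = \operatorname{sgn}(x)$, picks one off-centre bump as the test function $\varphi$, and compares $\int u \varphi'$ with $-\int v \varphi$ using a trapezoid rule. The helper names (`bump`, `trapezoid`, `weak_derivative_check`) are hypothetical.

```python
import numpy as np

def bump(x):
    """Smooth bump exp(-1/(1 - x^2)) on (-1, 1), zero elsewhere."""
    out = np.zeros_like(x)
    inside = np.abs(x) < 1.0
    out[inside] = np.exp(-1.0 / (1.0 - x[inside] ** 2))
    return out

def bump_prime(x):
    """Classical derivative of bump (chain rule on the exponent)."""
    out = np.zeros_like(x)
    inside = np.abs(x) < 1.0
    xi = x[inside]
    out[inside] = np.exp(-1.0 / (1.0 - xi**2)) * (-2.0 * xi / (1.0 - xi**2) ** 2)
    return out

def trapezoid(y, x):
    """Composite trapezoid rule on the grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def weak_derivative_check(num_points=200001):
    """Return (lhs, rhs) of  int u phi' dx = -int v phi dx  for u = |x|, v = sgn(x)."""
    x = np.linspace(-1.0, 1.0, num_points)
    # Off-centre test function supported in (-0.2, 0.8), so neither side vanishes by symmetry.
    phi = bump(2.0 * (x - 0.3))
    dphi = 2.0 * bump_prime(2.0 * (x - 0.3))
    u = np.abs(x)
    v = np.sign(x)
    lhs = trapezoid(u * dphi, x)
    rhs = -trapezoid(v * phi, x)
    return lhs, rhs

lhs, rhs = weak_derivative_check()
print(f"int u phi' dx = {lhs:.6f}   -int v phi dx = {rhs:.6f}")
```

One test function proves nothing, of course; the check only shows that the two sides agree to quadrature accuracy for this particular $\varphi$.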
Calculus Rules for Weak Derivatives
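The following hold in $W^{1,p}(\Omega)$ provided the RHS makes sense:
- Linearity: $D^\alpha(\lambda u + \mu v) = \lambda D^\alpha u + \mu D^\alpha v$.
- Product rule (Leibniz): if $u \in W^{1,p}(\Omega)$ and $\varphi \in C_c^\infty(\Omega)$, then $\varphi u \in W^{1,p}(\Omega)$ and $\nabla(\varphi u) = \varphi \nabla u + u \nabla \varphi$.
- Chain rule (Stampacchia): if $F \in C^1(\mathbb{R})$, $F' \in L^\infty(\mathbb{R})$, $F(0) = 0$, and $u \in W^{1,p}(\Omega)$, then $F \circ u \in W^{1,p}(\Omega)$ with $\nabla(F \circ u) = F'(u) \nabla u$.
- Truncation: $u^+ = \max(u, 0) \in W^{1,p}(\Omega)$ with $\nabla u^+ = \mathbf{1}_{\{u > 0\}} \nabla u$ (almost everywhere).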
The truncation rule underlies the maximum principle for weak solutions.
The Spaces $W^{k,p}$ and $H^k$
Definition: Sobolev Space
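$$W^{k,p}(\Omega) = \{ u \in L^p(\Omega) : D^\alpha u \in L^p(\Omega) \text{ for all } |\alpha| \le k \}$$
Norm:
$$\|u\|_{W^{k,p}(\Omega)} = \Big( \sum_{|\alpha| \le k} \|D^\alpha u\|_{L^p(\Omega)}^p \Big)^{1/p} \quad (1 \le p < \infty), \qquad \|u\|_{W^{k,\infty}(\Omega)} = \max_{|\alpha| \le k} \|D^\alpha u\|_{L^\infty(\Omega)}.$$
Equivalent norms abound, e.g., $\sum_{|\alpha| \le k} \|D^\alpha u\|_{L^p(\Omega)}$.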
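Definition: $H^k$, the Hilbert Case
For $p = 2$ write $H^k(\Omega) = W^{k,2}(\Omega)$. Inner product:
$$\langle u, v \rangle_{H^k(\Omega)} = \sum_{|\alpha| \le k} \int_\Omega D^\alpha u \, D^\alpha v \, dx.$$
This is the only $p$ giving a Hilbert space; it's the workhorse for linear elliptic PDE and spectral theory.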
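Definition: $W_0^{k,p}$, Zero Boundary Trace
$$W_0^{k,p}(\Omega) = \overline{C_c^\infty(\Omega)}^{\,\|\cdot\|_{W^{k,p}}}$$
Heuristically: functions with "zero boundary values up to order $k-1$". On all of $\mathbb{R}^n$, $W_0^{k,p}(\mathbb{R}^n) = W^{k,p}(\mathbb{R}^n)$ for $p < \infty$. On bounded $\Omega$ with reasonable boundary, the inclusion $W_0^{k,p}(\Omega) \subset W^{k,p}(\Omega)$ is strict (constants live in the latter, not the former).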
Banach / Hilbert / Reflexive / Separable Properties
| Property | $W^{k,p}$, $1 \le p < \infty$ | $W^{k,\infty}$ | $H^k$ |
|---|---|---|---|
| Banach | ✓ | ✓ | ✓ |
| Hilbert | only $p = 2$ | ✗ | ✓ |
| Separable | ✓ | ✗ | ✓ |
| Reflexive | ✓ iff $1 < p < \infty$ | ✗ | ✓ |
| Uniformly convex | $1 < p < \infty$ | ✗ | ✓ |
Completeness proof sketch: a Cauchy sequence $(u_m)$ has $(D^\alpha u_m)$ Cauchy in $L^p$ for each $|\alpha| \le k$; pass to $L^p$-limits $u_\alpha$, then verify the limits satisfy the weak-derivative identity by passing the limit through the integration-by-parts identity (Hölder's inequality against the fixed test function, which lies in $L^{p'}$).
Approximation by Smooth Functions
Mollifiers
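$\eta \in C_c^\infty(\mathbb{R}^n)$, $\eta \ge 0$, $\operatorname{supp} \eta \subset B_1(0)$, $\int \eta = 1$. Set $\eta_\varepsilon(x) = \varepsilon^{-n} \eta(x/\varepsilon)$. Define $u_\varepsilon = \eta_\varepsilon * u$ on $\Omega_\varepsilon = \{ x \in \Omega : \operatorname{dist}(x, \partial\Omega) > \varepsilon \}$. Then $u_\varepsilon \in C^\infty(\Omega_\varepsilon)$ and $D^\alpha u_\varepsilon = \eta_\varepsilon * D^\alpha u$ on $\Omega_\varepsilon$ (the weak derivative commutes with convolution).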
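A discrete analogue of mollification makes the construction concrete. The following is a minimal sketch in one dimension, assuming a uniform grid and a discretely normalized kernel (weights summing to 1) in place of the exact integral normalization; `eta` and `mollify` are illustrative names. It smooths $u(x) = |x|$ and measures the sup-error on the interior set where the convolution window stays inside the grid.

```python
import numpy as np

def eta(x):
    """Unnormalized standard bump exp(-1/(1 - x^2)) supported in (-1, 1)."""
    out = np.zeros_like(x, dtype=float)
    inside = np.abs(x) < 1.0
    out[inside] = np.exp(-1.0 / (1.0 - x[inside] ** 2))
    return out

def mollify(u_vals, x, eps):
    """Discrete version of eta_eps * u on a uniform grid x.

    Values within distance eps of the grid ends are polluted by zero padding,
    mirroring the fact that u_eps is only defined on Omega_eps.
    """
    h = x[1] - x[0]
    m = max(1, int(eps / h))              # half-width of the kernel in grid cells
    offsets = np.arange(-m, m + 1) * h    # symmetric sample points in [-eps, eps]
    kernel = eta(offsets / eps)
    kernel /= kernel.sum()                # discrete normalization: weights sum to 1
    return np.convolve(u_vals, kernel, mode="same")

x = np.linspace(-1.0, 1.0, 2001)
u = np.abs(x)
eps = 0.1
u_eps = mollify(u, x, eps)

interior = np.abs(x) < 1.0 - 2 * eps      # stay well inside Omega_eps
max_err = float(np.max(np.abs(u_eps - u)[interior]))
print(f"sup-error on the interior for eps = {eps}: {max_err:.4f}")
```

Because the kernel is symmetric and has unit mass, linear pieces of $u$ are reproduced exactly; the error is concentrated within distance $\varepsilon$ of the kink at $0$ and is bounded by $\varepsilon$ since $u$ is 1-Lipschitz.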
Local Approximation
If $u \in W^{k,p}(\Omega)$, $1 \le p < \infty$, then $u_\varepsilon \to u$ in $W^{k,p}_{\mathrm{loc}}(\Omega)$. The proof is Friedrichs's classical mollification argument.
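Theorem: Meyers–Serrin "$H = W$"
For every open $\Omega \subseteq \mathbb{R}^n$ and $1 \le p < \infty$, $C^\infty(\Omega) \cap W^{k,p}(\Omega)$ is dense in $W^{k,p}(\Omega)$.
The historical name: $H^{k,p}(\Omega)$ used to denote the completion of $\{ u \in C^\infty(\Omega) : \|u\|_{W^{k,p}} < \infty \}$ under $\|\cdot\|_{W^{k,p}}$. Meyers–Serrin (1964) proved $H^{k,p}(\Omega) = W^{k,p}(\Omega)$ for arbitrary open sets, with no boundary regularity assumed. The proof partitions $\Omega$ into shells and mollifies on each, gluing with a partition of unity.
[!note] What can fail at the boundary Density of $C^\infty(\overline{\Omega})$ (note the closure) requires $\Omega$ to have, e.g., the segment property (a weak boundary regularity). Without it, $C^\infty(\overline{\Omega})$ may fail to be dense in $W^{k,p}(\Omega)$, though Meyers–Serrin density of $C^\infty(\Omega) \cap W^{k,p}(\Omega)$ is unaffected.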
Extension Theorems
Extension Operator
A bounded linear $E : W^{k,p}(\Omega) \to W^{k,p}(\mathbb{R}^n)$ with $Eu|_\Omega = u$. Such an operator lets you transfer Fourier-analytic and global-$\mathbb{R}^n$ results back to $\Omega$.
Existence: $E$ exists when $\partial\Omega$ is sufficiently regular. For the reflection construction, Lipschitz suffices for $k = 1$ and $C^k$ suffices in general. Constructions: reflection across the boundary (half-space), then partition of unity + local charts.
[!note] Calderón–Stein extension For bounded Lipschitz $\Omega$, Stein (1970) constructed a universal extension operator that works simultaneously for all $k \in \mathbb{N}$ and $1 \le p \le \infty$. Without boundary regularity (e.g., domains with outward cusps), extension can fail.
Traces
The trace problem: $u \in W^{1,p}(\Omega)$ is only defined a.e., so the boundary $\partial\Omega$ (a measure-zero set) sees no canonical value. Trace theorems make this work via continuity.
Theorem: Trace Operator
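Let $\Omega$ be bounded with $C^1$ boundary, $1 \le p < \infty$. There exists a bounded linear
$$T : W^{1,p}(\Omega) \to L^p(\partial\Omega), \qquad Tu = u|_{\partial\Omega} \ \text{ for } u \in W^{1,p}(\Omega) \cap C(\overline{\Omega}).$$
Range: for $1 < p < \infty$, $T$ maps $W^{1,p}(\Omega)$ onto $W^{1-1/p,\,p}(\partial\Omega)$, a fractional Sobolev space. The kernel is $W_0^{1,p}(\Omega)$:
$$\ker T = W_0^{1,p}(\Omega).$$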
Trace as Loss of Regularity
The "$1/p$" reflects a general principle: restricting to a codimension-$m$ surface costs $m/p$ derivatives. Heuristic via Fourier (for $p = 2$): restriction to $\{x_n = 0\}$ corresponds to integrating over the normal frequency $\xi_n$, which behaves like a Riemann sum against $(1 + |\xi'|^2 + \xi_n^2)^{-s}$, hence the $1/2$ deficit.
[!question] Why no trace for $p = 1$ on the kernel side? $\ker T$ still equals $W_0^{1,1}(\Omega)$, but $1 - 1/p = 0$, and the trace operator is surjective onto $L^1(\partial\Omega)$: the fractional space "collapses". For $p = \infty$ traces land in $W^{1,\infty}(\partial\Omega)$ directly (it's just Lipschitz restriction).
Sobolev Embedding Theorems
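The heart of the theory. Set the Sobolev conjugate $p^* = \frac{np}{n-p}$ for $1 \le p < n$. Three regimes by $p$ vs. $n$.
Theorem: Gagliardo–Nirenberg–Sobolev ($p < n$)
Let $\Omega$ be bounded with Lipschitz boundary, $1 \le p < n$. Then $W^{1,p}(\Omega) \hookrightarrow L^{p^*}(\Omega)$, i.e.
$$\|u\|_{L^{p^*}(\Omega)} \le C(n, p, \Omega) \, \|u\|_{W^{1,p}(\Omega)}.$$
On $\mathbb{R}^n$ for $u \in C_c^\infty(\mathbb{R}^n)$, the sharp form is
$$\|u\|_{L^{p^*}(\mathbb{R}^n)} \le C(n, p) \, \|\nabla u\|_{L^p(\mathbb{R}^n)}.$$
Sharp constant: Talenti–Aubin (1976). Extremizers exist only for $1 < p < n$ and are translates/dilates of the Aubin–Talenti profile $\big(1 + |x|^{p/(p-1)}\big)^{-(n-p)/p}$ (mod scaling).
General order: $W^{k,p}(\Omega) \hookrightarrow L^q(\Omega)$ for $\frac{1}{q} = \frac{1}{p} - \frac{k}{n}$ when $kp < n$.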
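Theorem: Morrey ($p > n$)
If $n < p \le \infty$, then
$$\|u\|_{C^{0,\gamma}(\mathbb{R}^n)} \le C(n, p) \, \|u\|_{W^{1,p}(\mathbb{R}^n)}, \qquad \gamma = 1 - \frac{n}{p}.$$
In particular $W^{1,p}(\Omega) \hookrightarrow C^{0,\,1-n/p}(\overline{\Omega})$ for $p > n$ on Lipschitz domains. This is the Morrey embedding, with explicit Hölder modulus.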
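The Borderline Case $p = n$
The naive guess fails: $W^{1,n} \not\hookrightarrow L^\infty$ for $n \ge 2$. Counterexample (n=2): $u(x) = \log\log(1 + 1/|x|)$ on the unit ball, in $W^{1,2}$ but unbounded.
- One has $W^{1,n}(\Omega) \hookrightarrow L^q(\Omega)$ for all $1 \le q < \infty$ but not $q = \infty$.
- Trudinger–Moser inequality: there exists $\alpha_n > 0$ such that
$$\sup \Big\{ \int_\Omega \exp\big( \alpha_n |u|^{n/(n-1)} \big) \, dx \;:\; u \in W_0^{1,n}(\Omega), \ \|\nabla u\|_{L^n(\Omega)} \le 1 \Big\} \le C(n) \, |\Omega|.$$
Sharp constant $\alpha_n = n \, \omega_{n-1}^{1/(n-1)}$, with $\omega_{n-1}$ the surface area of the unit sphere $S^{n-1}$ (Moser, 1971). This is the natural limit replacing $L^\infty$.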
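The failure of the $L^\infty$ embedding in dimension $n = 2$ can be verified by hand. The worked computation below assumes the standard counterexample $u(x) = \log\log(1 + 1/|x|)$ on the unit disc $B_1$ and uses the fact that a point singularity of this strength does not affect the weak gradient, so the classical gradient away from the origin may be integrated directly.

```latex
\begin{aligned}
u(x) &= \log\log\!\Bigl(1+\tfrac{1}{|x|}\Bigr), \qquad r = |x| \in (0,1),\\[2pt]
|\nabla u(x)| &= \frac{1}{r\,(1+r)\,\log(1+1/r)},\\[2pt]
\int_{B_1} |\nabla u|^2 \, dx
  &= 2\pi \int_0^1 \frac{dr}{r\,(1+r)^2 \log^2(1+1/r)}
  && \text{(polar coordinates)}\\
  &= 2\pi \int_{\log 2}^{\infty} \frac{1-e^{-t}}{t^2}\, dt
  && \Bigl(t = \log(1+1/r),\ dt = -\tfrac{dr}{r(1+r)},\ \tfrac{1}{1+r} = 1-e^{-t}\Bigr)\\
  &\le \frac{2\pi}{\log 2} < \infty .
\end{aligned}
```

So $\nabla u \in L^2(B_1)$, and $u \in L^2(B_1)$ because it grows only like $\log\log(1/r)$, yet $u(x) \to \infty$ as $x \to 0$: a function in $W^{1,2}(B_1)$ that is not essentially bounded.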
Summary Table of Embeddings
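| Regime | Embedding | Borderline |
|---|---|---|
| $p < n$ | $W^{1,p} \hookrightarrow L^q$, $q \le p^*$ | $q = p^*$ best |
| $p = n$ | $W^{1,n} \hookrightarrow L^q$, all $q < \infty$ | Trudinger–Moser exponential |
| $p > n$ | $W^{1,p} \hookrightarrow C^{0,\gamma}$, $\gamma = 1 - n/p$ | requires strict $p > n$ |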
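The three regimes are easy to encode. The helper below is a small illustrative sketch (the name `sobolev_regime` and its return format are invented for this note); it covers first-order spaces $W^{1,p}$ with finite $p \ge 1$ on a bounded Lipschitz domain and uses exact rational arithmetic so that critical exponents come out as fractions.

```python
from fractions import Fraction

def sobolev_regime(n, p):
    """Classify the target of W^{1,p} on a bounded Lipschitz domain in R^n.

    Returns (description, exponent): the Sobolev conjugate p* when p < n,
    None in the borderline case p = n, and the Hoelder exponent 1 - n/p when p > n.
    """
    n, p = Fraction(n), Fraction(p)
    if n < 1 or p < 1:
        raise ValueError("need n >= 1 and p >= 1")
    if p < n:
        return ("L^q for q <= p*", n * p / (n - p))
    if p == n:
        if n == 1:
            # Special case: in one dimension W^{1,1} already embeds into bounded continuous functions.
            return ("C^0 (bounded, continuous)", None)
        return ("L^q for all finite q", None)
    return ("C^{0,gamma}", 1 - n / p)

for n, p in [(3, 2), (2, 2), (1, 2), (3, Fraction(3, 2))]:
    print(n, p, sobolev_regime(n, p))
```

For instance, $n = 3$, $p = 2$ gives $p^* = 6$, the exponent behind the critical nonlinearity $u^5$ in three dimensions.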
Proof Sketch — GNS via the Identity
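The slickest route. For $u \in C_c^\infty(\mathbb{R}^n)$:
$$|u(x)| \le \int_{-\infty}^{\infty} |\partial_i u(x_1, \dots, y_i, \dots, x_n)| \, dy_i, \qquad i = 1, \dots, n.$$
Apply this in each direction, multiply the $n$ inequalities, take the $(n-1)$-th root, integrate, use generalized Hölder, get
$$\|u\|_{L^{n/(n-1)}(\mathbb{R}^n)} \le \prod_{i=1}^n \Big( \int_{\mathbb{R}^n} |\partial_i u| \, dx \Big)^{1/n} \le \|\nabla u\|_{L^1(\mathbb{R}^n)}.$$
This is the $p = 1$ case. For general $p$, apply the inequality to $|u|^\gamma$ with $\gamma = \frac{p(n-1)}{n-p}$ chosen so the exponent matches $p^*$. Standard, e.g., Evans Ch. 5.6.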
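The choice of the exponent in the last step can be spelled out. The derivation below is a sketch that takes the $p = 1$ inequality $\|v\|_{L^{n/(n-1)}} \le \|\nabla v\|_{L^1}$ for granted, applies it to $v = |u|^\gamma$ with $1 < p < n$, and writes $p' = p/(p-1)$.

```latex
\begin{aligned}
\Bigl(\int |u|^{\frac{\gamma n}{n-1}}\,dx\Bigr)^{\frac{n-1}{n}}
  &\le \gamma \int |u|^{\gamma-1}\,|\nabla u|\,dx
   \le \gamma \,\Bigl(\int |u|^{(\gamma-1)p'}\,dx\Bigr)^{\frac{1}{p'}} \|\nabla u\|_{L^p}
  && \text{(H\"older)}\\[4pt]
\frac{\gamma n}{n-1} = (\gamma-1)\frac{p}{p-1}
  &\iff \gamma = \frac{p(n-1)}{n-p},
  \qquad\text{and then}\qquad \frac{\gamma n}{n-1} = \frac{np}{n-p} = p^*,\\[4pt]
\frac{n-1}{n}-\frac{p-1}{p} &= \frac{n-p}{np} = \frac{1}{p^*}
  \quad\Longrightarrow\quad
  \|u\|_{L^{p^*}} \le \frac{p(n-1)}{n-p}\,\|\nabla u\|_{L^p}.
\end{aligned}
```

The constant $\gamma = p(n-1)/(n-p)$ obtained this way is not sharp, and it blows up as $p \to n$, consistent with the failure of the $L^\infty$ embedding at the borderline.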
Compactness: Rellich–Kondrachov
Theorem: Rellich–Kondrachov
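Let $\Omega$ be bounded with Lipschitz boundary, $1 \le p < n$. Then the embedding
$$W^{1,p}(\Omega) \hookrightarrow\hookrightarrow L^q(\Omega), \qquad 1 \le q < p^*,$$
is compact.
At the critical exponent $q = p^*$ the embedding is continuous but not compact. For $p \ge n$ one has compact embeddings into all $L^q(\Omega)$, $q < \infty$ (resp. into $C^{0,\beta}(\overline{\Omega})$ for $p > n$ and $\beta < 1 - n/p$).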
Consequences for Variational Methods
Compactness is the workhorse for:
- Existence of minimizers (extract weakly convergent subsequence → strongly convergent in $L^q$ for sub-critical $q < p^*$).
- Discrete spectrum of self-adjoint elliptic operators (Laplacian on bounded domain with Dirichlet BC → compact resolvent → countable eigenvalues $0 < \lambda_1 \le \lambda_2 \le \cdots \to \infty$).
- Concentration-compactness (Lions) handles loss at critical exponents.
[!note] Loss of compactness at the critical exponent Failure of $H_0^1(\Omega) \hookrightarrow\hookrightarrow L^{2^*}(\Omega)$ (with $2^* = \frac{2n}{n-2}$, $n \ge 3$) is the source of all critical-exponent phenomena: Yamabe problem, prescribed scalar curvature, Brezis–Nirenberg. Bubbles converge weakly but not strongly.
Poincaré-Type Inequalities
These convert control of $\nabla u$ into control of $u$, which is essential for coercivity of the Dirichlet form.
Poincaré Inequality (Zero-Trace Version)
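For $u \in W_0^{1,p}(\Omega)$, $\Omega$ bounded, $1 \le p < \infty$:
$$\|u\|_{L^p(\Omega)} \le C_P(\Omega, p) \, \|\nabla u\|_{L^p(\Omega)}.$$
Best constant for $p = 2$ is $C_P = \lambda_1^{-1/2}$, where $\lambda_1$ is the first Dirichlet eigenvalue of $-\Delta$ on $\Omega$.
Consequence: on $W_0^{1,p}(\Omega)$, the seminorm $\|\nabla u\|_{L^p(\Omega)}$ is an equivalent norm.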
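The link between the best Poincaré constant and the first Dirichlet eigenvalue can be illustrated on $\Omega = (0,1)$, where $\lambda_1 = \pi^2$ and hence $C_P = 1/\pi$ for $p = 2$. The sketch below assumes the standard second-order finite-difference discretization of $-u''$ with zero boundary values; the function name and grid size are arbitrary choices.

```python
import numpy as np

def first_dirichlet_eigenvalue(num_interior=400):
    """Smallest eigenvalue of the finite-difference Dirichlet Laplacian on (0, 1)."""
    h = 1.0 / (num_interior + 1)
    main = 2.0 * np.ones(num_interior)
    off = -1.0 * np.ones(num_interior - 1)
    # Tridiagonal matrix (1/h^2) * tridiag(-1, 2, -1) approximating -d^2/dx^2.
    laplacian = (np.diag(main) + np.diag(off, 1) + np.diag(off, -1)) / h**2
    return float(np.linalg.eigvalsh(laplacian)[0])  # eigvalsh returns ascending order

lam1 = first_dirichlet_eigenvalue()
poincare_constant = lam1 ** -0.5
print(f"lambda_1 ~ {lam1:.5f} (pi^2 = {np.pi**2:.5f}),  C_P ~ {poincare_constant:.5f}")

# Spot check with u(x) = x(1 - x):  ||u||^2 = 1/30 and ||u'||^2 = 1/3 in L^2(0, 1).
ratio = (1.0 / 30.0) / (1.0 / 3.0)
print(f"||u||^2 / ||u'||^2 = {ratio:.5f}  <=  1/lambda_1 = {1.0 / lam1:.5f}")
```

The quadratic $x(1-x)$ comes close to saturating the inequality because it resembles the first eigenfunction $\sin(\pi x)$, for which equality holds.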
Poincaré–Wirtinger (Zero-Mean Version)
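For $u \in W^{1,p}(\Omega)$, $\Omega$ bounded connected Lipschitz:
$$\|u - \bar{u}_\Omega\|_{L^p(\Omega)} \le C(\Omega, p) \, \|\nabla u\|_{L^p(\Omega)}, \qquad \bar{u}_\Omega = \frac{1}{|\Omega|} \int_\Omega u \, dx.$$
Best constant for $p = 2$: $\mu_1^{-1/2}$, where $\mu_1$ is the first nonzero Neumann eigenvalue (the second eigenvalue of the Neumann Laplacian; the first is 0 with constant eigenfunction).
[!tip] When you need a Poincaré Any time you want coercivity for $\|\nabla u\|_{L^2}$ as a norm. If you have Dirichlet BC, use the zero-trace version. If you're working on $H^1(\Omega)/\mathbb{R}$ (e.g., Neumann problems), use Poincaré–Wirtinger.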
Fractional Sobolev Spaces
Two main definitions; for $p = 2$ they coincide and equal the Bessel potential space $H^s$.
Slobodeckij Seminorm (Gagliardo)
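For $0 < s < 1$, $1 \le p < \infty$:
$$[u]_{W^{s,p}(\Omega)}^p = \int_\Omega \int_\Omega \frac{|u(x) - u(y)|^p}{|x - y|^{n + sp}} \, dx \, dy, \qquad \|u\|_{W^{s,p}(\Omega)}^p = \|u\|_{L^p(\Omega)}^p + [u]_{W^{s,p}(\Omega)}^p.$$
For $s > 1$ non-integer, write $s = k + \sigma$ with $k \in \mathbb{N}$, $\sigma \in (0,1)$, and require $u \in W^{k,p}(\Omega)$ with $D^\alpha u \in W^{\sigma,p}(\Omega)$ for $|\alpha| = k$.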
Bessel Potential Spaces
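Via Fourier transform on $\mathbb{R}^n$:
$$H^{s,p}(\mathbb{R}^n) = \big\{ u \in \mathcal{S}'(\mathbb{R}^n) : \mathcal{F}^{-1}\big[ (1 + |\xi|^2)^{s/2} \, \hat{u} \big] \in L^p(\mathbb{R}^n) \big\},$$
i.e. $H^{s,p} = (I - \Delta)^{-s/2} L^p$.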
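For $p = 2$ the Fourier-side definition is directly computable. The sketch below is a periodic analogue, which is an assumption made for illustration (the definition above lives on $\mathbb{R}^n$): on the circle of length $2\pi$ it evaluates $\|u\|_{H^s}^2 = 2\pi \sum_k (1 + k^2)^s |\hat{u}_k|^2$ with the FFT. The function name `hs_norm_periodic` is hypothetical.

```python
import numpy as np

def hs_norm_periodic(u_vals, s, length=2.0 * np.pi):
    """H^s norm of a periodic function sampled on a uniform grid (p = 2 only)."""
    n = u_vals.size
    # Angular frequencies matching the FFT ordering; integers when length = 2*pi.
    xi = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)
    u_hat = np.fft.fft(u_vals) / n            # Fourier coefficients of u
    weights = (1.0 + xi**2) ** s              # Bessel-potential multiplier squared
    return float(np.sqrt(length * np.sum(weights * np.abs(u_hat) ** 2)))

x = np.linspace(0.0, 2.0 * np.pi, 256, endpoint=False)
u = np.sin(3.0 * x)
for s in (-1.0, 0.0, 1.0):
    print(f"s = {s:+.0f}:  ||u||_H^s = {hs_norm_periodic(u, s):.6f}")
```

For $u = \sin(3x)$ only the modes $k = \pm 3$ are active, so $\|u\|_{H^s}^2 = \pi \cdot 10^{s}$; negative $s$ damps the oscillation and positive $s$ penalizes it, which is the whole content of the scale.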
| Property | $W^{s,p}$ (Slobodeckij) | $H^{s,p}$ (Bessel) |
|---|---|---|
| Defined via | Difference quotients | Fourier multipliers |
| Equals $H^s$ for $p = 2$ | ✓ | ✓ |
| Equals $W^{k,p}$ for $s = k \in \mathbb{N}$ | n/a | only if $1 < p < \infty$ |
| Interpolation behavior | Real interp. (Besov-like) | Complex interp. |
In fact, $W^{s,p}$ (Slobodeckij) coincides with the Besov space $B^s_{p,p}$, not with $H^{s,p} = F^s_{p,2}$, which is Triebel–Lizorkin. The mismatch for $p \ne 2$, $s \notin \mathbb{N}$ is fundamental.
Trace Spaces Revisited
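The trace operator $T : W^{1,p}(\Omega) \to W^{1-1/p,\,p}(\partial\Omega)$ uses precisely the Slobodeckij version because $\partial\Omega$ is a manifold without group structure where Fourier is awkward. The image is a genuinely Besov-type space, distinct from the Bessel scale, unless $p = 2$, where everything coincides nicely.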
Dual Spaces and Negative-Order Sobolev Spaces
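$H^{-1}$: Dual of $H_0^1$
$$H^{-1}(\Omega) = \big( H_0^1(\Omega) \big)^*$$
Characterization: $f \in H^{-1}(\Omega)$ iff $f = f_0 + \sum_{i=1}^n \partial_i f_i$ for some $f_0, \dots, f_n \in L^2(\Omega)$ (in distributional sense). Norm: infimum of $\big( \sum_{i=0}^n \|f_i\|_{L^2}^2 \big)^{1/2}$ over such decompositions.
This is the natural target space for the Laplacian:
$$-\Delta : H_0^1(\Omega) \to H^{-1}(\Omega) \ \text{ is an isomorphism for bounded } \Omega.$$
(Lax–Milgram applied to the bilinear form $a(u, v) = \int_\Omega \nabla u \cdot \nabla v \, dx$.)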
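The weak formulation behind this isomorphism is exactly what a Galerkin method discretizes. The following is a minimal sketch, assuming $\Omega = (0,1)$, piecewise-linear (P1) elements on a uniform mesh, and the right-hand side $f \equiv 1$; it solves $a(u_h, v_h) = \int f v_h$ for all hat functions $v_h$ and compares with the exact solution $u(x) = x(1-x)/2$. The function name `solve_poisson_p1` is illustrative.

```python
import numpy as np

def solve_poisson_p1(num_interior=99):
    """Galerkin P1 solution of -u'' = 1 on (0, 1) with u(0) = u(1) = 0."""
    h = 1.0 / (num_interior + 1)
    nodes = np.linspace(h, 1.0 - h, num_interior)     # interior mesh nodes
    main = 2.0 * np.ones(num_interior)
    off = -1.0 * np.ones(num_interior - 1)
    # Stiffness matrix a(phi_j, phi_i) = int phi_j' phi_i' dx = (1/h) * tridiag(-1, 2, -1).
    stiffness = (np.diag(main) + np.diag(off, 1) + np.diag(off, -1)) / h
    # Load vector int f * phi_i dx = h for f = 1 (each hat function has area h).
    load = h * np.ones(num_interior)
    coeffs = np.linalg.solve(stiffness, load)
    return nodes, coeffs

nodes, u_h = solve_poisson_p1()
u_exact = 0.5 * nodes * (1.0 - nodes)
max_nodal_error = float(np.max(np.abs(u_h - u_exact)))
print(f"max nodal error: {max_nodal_error:.2e}")
```

The stiffness matrix is symmetric positive definite because the form $a$ is coercive on $H_0^1$ (Poincaré), which is the discrete shadow of Lax–Milgram. In one dimension the P1 Galerkin solution happens to be exact at the nodes, so the reported error is at rounding level.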
Negative Fractional Spaces
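For $s > 0$ and $1 < p < \infty$, $W^{-s,p'}(\Omega) = \big( W_0^{s,p}(\Omega) \big)^*$ where $\frac{1}{p} + \frac{1}{p'} = 1$. These appear when applying differential operators of order $m > s$ to $W^{s,p}$ functions, the result lying in negative-order spaces.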
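Bounded Variation and the $p = 1$ Subtleties
$BV(\Omega) = \{ u \in L^1(\Omega) : Du \in \mathcal{M}(\Omega; \mathbb{R}^n) \}$, where $\mathcal{M}$ is the space of finite Radon measures. $W^{1,1}(\Omega) \subsetneq BV(\Omega)$: $BV$ allows the gradient to be a measure, e.g., $\mathbf{1}_E$ for a set $E$ of finite perimeter has $D\mathbf{1}_E = -\nu_E \, \mathcal{H}^{n-1} \llcorner \partial^* E$ (the reduced boundary times the outer normal $\nu_E$, with a minus sign because $\mathbf{1}_E$ decreases on leaving $E$).
$BV$ replaces $W^{1,1}$ in image processing (Rudin–Osher–Fatemi total variation denoising), free-boundary problems, and minimal surface theory. It is not reflexive; bounded sets are only weak-$*$ compact (Banach–Alaoglu, since $BV$ is the dual of a separable space).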
Connections, Comparisons, Applications
Sobolev vs. Hölder vs. Lipschitz Spaces
| Space | Norm strength | When useful |
|---|---|---|
| $W^{k,p}$ | $L^p$ of derivatives | Variational PDE, weak solutions |
| $C^{k,\alpha}$ | Hölder modulus | Regularity theory, Schauder estimates |
| $W^{k,\infty}$ | Bounded $k$th derivative | Lipschitz / semiconcave regularity |
| $BV$ | Measure-valued gradient | Free discontinuities, $L^1$-coercivity |
| $H^{s,p}$ | Bessel/Fourier | Pseudodifferential, harmonic analysis |
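Why $p = 2$ Is Privileged
- Only $p$ giving Hilbert structure → Riesz representation, Lax–Milgram, spectral theory.
- Fourier transform is unitary on $L^2$ → exact characterization of $H^s$ via $(1 + |\xi|^2)^{s/2} \hat{u} \in L^2$.
- Coincidence of all definitions: $W^{s,2} = H^{s,2} = B^s_{2,2} = F^s_{2,2}$.
- Self-dual scale: $\big( H^s(\mathbb{R}^n) \big)^* = H^{-s}(\mathbb{R}^n)$.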
Connection to Statistical Learning / NN Approximation
- Barron spaces and RKHSs sit inside Sobolev spaces — used to bound NN approximation rates.
- Curse of dimensionality manifests through Sobolev embeddings: to approximate functions in $W^{s,p}([0,1]^d)$ to accuracy $\varepsilon$ requires $N \sim \varepsilon^{-d/s}$ parameters, giving the classical $N^{-s/d}$ rate.
- Neural ODEs / flow-based models implicitly minimize Sobolev-type norms on the velocity field (e.g., FFJORD).
- For Stein discrepancies / score matching, Sobolev regularity of the data density controls convergence.
Connection to Optimal Transport & Functional Inequalities
- The log-Sobolev inequality controls hypercontractivity and exponential ergodicity of diffusions.
- Bakry–Émery $\Gamma$-calculus uses Sobolev-type spaces on metric measure spaces.
- Sobolev-type inequalities ↔ isoperimetric inequalities (Federer–Fleming), via $p = 1$ and BV.
- The Otto–Villani theorem links log-Sobolev → Talagrand → Wasserstein concentration.
[!note] Tangent to Angelo's work Sobolev structure underlies most rigorous analysis of continuous-time/space multi-agent systems and continuous strategy spaces. Mechanism-design impossibility results (Myerson–Satterthwaite, Gibbard–Satterthwaite) live in discrete/measurable settings, but their continuous analogues (e.g. matching with continuous types, Hotelling-style games) often require Sobolev-regular type distributions, and welfare functionals' minimizers naturally live in $H^1$-type spaces over the strategy simplex. The Cooperation Gap framework's "approximate mechanism" notion could be re-cast as a closeness condition in a Sobolev-type seminorm over the policy space.
Canonical References
- Evans, Partial Differential Equations, Ch. 5: the standard graduate intro.
- Brezis, Functional Analysis, Sobolev Spaces, and PDEs: clean, modern.
- Adams & Fournier, Sobolev Spaces: encyclopedic.
- Maz'ya, Sobolev Spaces with Applications to Elliptic PDE: deep and technical, capacity theory.
- Di Nezza–Palatucci–Valdinoci, Hitchhiker's guide to the fractional Sobolev spaces (2012): best intro to $W^{s,p}$.
- Triebel, Theory of Function Spaces I–III: for Besov / Triebel–Lizorkin scales.