Sobolev Spaces

Motivation & Setup

PDE theory and the calculus of variations require function spaces in which (i) differentiation makes sense for non-smooth functions, (ii) the space is complete under an $L^p$-flavored norm, and (iii) one can embed into $L^q$ or Hölder spaces. The classical spaces $C^k$ fail (ii); pure $L^p$ fails (i). Sobolev spaces $W^{k,p}$ are the fix: they replace pointwise differentiation with the weak derivative and complete $C^\infty$ under the natural $L^p$-Sobolev norm.

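Throughout: $\Omega \subseteq \mathbb{R}^n$ open, $1 \le p \le \infty$, $k \in \mathbb{N}$. Use multi-index $\alpha = (\alpha_1, \dots, \alpha_n) \in \mathbb{N}_0^n$, $|\alpha| = \sum_i \alpha_i$, $D^\alpha = \partial_1^{\alpha_1} \cdots \partial_n^{\alpha_n}$.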

Why Classical Spaces Fail

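  • $L^p(\Omega)$ with $\|\cdot\|_{L^p}$ is complete but ignores derivative structure; minimizing sequences of integral energies need not converge in $L^p$.
  • $C^\infty(\Omega)$ (restricted to functions of finite norm) with the $L^p$-Sobolev norm is not complete; for $p < \infty$ its completion is precisely $W^{k,p}(\Omega)$.
  • The Dirichlet energy $E[u] = \frac{1}{2}\int_\Omega |\nabla u|^2\,dx$ is minimized naturally over $H^1(\Omega)$, not over $C^1(\bar\Omega)$.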

[!note] The Direct Method The Direct Method of the Calculus of Variations needs: lower semicontinuity + coercivity + a reflexive space to extract weakly convergent minimizing subsequences. Sobolev spaces are the canonical reflexive choice (when $1 < p < \infty$).

Weak Derivatives

Definition: Weak Derivative

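Given $u \in L^1_{\mathrm{loc}}(\Omega)$ and multi-index $\alpha$, a function $v \in L^1_{\mathrm{loc}}(\Omega)$ is the $\alpha$-th weak derivative of $u$ if

$$\int_\Omega u\, D^\alpha \varphi \,dx = (-1)^{|\alpha|} \int_\Omega v\, \varphi \,dx \qquad \text{for all } \varphi \in C_c^\infty(\Omega).$$

Notation: $v = D^\alpha u$. The definition is the distributional derivative restricted to those distributions representable by an $L^1_{\mathrm{loc}}$ function.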

Uniqueness: a.e. uniqueness follows from the fundamental lemma of the calculus of variations (du Bois-Reymond).

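[!tip] Recognizing weak derivatives If $u \in C^{|\alpha|}(\Omega)$, weak and classical derivatives coincide. If $u$ has a jump, the weak derivative does not exist as a function: it picks up a Dirac mass and lives only as a distribution. Example: $u(x) = |x|$ on $\mathbb{R}$ has weak derivative $u' = \operatorname{sgn}$, whereas the Heaviside function $H$ has $H' = \delta_0$, which is not a function.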
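As a sanity check, the defining identity can be tested numerically. The sketch below assumes $\Omega = (-1,1)$, $u(x) = |x|$, the candidate $v = \operatorname{sgn}$, and one hand-picked bump function `phi`; the centre `C`, the radius `R`, and all helper names are illustrative choices, not part of the definition. A single test function cannot prove the identity, but a nonzero defect does rule a candidate out.

```python
import math

C, R = 0.2, 0.7  # assumed centre/radius; the support [C-R, C+R] lies inside (-1, 1)

def phi(x):
    """Smooth bump with compact support in (-1, 1)."""
    y = (x - C) / R
    return math.exp(-1.0 / (1.0 - y * y)) if abs(y) < 1.0 else 0.0

def dphi(x):
    """Classical derivative of phi (chain rule on the exponent)."""
    y = (x - C) / R
    if abs(y) >= 1.0:
        return 0.0
    return phi(x) * (-2.0 * y / (1.0 - y * y) ** 2) / R

def midpoint(f, a, b, n=20000):
    """Composite midpoint rule; with n even, x = 0 sits on a cell edge."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def defect(u, v):
    """int u*phi' dx + int v*phi dx over (-1, 1); vanishes if v = u' weakly."""
    first = midpoint(lambda x: u(x) * dphi(x), -1.0, 1.0)
    second = midpoint(lambda x: v(x) * phi(x), -1.0, 1.0)
    return first + second

def sgn(x):
    return (x > 0) - (x < 0)

print(defect(abs, sgn))            # close to zero: sgn passes the test
print(defect(abs, lambda x: 1.0))  # clearly nonzero: the constant 1 fails
```

The bump is deliberately off-centre so that the odd/even symmetry of $|x|$ and $\operatorname{sgn}$ cannot hide a wrong candidate.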

Calculus Rules for Weak Derivatives

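The following hold in $W^{1,p}(\Omega)$ provided the right-hand side makes sense:

  • Linearity: $D^\alpha(au + bv) = a\,D^\alpha u + b\,D^\alpha v$.
  • Product rule (Leibniz): if $u \in W^{1,p}(\Omega)$ and $\zeta \in C_c^\infty(\Omega)$, then $\zeta u \in W^{1,p}(\Omega)$ and $\nabla(\zeta u) = \zeta\,\nabla u + u\,\nabla\zeta$.
  • Chain rule (Stampacchia): if $F \in C^1(\mathbb{R})$, $F' \in L^\infty(\mathbb{R})$, $F(0) = 0$, and $u \in W^{1,p}(\Omega)$, then $F \circ u \in W^{1,p}(\Omega)$ with $\nabla(F \circ u) = F'(u)\,\nabla u$.
  • Truncation: $u^+ = \max(u, 0) \in W^{1,p}(\Omega)$ with $\nabla u^+ = \mathbf{1}_{\{u > 0\}}\,\nabla u$ (almost everywhere).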

The truncation rule underlies the maximum principle for weak solutions.

The Spaces $W^{k,p}$ and $W_0^{k,p}$

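Definition: Sobolev Space

$$W^{k,p}(\Omega) = \{\, u \in L^p(\Omega) : D^\alpha u \in L^p(\Omega) \text{ for all } |\alpha| \le k \,\}.$$

Norm:

$$\|u\|_{W^{k,p}(\Omega)} = \Big( \sum_{|\alpha| \le k} \|D^\alpha u\|_{L^p(\Omega)}^p \Big)^{1/p} \quad (1 \le p < \infty), \qquad \|u\|_{W^{k,\infty}(\Omega)} = \max_{|\alpha| \le k} \|D^\alpha u\|_{L^\infty(\Omega)}.$$

Equivalent norms abound, e.g., $\sum_{|\alpha| \le k} \|D^\alpha u\|_{L^p(\Omega)}$.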

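Definition: $H^k$, the Hilbert Case

For $p = 2$ write $H^k(\Omega) = W^{k,2}(\Omega)$. Inner product:

$$\langle u, v \rangle_{H^k(\Omega)} = \sum_{|\alpha| \le k} \int_\Omega D^\alpha u \, D^\alpha v \,dx.$$

This is the only $p$ giving a Hilbert space, which makes it the workhorse for linear elliptic PDE and spectral theory.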

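Definition: $W_0^{k,p}$, Zero Boundary Trace

$$W_0^{k,p}(\Omega) = \overline{C_c^\infty(\Omega)}^{\,\|\cdot\|_{W^{k,p}(\Omega)}}.$$

Heuristically: functions with "zero boundary values up to order $k-1$". On all of $\mathbb{R}^n$, $W_0^{k,p}(\mathbb{R}^n) = W^{k,p}(\mathbb{R}^n)$ (for $p < \infty$). On bounded $\Omega$ with reasonable boundary, the inclusion $W_0^{k,p}(\Omega) \subset W^{k,p}(\Omega)$ is strict (nonzero constants live in the latter, not the former).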

Banach / Hilbert / Reflexive / Separable Properties

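| Property | $W^{k,p}(\Omega)$, $1 \le p \le \infty$ |
|---|---|
| Banach | ✓ for all $1 \le p \le \infty$ |
| Hilbert | only $p = 2$ |
| Separable | ✓ iff $1 \le p < \infty$ |
| Reflexive | ✓ iff $1 < p < \infty$ |
| Uniformly convex | ✓ iff $1 < p < \infty$ |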

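Completeness proof sketch: if $(u_m)$ is Cauchy in $W^{k,p}(\Omega)$, then $(D^\alpha u_m)$ is Cauchy in $L^p(\Omega)$ for each $|\alpha| \le k$; pass to $L^p$-limits $u_\alpha$, then verify $u_\alpha = D^\alpha u_0$ by passing to the limit in the integration-by-parts identity. The limit passes because $\varphi$ and $D^\alpha \varphi$ are bounded with compact support, hence in $L^{p'}$, so Hölder's inequality gives convergence of both integrals.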

Approximation by Smooth Functions

Mollifiers

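Fix $\eta \in C_c^\infty(\mathbb{R}^n)$ with $\eta \ge 0$, $\operatorname{supp} \eta \subset B_1(0)$, $\int_{\mathbb{R}^n} \eta \,dx = 1$. Set $\eta_\varepsilon(x) = \varepsilon^{-n} \eta(x/\varepsilon)$. Define $u_\varepsilon = \eta_\varepsilon * u$ on $\Omega_\varepsilon = \{x \in \Omega : \operatorname{dist}(x, \partial\Omega) > \varepsilon\}$. Then $u_\varepsilon \in C^\infty(\Omega_\varepsilon)$ and $D^\alpha u_\varepsilon = \eta_\varepsilon * D^\alpha u$ on $\Omega_\varepsilon$ (the weak derivative commutes with convolution).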

Local Approximation

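If $u \in W^{k,p}(\Omega)$, $1 \le p < \infty$, then $u_\varepsilon \to u$ in $W^{k,p}_{\mathrm{loc}}(\Omega)$ as $\varepsilon \to 0$. The proof is Friedrichs's classical mollification argument.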
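A one-dimensional experiment makes the role of $p < \infty$ visible. The sketch below mollifies $u(x) = |x|$ with the standard bump (normalised numerically; the quadrature sizes `N_INNER` and `n_outer` and all function names are illustrative assumptions). Away from the kink the mollified function reproduces $|x|$. The $L^p$ error of the derivative shrinks linearly in $\varepsilon$. The sup-norm error of the derivative stays near 1, so there is no convergence in $W^{1,\infty}$.

```python
import math

N_INNER = 401  # odd number of z-nodes, chosen so they never coincide with outer nodes

def _eta_raw(z):
    return math.exp(-1.0 / (1.0 - z * z)) if abs(z) < 1.0 else 0.0

_Z_NODES = [-1.0 + (2 * j + 1) / N_INNER for j in range(N_INNER)]
_RAW = [_eta_raw(z) for z in _Z_NODES]
_TOTAL = sum(_RAW)
_WEIGHTS = [w / _TOTAL for w in _RAW]  # discrete mollifier weights, summing to 1

def mollify(f, x, eps):
    """(eta_eps * f)(x) = int eta(z) f(x - eps*z) dz, midpoint rule in z."""
    return sum(w * f(x - eps * z) for w, z in zip(_WEIGHTS, _Z_NODES))

def sgn(x):
    return (x > 0) - (x < 0)

def derivative_error_p(eps, p, n_outer=300):
    """int |(eta_eps * sgn)(x) - sgn(x)|^p dx; the integrand vanishes for |x| >= eps."""
    h = 2.0 * eps / n_outer
    xs = [-eps + (i + 0.5) * h for i in range(n_outer)]
    return h * sum(abs(mollify(sgn, x, eps) - sgn(x)) ** p for x in xs)

for eps in (0.1, 0.05, 0.025):
    lp_err = derivative_error_p(eps, p=2)
    sup_err = abs(mollify(sgn, eps / 1000.0, eps) - 1.0)  # sampled just right of the kink
    print(eps, lp_err, sup_err)
```

Here `mollify(sgn, x, eps)` plays the role of $u_\varepsilon'$, using the fact that the weak derivative commutes with convolution.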

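Theorem: Meyers–Serrin "$H = W$"

For every open $\Omega \subseteq \mathbb{R}^n$ and $1 \le p < \infty$, $C^\infty(\Omega) \cap W^{k,p}(\Omega)$ is dense in $W^{k,p}(\Omega)$.

The historical name: $H^{k,p}(\Omega)$ used to denote the completion of $\{u \in C^\infty(\Omega) : \|u\|_{W^{k,p}} < \infty\}$ under $\|\cdot\|_{W^{k,p}}$. Meyers–Serrin (1964) proved $H^{k,p}(\Omega) = W^{k,p}(\Omega)$ for arbitrary open sets, with no boundary regularity assumed. The proof partitions $\Omega$ into shells and mollifies on each, gluing with a partition of unity.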

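[!note] What can fail at the boundary Density of $C^\infty(\bar\Omega)$ (note the closure) requires $\Omega$ to have, e.g., the segment property (a weak boundary regularity). Without it, restrictions to $\Omega$ of functions smooth up to the boundary may fail to be dense in $W^{k,p}(\Omega)$ (a disc with a slit removed is the standard example), although the Meyers–Serrin density of $C^\infty(\Omega) \cap W^{k,p}(\Omega)$ is unaffected.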

Extension Theorems

Extension Operator

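A bounded linear $E : W^{k,p}(\Omega) \to W^{k,p}(\mathbb{R}^n)$ with $Eu|_\Omega = u$. Such an operator lets you transfer Fourier-analytic and global-$\mathbb{R}^n$ results back to $\Omega$.

Existence: $E$ exists when $\partial\Omega$ is sufficiently regular. For the reflection construction, a Lipschitz boundary suffices for $k = 1$ and a $C^k$ boundary suffices in general. Constructions: reflection across the boundary (half-space), then partition of unity + local charts.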

[!note] Calderón–Stein extension For bounded Lipschitz $\Omega$, Stein (1970) constructed a universal extension operator that works simultaneously for all $k \in \mathbb{N}$ and all $1 \le p \le \infty$. Without boundary regularity, extension can fail.

Traces

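The trace problem: $u \in W^{1,p}(\Omega)$ is only defined a.e., so the boundary $\partial\Omega$ (a measure-zero set) sees no canonical value. Trace theorems resolve this by extending the restriction map $u \mapsto u|_{\partial\Omega}$ from smooth functions by continuity.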

Theorem: Trace Operator

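Let $\Omega$ be bounded with $C^1$ boundary, $1 \le p < \infty$. There exists a bounded linear

$$T : W^{1,p}(\Omega) \to L^p(\partial\Omega), \qquad Tu = u|_{\partial\Omega} \ \text{ for } u \in W^{1,p}(\Omega) \cap C(\bar\Omega).$$

Range: for $1 < p < \infty$, $T$ maps $W^{1,p}(\Omega)$ onto $W^{1-1/p,\,p}(\partial\Omega)$, a fractional Sobolev space. The kernel is $W_0^{1,p}(\Omega)$:

$$\ker T = W_0^{1,p}(\Omega).$$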
Trace as Loss of Regularity

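The "$1/p$" reflects a general principle: restricting to a codimension-$1$ surface costs $1/p$ derivatives. Heuristic via Fourier for $p = 2$: restriction to $\{x_n = 0\}$ corresponds to integrating over the normal frequency $\xi_n$, and for $s > 1/2$ one has $\int_{\mathbb{R}} (1 + |\xi'|^2 + \xi_n^2)^{-s} \,d\xi_n \simeq (1 + |\xi'|^2)^{-(s - 1/2)}$, hence the $1/2$ deficit.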

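[!question] Why is $p = 1$ different? The kernel $\ker T$ still equals $W_0^{1,1}(\Omega)$, but $1 - 1/p = 0$, and the trace operator is surjective onto $L^1(\partial\Omega)$: the fractional space "collapses". For $p = \infty$ traces land in $W^{1,\infty}(\partial\Omega)$ directly (it's just Lipschitz restriction).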

Sobolev Embedding Theorems

The heart of the theory. Set the Sobolev conjugate $p^* = \frac{np}{n-p}$ for $1 \le p < n$. Three regimes by $p$ vs. $n$.

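Theorem: Gagliardo–Nirenberg–Sobolev ($p < n$)

Let $\Omega$ be bounded with Lipschitz boundary, $1 \le p < n$. Then $W^{1,p}(\Omega) \hookrightarrow L^{p^*}(\Omega)$, i.e.

$$\|u\|_{L^{p^*}(\Omega)} \le C(n, p, \Omega)\, \|u\|_{W^{1,p}(\Omega)}.$$

On $\mathbb{R}^n$ for $u \in C_c^1(\mathbb{R}^n)$, the sharp form is

$$\|u\|_{L^{p^*}(\mathbb{R}^n)} \le C(n, p)\, \|\nabla u\|_{L^p(\mathbb{R}^n)}.$$

Sharp constant: Talenti–Aubin (1976). Extremizers exist only for $1 < p < n$ and are, up to constant multiples, translates and dilates of the Aubin–Talenti profile $U(x) = (1 + |x|^{p/(p-1)})^{-(n-p)/p}$.

General order: $W^{k,p}(\Omega) \hookrightarrow L^q(\Omega)$ for $\frac{1}{q} = \frac{1}{p} - \frac{k}{n}$ when $kp < n$.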

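Theorem: Morrey ($p > n$)

If $n < p \le \infty$, then

$$\|u\|_{C^{0,\gamma}(\mathbb{R}^n)} \le C(n, p)\, \|u\|_{W^{1,p}(\mathbb{R}^n)}, \qquad \gamma = 1 - \frac{n}{p}.$$

In particular $W^{1,p}(\Omega) \hookrightarrow C^{0,\,1-n/p}(\bar\Omega)$ for $p > n$ on Lipschitz domains (after modifying $u$ on a null set). This is the Morrey embedding, with explicit Hölder modulus.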

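The Borderline Case $p = n$

The naive guess fails for $n \ge 2$: $W^{1,n}(\Omega) \not\hookrightarrow L^\infty(\Omega)$. Counterexample (n=2): $u(x) = \log\log(1 + 1/|x|)$ on the unit ball, in $W^{1,2}$ but unbounded.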

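  • One has $W^{1,n}(\Omega) \hookrightarrow L^q(\Omega)$ for all $q < \infty$ but not $q = \infty$.
  • Trudinger–Moser inequality: there exists $\alpha_n > 0$ such that

$$\sup \Big\{ \int_\Omega \exp\big( \alpha\, |u|^{n/(n-1)} \big) \,dx \; : \; u \in W_0^{1,n}(\Omega),\ \|\nabla u\|_{L^n(\Omega)} \le 1 \Big\} \le C(n)\, |\Omega| \qquad \text{for all } \alpha \le \alpha_n.$$

Sharp constant $\alpha_n = n\, \omega_{n-1}^{1/(n-1)}$, where $\omega_{n-1}$ is the surface area of the unit sphere $S^{n-1}$ (Moser, 1971). This is the natural limit replacing $L^\infty$.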
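The $n = 2$ counterexample $u(x) = \log\log(1 + 1/|x|)$ can be checked numerically. In polar coordinates with the substitution $r = e^{-t}$, the Dirichlet energy over the annulus $e^{-T} < |x| < 1$ becomes $2\pi \int_0^T \big[(1 + e^{-t})\,\log(1 + e^{t})\big]^{-2}\,dt$, whose integrand decays like $t^{-2}$. The sketch below (function names and the step count `n` are illustrative assumptions) shows the energy levelling off while $u$ keeps growing as the inner radius shrinks.

```python
import math

def softplus(t):
    """log(1 + e^t) for t >= 0, written to avoid overflow for large t."""
    return t + math.log1p(math.exp(-t))

def u(t):
    """u = log log(1 + 1/r) evaluated at r = e^{-t}."""
    return math.log(softplus(t))

def energy(T, n=100000):
    """Dirichlet energy of u over e^{-T} < |x| < 1 in the plane (midpoint rule in t)."""
    h = T / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += 1.0 / ((1.0 + math.exp(-t)) ** 2 * softplus(t) ** 2)
    return 2.0 * math.pi * h * total

for T in (10.0, 100.0, 1000.0):
    print(T, energy(T), u(T))  # energy converges; u grows roughly like log T
```

The growth of $u$ is only doubly logarithmic in $1/r$, which is why the blow-up is invisible to the $W^{1,2}$ norm.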

Summary Table of Embeddings

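| Regime | Embedding | Borderline |
|---|---|---|
| $p < n$ | $W^{1,p} \hookrightarrow L^q$, $q \le p^*$ | $p^*$ is the best exponent |
| $p = n$ | $W^{1,n} \hookrightarrow L^q$, all $q < \infty$ | Trudinger–Moser exponential |
| $p > n$ | $W^{1,p} \hookrightarrow C^{0,\gamma}$, $\gamma = 1 - n/p$ | requires strict $p > n$ |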
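The three regimes reduce to exact arithmetic on $n$ and $p$. The helper below is a small sketch (the function name and return format are made up for illustration) covering only the first-order case on a bounded Lipschitz domain with $n \ge 2$ and finite $p$. The case $n = 1$ is excluded because there $W^{1,1}$ of an interval already embeds into continuous functions.

```python
from fractions import Fraction

def embedding_regime(n, p):
    """Target of W^{1,p}(Omega), Omega bounded Lipschitz in R^n, n >= 2, 1 <= p < infinity."""
    p = Fraction(p)
    if n < 2 or p < 1:
        raise ValueError("this sketch assumes n >= 2 and 1 <= p < infinity")
    if p < n:
        return ("L^q for q <= p*", n * p / (n - p))  # second entry: p* = np/(n-p)
    if p == n:
        return ("L^q for every finite q", None)      # Trudinger-Moser regime
    return ("C^{0,gamma}", 1 - n / p)                # second entry: gamma = 1 - n/p

print(embedding_regime(3, 2))  # subcritical: p* = 6
print(embedding_regime(2, 2))  # borderline p = n
print(embedding_regime(2, 4))  # supercritical: gamma = 1/2
```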

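Proof Sketch: GNS via the $p = 1$ Identity

The slickest route. For $u \in C_c^1(\mathbb{R}^n)$ and each $i = 1, \dots, n$:

$$|u(x)| \le \int_{-\infty}^{\infty} |\partial_i u(x_1, \dots, y_i, \dots, x_n)| \,dy_i.$$

Apply this in each direction, multiply the inequalities, take the $(n-1)$-th root, integrate, use generalized Hölder, get

$$\|u\|_{L^{n/(n-1)}(\mathbb{R}^n)} \le \prod_{i=1}^{n} \|\partial_i u\|_{L^1(\mathbb{R}^n)}^{1/n} \le \|\nabla u\|_{L^1(\mathbb{R}^n)}.$$

This is the $p = 1$ case. For general $1 < p < n$, apply the inequality to $|u|^\gamma$ with $\gamma = \frac{p(n-1)}{n-p}$ chosen so the exponent matches $p^*$. Standard, e.g., Evans Ch. 5.6.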
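The choice of exponent in the last step can be verified with exact rational arithmetic. Applying the $p = 1$ inequality to $|u|^\gamma$ puts the power $\gamma n/(n-1)$ of $|u|$ on the left, while Hölder's inequality applied to $\int |u|^{\gamma-1} |\nabla u|$ produces the power $(\gamma - 1)p/(p-1)$ on the right. The sketch below (helper names are illustrative) checks that $\gamma = p(n-1)/(n-p)$ makes both equal to $p^*$.

```python
from fractions import Fraction

def gns_gamma(n, p):
    """gamma = p(n-1)/(n-p), defined for 1 < p < n."""
    n, p = Fraction(n), Fraction(p)
    if not 1 < p < n:
        raise ValueError("requires 1 < p < n")
    return p * (n - 1) / (n - p)

def gns_exponents(n, p):
    """Return (left power, right power, p*) for the bootstrap step."""
    n, p = Fraction(n), Fraction(p)
    g = gns_gamma(n, p)
    left = g * n / (n - 1)         # power of |u| after applying the p = 1 case to |u|^gamma
    right = (g - 1) * p / (p - 1)  # power of |u| produced by Hoelder with exponents p, p/(p-1)
    p_star = n * p / (n - p)
    return left, right, p_star

print(gns_gamma(3, 2), gns_exponents(3, 2))
```

Equal powers on both sides are what allow the $|u|$-integral to be divided out, leaving $\|u\|_{L^{p^*}} \le \gamma\, \|\nabla u\|_{L^p}$.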

Compactness: Rellich–Kondrachov

Theorem: Rellich–Kondrachov

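Let $\Omega$ be bounded with Lipschitz boundary, $1 \le p < n$. Then the embedding

$$W^{1,p}(\Omega) \hookrightarrow L^q(\Omega) \quad \text{is compact for every } 1 \le q < p^*.$$

At the critical exponent $q = p^*$ the embedding is continuous but not compact. For $p = n$ the embedding is compact into every $L^q(\Omega)$, $q < \infty$; for $p > n$ it is compact into $C^{0,\beta}(\bar\Omega)$ for every $\beta < 1 - n/p$.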

Consequences for Variational Methods

Compactness is the workhorse for:

  • Existence of minimizers (extract weakly convergent subsequence → strongly convergent in $L^q$ for sub-critical $q < p^*$).
  • Discrete spectrum of self-adjoint elliptic operators (Laplacian on bounded domain with Dirichlet BC → compact resolvent → countable eigenvalues $0 < \lambda_1 \le \lambda_2 \le \cdots \to \infty$).
  • Concentration-compactness (Lions) handles loss at critical exponents.

[!note] Loss of compactness at the critical exponent Failure of compactness of $W^{1,p} \hookrightarrow L^{p^*}$ (with $p^* = np/(n-p)$) is the source of the classical critical-exponent phenomena: Yamabe problem, prescribed scalar curvature, Brezis–Nirenberg. Bubbles converge weakly (to zero) but not strongly.
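The bubbling mechanism is a scaling computation. For a fixed profile $u$ on $\mathbb{R}^n$ set $u_\lambda(x) = \lambda^{(n-p)/p}\, u(\lambda x)$. Then $\|\nabla u_\lambda\|_{L^p}$ and $\|u_\lambda\|_{L^q}$ are powers of $\lambda$ times the norms of $u$. The sketch below (function name assumed for illustration) computes those powers exactly: the gradient norm and the $L^{p^*}$ norm are scale-invariant, while every subcritical $L^q$ norm decays as $\lambda \to \infty$.

```python
from fractions import Fraction

def bubble_scaling_exponents(n, p, q):
    """Powers of lambda in ||grad u_lam||_p and ||u_lam||_q for u_lam(x) = lam^{(n-p)/p} u(lam x)."""
    n, p, q = Fraction(n), Fraction(p), Fraction(q)
    amplitude = (n - p) / p
    grad_exp = amplitude + 1 - n / p  # chain rule gives +1, change of variables gives -n/p
    lq_exp = amplitude - n / q        # change of variables gives -n/q
    return grad_exp, lq_exp

print(bubble_scaling_exponents(3, 2, 6))  # critical q = p* = 6: both exponents vanish
print(bubble_scaling_exponents(3, 2, 2))  # subcritical q = 2: the L^q norm decays like 1/lambda
```

A sequence that is bounded in $W^{1,p}$, keeps a fixed $L^{p^*}$ norm, and tends to zero in every subcritical $L^q$ cannot have a subsequence converging strongly in $L^{p^*}$.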

Poincaré-Type Inequalities

These convert control of $\nabla u$ into control of $u$, which is essential for coercivity of the Dirichlet form.

Poincaré Inequality (Zero-Trace Version)

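For $u \in W_0^{1,p}(\Omega)$, $\Omega$ bounded, $1 \le p < \infty$:

$$\|u\|_{L^p(\Omega)} \le C(\Omega, p)\, \|\nabla u\|_{L^p(\Omega)}.$$

Best constant for $p = 2$ is $C = \lambda_1^{-1/2}$, where $\lambda_1$ is the first Dirichlet eigenvalue of $-\Delta$ on $\Omega$.

Consequence: on $W_0^{1,p}(\Omega)$, the seminorm $\|\nabla u\|_{L^p(\Omega)}$ is an equivalent norm.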
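On $\Omega = (0,1)$ with $p = 2$ the best constant is explicit: $\lambda_1 = \pi^2$, attained by $\sin(\pi x)$, so $\|u\|_{L^2} \le \pi^{-1} \|u'\|_{L^2}$. The sketch below (helper name and the trial function $x(1-x)$ are illustrative choices) evaluates the Rayleigh quotient $\int |u'|^2 / \int |u|^2$ numerically. Any other zero-trace function must give a value at least $\pi^2$.

```python
import math

def rayleigh_quotient(u, du, n=20000):
    """int_0^1 |u'|^2 dx / int_0^1 |u|^2 dx by the midpoint rule."""
    h = 1.0 / n
    xs = [(i + 0.5) * h for i in range(n)]
    num = sum(du(x) ** 2 for x in xs)
    den = sum(u(x) ** 2 for x in xs)
    return num / den  # the common factor h cancels

first_mode = rayleigh_quotient(lambda x: math.sin(math.pi * x),
                               lambda x: math.pi * math.cos(math.pi * x))
parabola = rayleigh_quotient(lambda x: x * (1.0 - x),
                             lambda x: 1.0 - 2.0 * x)

print(first_mode, math.pi ** 2)  # the first Dirichlet eigenfunction attains lambda_1
print(parabola)                  # (1/3) / (1/30) = 10, slightly above pi^2
```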

Poincaré–Wirtinger (Zero-Mean Version)

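For $u \in W^{1,p}(\Omega)$, $\Omega$ bounded connected Lipschitz:

$$\|u - \bar{u}_\Omega\|_{L^p(\Omega)} \le C(\Omega, p)\, \|\nabla u\|_{L^p(\Omega)}, \qquad \bar{u}_\Omega = \frac{1}{|\Omega|} \int_\Omega u \,dx.$$

Best constant for $p = 2$: $C = \mu_1^{-1/2}$, where $\mu_1$ is the first nonzero Neumann eigenvalue of $-\Delta$ on $\Omega$ (the second eigenvalue of the Neumann Laplacian; the first is 0 with constant eigenfunction).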

[!tip] When you need a Poincaré Any time you want to use $\|\nabla u\|_{L^p}$ as a norm, i.e. whenever you need coercivity of the Dirichlet form. If you have Dirichlet BC, use the zero-trace version. If you're working on $H^1(\Omega)/\mathbb{R}$ (e.g., Neumann problems), use Poincaré–Wirtinger.

Fractional Sobolev Spaces

Two main definitions; for $p = 2$ they coincide, and both give $H^s$.

Slobodeckij Seminorm (Gagliardo)

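For $s \in (0,1)$, $1 \le p < \infty$:

$$[u]_{W^{s,p}(\Omega)}^p = \int_\Omega \int_\Omega \frac{|u(x) - u(y)|^p}{|x - y|^{n + sp}} \,dx\,dy, \qquad \|u\|_{W^{s,p}(\Omega)} = \Big( \|u\|_{L^p(\Omega)}^p + [u]_{W^{s,p}(\Omega)}^p \Big)^{1/p}.$$

For $s > 1$ non-integer, write $s = k + \sigma$ with $k \in \mathbb{N}$, $\sigma \in (0,1)$, and require $u \in W^{k,p}(\Omega)$ with $D^\alpha u \in W^{\sigma,p}(\Omega)$ for $|\alpha| = k$.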
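In one dimension the double integral is easy to approximate directly. The sketch below (function name, grid size, and the test function $u(x) = x$ on $(0,1)$ are illustrative assumptions) uses a midpoint grid and skips the diagonal cells, where the kernel is singular. For $u(x) = x$ the integrand is $|x - y|^{a}$ with $a = p(1-s) - 1$, so the exact value is $2 / ((a+1)(a+2))$.

```python
def gagliardo_seminorm_p(u, s, p, n=400):
    """Approximate [u]^p_{W^{s,p}(0,1)} on an n-by-n midpoint grid, skipping the diagonal."""
    h = 1.0 / n
    xs = [(i + 0.5) * h for i in range(n)]
    vals = [u(x) for x in xs]
    total = 0.0
    for i in range(n):
        for j in range(i):  # use symmetry in (x, y); the diagonal i == j is omitted
            total += abs(vals[i] - vals[j]) ** p / abs(xs[i] - xs[j]) ** (1 + s * p)
    return 2.0 * total * h * h

print(gagliardo_seminorm_p(lambda x: x, 0.5, 2))   # exact value 1
print(gagliardo_seminorm_p(lambda x: x, 0.25, 2))  # exact value 2 / (1.5 * 2.5)
```

Skipping the diagonal is harmless here because the integrand is bounded for this $u$. For rougher functions the near-diagonal cells carry most of the seminorm and a finer or adaptive quadrature would be needed.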

Bessel Potential Spaces

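Via Fourier transform on $\mathbb{R}^n$:

$$H^{s,p}(\mathbb{R}^n) = \Big\{ u \in \mathcal{S}'(\mathbb{R}^n) : \mathcal{F}^{-1}\big[ (1 + |\xi|^2)^{s/2}\, \hat{u} \big] \in L^p(\mathbb{R}^n) \Big\},$$

i.e. $H^{s,p}(\mathbb{R}^n) = (I - \Delta)^{-s/2} L^p(\mathbb{R}^n)$.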
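For $p = 2$ the Bessel norm is a weighted $\ell^2$ sum of Fourier data, which makes membership tests concrete. The sketch below uses the periodic analogue $\sum_k (1 + k^2)^s |c_k|^2$ on the circle (an assumption made for simplicity; the text works on $\mathbb{R}^n$) with coefficients $|c_k| = 1/k$, the decay rate of a function with a jump. The partial sums stabilise for $s < 1/2$ and diverge for $s \ge 1/2$.

```python
def hs_partial_sum(s, K):
    """sum_{k=1}^{K} (1 + k^2)^s |c_k|^2 with |c_k| = 1/k (jump-type decay, constants dropped)."""
    return sum((1.0 + k * k) ** s / (k * k) for k in range(1, K + 1))

for s in (0.25, 0.75):
    low, high = hs_partial_sum(s, 10_000), hs_partial_sum(s, 100_000)
    print(s, low, high)  # nearly equal for s = 0.25, far apart for s = 0.75
```

This matches the earlier remarks on jumps: a jump discontinuity is compatible with $H^s$ only below the threshold $s = 1/2$.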

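| Property | $W^{s,p}$ (Slobodeckij) | $H^{s,p}$ (Bessel) |
|---|---|---|
| Defined via | Difference quotients | Fourier multipliers |
| Equals $W^{k,p}$ for $s = k \in \mathbb{N}$ | yes (by convention) | yes, for $1 < p < \infty$ |
| Equals the other scale for $s \notin \mathbb{N}$ | only if $p = 2$ | only if $p = 2$ |
| Interpolation behavior | Real interp. (Besov-like) | Complex interp. |

In fact, for non-integer $s$, $W^{s,p}$ (Slobodeckij) coincides with the Besov space $B^s_{p,p}$, not with $H^{s,p} = F^s_{p,2}$; the latter is a Triebel–Lizorkin space. The mismatch for $p \ne 2$, $s \notin \mathbb{N}$ is fundamental.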

Trace Spaces Revisited

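The trace operator $T : W^{1,p}(\Omega) \to W^{1-1/p,\,p}(\partial\Omega)$ uses precisely the Slobodeckij version because $\partial\Omega$ is a manifold without group structure where Fourier is awkward. The image is genuinely of Besov type (not a Bessel potential space) unless $p = 2$, where everything coincides nicely and the image is $H^{1/2}(\partial\Omega)$.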

Dual Spaces and Negative-Order Sobolev Spaces

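$H^{-1}(\Omega)$: Dual of $H_0^1(\Omega)$

$$H^{-1}(\Omega) := \big( H_0^1(\Omega) \big)^*.$$

Characterization: $f \in H^{-1}(\Omega)$ iff $f = f_0 + \sum_{i=1}^n \partial_i f_i$ for some $f_0, \dots, f_n \in L^2(\Omega)$ (in distributional sense). Norm: infimum of $\big( \sum_{i=0}^n \|f_i\|_{L^2(\Omega)}^2 \big)^{1/2}$ over such decompositions.

This is the natural target space for the Laplacian:

$$-\Delta : H_0^1(\Omega) \to H^{-1}(\Omega) \quad \text{is an isomorphism for bounded } \Omega.$$

(Lax–Milgram applied to the bilinear form $a(u, v) = \int_\Omega \nabla u \cdot \nabla v \,dx$.)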
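A discrete analogue shows the isomorphism at work. The sketch below solves $-u'' = f$ on $(0,1)$ with zero boundary values by second-order finite differences, so the bilinear form becomes a symmetric positive definite tridiagonal matrix solved by the Thomas algorithm. The grid size, the right-hand side $f(x) = \pi^2 \sin(\pi x)$, and the function name are illustrative assumptions; this is not a substitute for the weak formulation itself.

```python
import math

def solve_dirichlet_1d(f, n):
    """Solve -u'' = f on (0,1), u(0) = u(1) = 0, at n interior grid points."""
    h = 1.0 / (n + 1)
    off = -1.0                                      # off-diagonal entry of tridiag(-1, 2, -1)
    diag = [2.0] * n
    rhs = [h * h * f((i + 1) * h) for i in range(n)]
    for i in range(1, n):                           # forward elimination
        m = off / diag[i - 1]
        diag[i] -= m * off
        rhs[i] -= m * rhs[i - 1]
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):                  # back substitution
        u[i] = (rhs[i] - off * u[i + 1]) / diag[i]
    return u

n = 99
u_h = solve_dirichlet_1d(lambda x: math.pi ** 2 * math.sin(math.pi * x), n)
max_err = max(abs(u_h[i] - math.sin(math.pi * (i + 1) / (n + 1))) for i in range(n))
print(max_err)  # small: the exact solution is sin(pi x)
```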

Negative Fractional Spaces

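For $s > 0$ and $1 < p < \infty$, $W^{-s,p'}(\Omega) := \big( W_0^{s,p}(\Omega) \big)^*$ where $\frac{1}{p} + \frac{1}{p'} = 1$. These appear when applying differential operators of order $m > s$ to $W^{s,p}$ functions, the result lying in negative-order spaces.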

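Bounded Variation and the $p = 1$ Subtleties

$BV(\Omega) = \{ u \in L^1(\Omega) : Du \in \mathcal{M}(\Omega; \mathbb{R}^n) \}$ where $\mathcal{M}$ is the space of finite Radon measures. $W^{1,1}(\Omega) \subsetneq BV(\Omega)$: $BV$ allows the gradient to be a measure, e.g., $\mathbf{1}_E$ for a set $E$ of finite perimeter has $D\mathbf{1}_E = -\nu_E \, \mathcal{H}^{n-1}|_{\partial^* E}$ (the reduced boundary times the outer normal).

$BV$ replaces $W^{1,1}$ in image processing (Rudin–Osher–Fatemi total variation denoising), free-boundary problems, and minimal surface theory. It is not reflexive, only weak-$*$ compact (Banach–Alaoglu on the dual of a separable space).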

Connections, Comparisons, Applications

Sobolev vs. Hölder vs. Lipschitz Spaces

| Space | Norm strength | When useful |
|---|---|---|
| $W^{k,p}$ | $L^p$ of derivatives | Variational PDE, weak solutions |
| $C^{k,\alpha}$ | Hölder modulus | Regularity theory, Schauder estimates |
| $W^{k,\infty}$ | Bounded $k$th derivative | Lipschitz / semiconcave regularity |
| $BV$ | Measure-valued gradient | Free discontinuities, $L^1$-coercivity |
| $H^{s,p}$ | Bessel/Fourier | Pseudodifferential, harmonic analysis |

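Why $p = 2$ Is Privileged

  • Only $p$ giving Hilbert structure → Riesz representation, Lax–Milgram, spectral theory.
  • Fourier transform is unitary on $L^2$ → exact characterization of $H^s$ via $(1 + |\xi|^2)^{s/2}\, \hat{u} \in L^2$.
  • Coincidence of all definitions: $W^{s,2} = H^{s,2} = B^s_{2,2} = F^s_{2,2}$.
  • Self-dual scale: $\big( H^s(\mathbb{R}^n) \big)^* = H^{-s}(\mathbb{R}^n)$.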

Connection to Statistical Learning / NN Approximation

  • Barron spaces and RKHSs sit inside Sobolev spaces — used to bound NN approximation rates.
  • Curse of dimensionality manifests through Sobolev embeddings: to approximate functions in $W^{s,p}([0,1]^d)$ to accuracy $\varepsilon$ requires $N \sim \varepsilon^{-d/s}$ parameters, giving the classical $N^{-s/d}$ rate.
  • Neural ODEs / flow-based models implicitly minimize Sobolev-type norms on the velocity field (e.g., FFJORD).
  • For Stein discrepancies / score matching, Sobolev regularity of the data density controls convergence.
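The curse-of-dimensionality bullet is a one-line computation once constants are ignored. The sketch below assumes the heuristic scaling $N \approx \varepsilon^{-d/s}$ for functions of smoothness $s$ in dimension $d$ (the function name is hypothetical and all constants are dropped), and shows how quickly the count grows with $d$ at fixed smoothness.

```python
import math

def parameters_needed(eps, s, d):
    """Heuristic parameter count N ~ eps^(-d/s), constants dropped."""
    return eps ** (-d / s)

for d in (2, 10, 20):
    print(d, parameters_needed(0.1, 2, d))  # grows by a factor of 10 for every 2 extra dimensions
```

Raising the smoothness $s$ in proportion to $d$ keeps the exponent fixed, which is the usual sense in which regularity offsets dimension.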

Connection to Optimal Transport & Functional Inequalities

  • The log-Sobolev inequality $\operatorname{Ent}_\mu(f^2) \le 2C \int |\nabla f|^2 \,d\mu$ controls hypercontractivity and exponential ergodicity of diffusions.
  • Bakry–Émery $\Gamma$-calculus uses Sobolev-type spaces on metric measure spaces.
  • Sobolev-type inequalities ↔ isoperimetric inequalities (Federer–Fleming), via $p = 1$ and BV.
  • The Otto–Villani theorem links log-Sobolev → Talagrand → Wasserstein concentration.

[!note] Tangent to Angelo's work Sobolev structure underlies most rigorous analysis of continuous-time/space multi-agent systems and continuous strategy spaces. Mechanism-design impossibility results (Myerson–Satterthwaite, Gibbard–Satterthwaite) live in discrete/measurable settings, but their continuous analogues (e.g. matching with continuous types, Hotelling-style games) often require Sobolev-regular type distributions, and welfare functionals' minimizers naturally live in $H^1$-type spaces over the strategy simplex. The Cooperation Gap framework's "approximate mechanism" notion could be re-cast as a closeness condition in a Sobolev-type seminorm over the policy space.

Canonical References

  • Evans, Partial Differential Equations, Ch. 5: the standard graduate intro.
  • Brezis, Functional Analysis, Sobolev Spaces, and PDEs: clean, modern.
  • Adams & Fournier, Sobolev Spaces: encyclopedic.
  • Maz'ya, Sobolev Spaces with Applications to Elliptic PDE: deep and technical, capacity theory.
  • Di Nezza–Palatucci–Valdinoci, Hitchhiker's guide to the fractional Sobolev spaces (2012): best intro to $W^{s,p}$.
  • Triebel, Theory of Function Spaces I–III: for Besov / Triebel–Lizorkin scales.