Sobolev Spaces

Motivation & Setup

PDE theory and the calculus of variations require function spaces in which (i) differentiation makes sense for non-smooth functions, (ii) the space is complete under an $L^{p}$ -flavored norm, and (iii) one can embed into $L^{q}$ or Hölder spaces. The classical $C^{k}$ spaces fail (ii); pure $L^{p}$ fails (i). Sobolev spaces $W^{k, p}$ are the fix: they replace pointwise differentiation with the weak derivative and complete $C^{k}$ under the natural $L^{p}$ -Sobolev norm.

Throughout: $Ω \subset R^{n}$ open, $1 \leq p \leq \infty$ , $k \in N_{0}$ . Use multi-index $α = (α_{1}, \dots, α_{n})$ , $∣ α ∣ = \sum α_{i}$ , $D^{α} = \partial_{1}^{α_{1}} \dots \partial_{n}^{α_{n}}$ .

Why Classical Spaces Fail

$C^{k} (\overline{Ω})$ with $∥ \cdot ∥_{C^{k}}$ is complete but ignores $L^{p}$ structure; minimizing sequences of integral energies need not converge in $C^{k}$ .
$C^{k} (\overline{Ω})$ with the $L^{p}$ -Sobolev norm is not complete; for $p < \infty$ and sufficiently regular $\partial Ω$ its completion is precisely $W^{k, p}$ .
The Dirichlet energy $E (u) = \frac{1}{2} \int_{Ω} ∣\nabla u ∣^{2}$ is minimized naturally over $H^{1}$ , not over $C^{1}$ .

[!note] The Direct Method The Direct Method of the Calculus of Variations needs: lower semicontinuity + coercivity + a reflexive space to extract weakly convergent minimizing subsequences. Sobolev spaces are the canonical reflexive choice (when $1 < p < \infty$ ).

Weak Derivatives

Definition: Weak Derivative

Given $u \in L_{loc}^{1} (Ω)$ and multi-index $α$ , a function $v \in L_{loc}^{1} (Ω)$ is the $α$ -th weak derivative of $u$ if

\int_{Ω} u D^{α} φ d x = (- 1)^{∣ α ∣} \int_{Ω} v φ d x \forall φ \in C_{c}^{\infty} (Ω) .

Notation: $v = D^{α} u$ . The definition is the distributional derivative restricted to those distributions representable by an $L_{loc}^{1}$ function.

Uniqueness: a.e. uniqueness follows from the fundamental lemma of the calculus of variations (du Bois-Reymond).

[!tip] Recognizing weak derivatives If $u \in C^{k}$ , weak and classical derivatives coincide. If $u$ has a jump, the weak derivative does not exist as a function: the distributional derivative picks up a Dirac mass. Example: $u = 1_{[0, \infty)}$ has $u^{'} = δ_{0} \notin L_{loc}^{1}$ .
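As a numerical sanity check of the definition, the sketch below tests the integration-by-parts identity for $u (x) = ∣ x ∣$ with candidate weak derivative $v (x) = sign (x)$ on $Ω = (- 1, 1)$ . The particular test function (a shifted, rescaled bump), the midpoint quadrature, and all function names are illustrative assumptions, not part of the definition.

```python
import numpy as np

def bump(t):
    """exp(-1/(1-t^2)) for |t| < 1, zero otherwise (smooth, compactly supported)."""
    out = np.zeros_like(t)
    inside = np.abs(t) < 1
    out[inside] = np.exp(-1.0 / (1.0 - t[inside] ** 2))
    return out

def bump_prime(t):
    """Derivative of bump with respect to t."""
    out = np.zeros_like(t)
    inside = np.abs(t) < 1
    ti = t[inside]
    out[inside] = np.exp(-1.0 / (1.0 - ti ** 2)) * (-2.0 * ti / (1.0 - ti ** 2) ** 2)
    return out

# Hypothetical test function phi(x) = bump((x - c) / r), supported in (-0.5, 0.9).
c, r = 0.2, 0.7
N = 20000                       # even, so no midpoint cell straddles the kink at 0
h = 2.0 / N
x = -1.0 + (np.arange(N) + 0.5) * h

phi = bump((x - c) / r)
dphi = bump_prime((x - c) / r) / r

lhs = h * np.sum(np.abs(x) * dphi)      # integral of u * phi'
rhs = -h * np.sum(np.sign(x) * phi)     # (-1)^1 * integral of v * phi
print(f"lhs = {lhs:.8f}, rhs = {rhs:.8f}, gap = {abs(lhs - rhs):.2e}")
```

A single test function proves nothing, of course; the identity has to hold for every $φ \in C_{c}^{\infty} (Ω)$ . The check is only meant to make the sign convention $(- 1)^{∣ α ∣}$ concrete.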

Calculus Rules for Weak Derivatives

The following hold in $W^{k, p}$ provided the right-hand side makes sense:

Linearity: $D^{α} (a u + b v) = a D^{α} u + b D^{α} v$ .
Product rule (Leibniz): if $u \in W^{1, p}$ and $η \in C_{c}^{\infty}$ , then $η u \in W^{1, p}$ and $\partial_{i} (η u) = η \partial_{i} u + u \partial_{i} η$ .
Chain rule (Stampacchia): if $F \in C^{1} (R)$ , $F^{'} \in L^{\infty}$ , $F (0) = 0$ , and $u \in W^{1, p}$ , then $F (u) \in W^{1, p}$ with $\partial_{i} F (u) = F^{'} (u) \partial_{i} u$ .
Truncation: $u^{+} = \max (u, 0) \in W^{1, p}$ with $\nabla u^{+} = \nabla u \cdot 1_{\{u > 0\}}$ (almost everywhere).

The truncation rule underlies the maximum principle for weak solutions.

The Spaces $W^{k, p}$ and $H^{k}$

Definition: Sobolev Space $W^{k, p} (Ω)$

W^{k, p} (Ω) = \{u \in L^{p} (Ω) : D^{α} u \in L^{p} (Ω) \text{ for all } ∣ α ∣ \leq k\} .

Norm:

∥ u ∥_{W^{k, p}} = \Big(\sum_{∣ α ∣ \leq k} ∥ D^{α} u ∥_{L^{p}}^{p}\Big)^{1/ p} \quad (1 \leq p < \infty), \qquad ∥ u ∥_{W^{k, \infty}} = \max_{∣ α ∣ \leq k} ∥ D^{α} u ∥_{L^{\infty}} .

Equivalent norms abound — e.g., $∥ u ∥_{L^{p}} + \sum_{∣ α ∣ = k} ∥ D^{α} u ∥_{L^{p}}$ .
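To make the norm concrete, here is a small sketch that evaluates $∥ u ∥_{W^{1, p}}$ for $u (x) = ∣ x ∣$ on $Ω = (- 1, 1)$ , where the weak derivative is $sign (x)$ and the exact value is $(2/ (p + 1) + 2)^{1/ p}$ . The helper name `sobolev_norm_1d` and the midpoint quadrature are assumptions made for illustration.

```python
import numpy as np

def sobolev_norm_1d(u, du, a, b, p, N=100_000):
    """Midpoint-rule approximation of the W^{1,p}(a,b) norm, given u and its weak derivative du."""
    h = (b - a) / N
    x = a + (np.arange(N) + 0.5) * h
    lp_u = h * np.sum(np.abs(u(x)) ** p)      # approximates the integral of |u|^p
    lp_du = h * np.sum(np.abs(du(x)) ** p)    # approximates the integral of |u'|^p
    return (lp_u + lp_du) ** (1.0 / p)

p = 3
approx = sobolev_norm_1d(np.abs, np.sign, -1.0, 1.0, p)
exact = (2.0 / (p + 1) + 2.0) ** (1.0 / p)
print(f"approx = {approx:.8f}, exact = {exact:.8f}")
```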

Definition: $H^{k}$ — the Hilbert Case

For $p = 2$ write $H^{k} (Ω) := W^{k, 2} (Ω)$ . Inner product:

⟨ u, v ⟩_{H^{k}} = \sum_{∣ α ∣ \leq k} \int_{Ω} D^{α} u \, D^{α} v \, d x .

This is the only $p$ giving a Hilbert space — it's the workhorse for linear elliptic PDE and spectral theory.

Definition: $W_{0}^{k, p} (Ω)$ — Zero Boundary Trace

W_{0}^{k, p} (Ω) := \overline{C_{c}^{\infty} (Ω)}^{∥ \cdot ∥_{W^{k, p}}} .

Heuristically: functions with "zero boundary values up to order $k - 1$ ". On all of $R^{n}$ , $W_{0}^{k, p} (R^{n}) = W^{k, p} (R^{n})$ . On bounded $Ω$ with reasonable boundary, the inclusion $W_{0}^{k, p} ⊊ W^{k, p}$ is strict (constants live in the latter, not the former).

Banach / Hilbert / Reflexive / Separable Properties

Property	$W^{k, p}$ , $1 \leq p < \infty$	$W^{k, \infty}$	$H^{k}$
Banach	✓	✓	✓
Hilbert	only $p = 2$	✗	✓
Separable	✓	✗	✓
Reflexive	✓ iff $1 < p < \infty$	✗	✓
Uniformly convex	$1 < p < \infty$	✗	✓

Completeness proof sketch: a Cauchy sequence $\{u_{m}\} \subset W^{k, p}$ has $\{D^{α} u_{m}\}$ Cauchy in $L^{p}$ for each $∣ α ∣ \leq k$ ; pass to $L^{p}$ -limits $u$ and $u_{α}$ , then verify $u_{α} = D^{α} u$ by passing to the limit in the integration-by-parts identity. The limit passes through both integrals by Hölder's inequality, since $φ$ and $D^{α} φ$ are bounded with compact support and hence lie in $L^{p^{'}}$ .

Approximation by Smooth Functions

Mollifiers

$η \in C_{c}^{\infty} (R^{n})$ , $η \geq 0$ , $\int η = 1$ , $\operatorname{supp} η \subset B_{1} (0)$ . Set $η_{ε} (x) = ε^{- n} η (x / ε)$ . Define $u_{ε} = η_{ε} * u$ on $Ω_{ε} = \{x \in Ω : \operatorname{dist} (x, \partial Ω) > ε\}$ . Then $u_{ε} \in C^{\infty} (Ω_{ε})$ and $D^{α} u_{ε} = η_{ε} * D^{α} u$ on $Ω_{ε}$ (the weak derivative commutes with convolution).

Local Approximation

If $u \in W_{loc}^{k, p} (Ω)$ , $1 \leq p < \infty$ , then $u_{ε} \to u$ in $W_{loc}^{k, p} (Ω)$ . The proof is Friedrichs's classical mollification argument.
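The convergence can be watched numerically. The sketch below mollifies $u (x) = ∣ x - 1/2 ∣$ on a uniform grid of $(0, 1)$ and measures the $L^{2}$ errors of $u_{ε}$ and of its derivative on the interior window $(0.2, 0.8)$ , which stays inside $Ω_{ε}$ for the values of $ε$ used. The grid, the window, and the names `bump` and `mollify` are assumptions chosen for illustration; the discrete kernel only approximates $η_{ε}$ .

```python
import numpy as np

def bump(t):
    """exp(-1/(1-t^2)) for |t| < 1, zero otherwise."""
    out = np.zeros_like(t)
    inside = np.abs(t) < 1
    out[inside] = np.exp(-1.0 / (1.0 - t[inside] ** 2))
    return out

def mollify(u, h, eps):
    """Discrete analogue of eta_eps * u on a uniform grid with spacing h."""
    m = int(np.floor(eps / h))
    t = np.arange(-m, m + 1) * h
    kernel = bump(t / eps)
    kernel /= kernel.sum()               # discrete weights sum to 1
    return np.convolve(u, kernel, mode="same")

h = 1e-3
x = np.arange(0.0, 1.0 + h / 2, h)
u = np.abs(x - 0.5)
du = np.sign(x - 0.5)                    # weak derivative of u
interior = (x > 0.2) & (x < 0.8)         # stays away from the zero-padded ends

errors = []
for eps in (0.1, 0.05, 0.025):
    u_eps = mollify(u, h, eps)
    du_eps = np.gradient(u_eps, h)
    err_u = np.sqrt(h * np.sum((u_eps - u)[interior] ** 2))
    err_du = np.sqrt(h * np.sum((du_eps - du)[interior] ** 2))
    errors.append((eps, err_u, err_du))
    print(f"eps = {eps:.3f}  L2 error of u: {err_u:.2e}  of u': {err_du:.2e}")
```

Both error columns should shrink as $ε$ decreases; the derivative error shrinks more slowly because $u^{'}$ has a jump at $x = 1/2$ .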

Theorem: Meyers–Serrin " $H = W$ "

For every open $Ω \subset R^{n}$ and $1 \leq p < \infty$ , $C^{\infty} (Ω) \cap W^{k, p} (Ω)$ is dense in $W^{k, p} (Ω)$ .

The historical name: $H^{k, p}$ used to denote the completion of $C^{\infty} \cap W^{k, p}$ under $∥ \cdot ∥_{W^{k, p}}$ . Meyers–Serrin (1964) proved $H^{k, p} = W^{k, p}$ for arbitrary open sets, with no boundary regularity assumed. The proof partitions $Ω$ into shells and mollifies on each, gluing with a partition of unity.

[!note] What can fail at the boundary Density of $C^{\infty} (\overline{Ω})$ (note the closure) requires $Ω$ to have, e.g., the segment property (a weak boundary regularity). Without it, $C^{\infty} (\overline{Ω})$ may fail to be dense in $W^{k, p} (Ω)$ (a disk with a slit removed is the standard example), though the Meyers–Serrin density of $C^{\infty} (Ω) \cap W^{k, p} (Ω)$ is unaffected.

Extension Theorems

Extension Operator

A bounded linear $E : W^{k, p} (Ω) \to W^{k, p} (R^{n})$ with $E u ∣_{Ω} = u$ . Such an operator lets you transfer Fourier-analytic and global- $R^{n}$ results back to $Ω$ .

Existence: $E$ exists when $\partial Ω$ is sufficiently regular — Lipschitz suffices for $k = 1$ ; $C^{k}$ suffices in general. Constructions: reflection across the boundary (half-space), then partition of unity + local charts.

[!note] Calderón–Stein extension For bounded Lipschitz $Ω$ , Stein (1970) constructed a universal extension operator $E$ that works simultaneously for all $W^{k, p}$ , $k \in N_{0}$ , $1 \leq p \leq \infty$ . Without boundary regularity, extension can fail.

Traces

The trace problem: $u \in W^{1, p} (Ω)$ is only defined a.e., so the boundary $\partial Ω$ (measure-zero set) sees no canonical value. Trace theorems make this work via continuity.

Theorem: Trace Operator

Let $Ω$ be bounded with $C^{1}$ boundary, $1 \leq p < \infty$ . There exists a bounded linear operator

T : W^{1, p} (Ω) ⟶ L^{p} (\partial Ω), \qquad T u = u ∣_{\partial Ω} \text{ if } u \in W^{1, p} (Ω) \cap C (\overline{Ω}) .

Range: for $1 < p < \infty$ , $T$ maps $W^{1, p} (Ω)$ onto $W^{1 - 1/ p, p} (\partial Ω)$ , a fractional Sobolev space. The kernel is $W_{0}^{1, p} (Ω)$ :

W_{0}^{1, p} (Ω) = \ker T = \{u \in W^{1, p} (Ω) : T u = 0\} .

Trace as Loss of $1/ p$ Regularity

The " $1 - 1/ p$ " reflects a general principle: restricting to a codimension- $d$ surface costs $d / p$ derivatives. Heuristic via Fourier: restriction in $R^{n}$ corresponds to summing over the normal frequency, which behaves like a Riemann sum against $∣ ξ ∣^{- d}$ , hence the $d / p$ deficit.

[!question] What happens at the endpoints $p = 1$ and $p = \infty$ ? For $p = 1$ , $W_{0}^{1, 1} (Ω)$ still equals $\ker T$ , but the formal target $W^{1 - 1, 1} = L^{1}$ is no longer fractional: the trace operator $W^{1, 1} (Ω) \to L^{1} (\partial Ω)$ is surjective onto $L^{1} (\partial Ω)$ (Gagliardo), so the fractional space "collapses". For $p = \infty$ traces land in $W^{1, \infty} (\partial Ω)$ directly (it's just Lipschitz restriction).

Sobolev Embedding Theorems

The heart of the theory. Set the Sobolev conjugate $p^{*} := \frac{n p}{n - p}$ for $1 \leq p < n$ . Three regimes by $k p$ vs. $n$ .

Theorem: Gagliardo–Nirenberg–Sobolev ( $k p < n$ )

Let $Ω$ be bounded with Lipschitz boundary, $1 \leq p < n$ . Then $W^{1, p} (Ω) ↪ L^{p^{*}} (Ω)$ , i.e.

∥ u ∥_{L^{p^{*}} (Ω)} \leq C ∥ u ∥_{W^{1, p} (Ω)} .

On $R^{n}$ for $u \in C_{c}^{\infty} (R^{n})$ , the sharp form is

∥ u ∥_{L^{p^{*}} (R^{n})} \leq C (n, p) ∥\nabla u ∥_{L^{p} (R^{n})} .

Sharp constant: Talenti–Aubin (1976). Extremizers exist only for $1 < p < n$ and are translates/dilates of the Aubin–Talenti profile $U (x) = (1 + ∣ x ∣^{p / (p - 1)})^{- (n - p) / p}$ (mod scaling).

General order: $W^{k, p} (Ω) ↪ L^{q} (Ω)$ for $\frac{1}{q} = \frac{1}{p} - \frac{k}{n}$ when $k p < n$ .

Theorem: Morrey ( $k p > n$ )

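If $k p > n$ and $k - n / p$ is not an integer, then

W^{k, p} (Ω) ↪ C^{m, α} (\overline{Ω}), \qquad m + α = k - n / p, \quad m \in N_{0}, \quad α \in (0, 1) .

If $k - n / p$ is an integer and $p < \infty$ , the embedding holds into $C^{k - n / p - 1, α} (\overline{Ω})$ for every $α \in (0, 1)$ , but in general not for $α = 1$ .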

In particular $W^{1, p} ↪ C^{0, 1 - n / p}$ for $p > n$ on Lipschitz domains. This is the Morrey embedding, with explicit Hölder modulus.

The Borderline Case $k p = n$

The naive guess $L^{\infty}$ fails for $n \geq 2$ : $W^{1, n} \not\hookrightarrow L^{\infty}$ . Counterexample ( $n = 2$ ): $u (x) = \log \log (1 + 1/∣ x ∣)$ on the unit ball, which is in $W^{1, 2}$ but unbounded.

One has $W^{1, n} ↪ L^{q}$ for all $q \in [n, \infty)$ but not $q = \infty$ .
Trudinger–Moser inequality: for bounded $Ω$ there exists $α_{n} > 0$ such that

\sup_{u \in W_{0}^{1, n} (Ω), \; ∥\nabla u ∥_{L^{n}} \leq 1} \int_{Ω} \exp (α_{n} ∣ u ∣^{n / (n - 1)}) \, d x < \infty .

Sharp constant $α_{n} = n ω_{n - 1}^{1/ (n - 1)}$ (Moser, 1971), where $ω_{n - 1}$ is the surface area of the unit sphere $S^{n - 1}$ . This is the natural limit replacing $L^{\infty}$ .

Summary Table of Embeddings

Regime	Embedding	Borderline
$k p < n$	$W^{k, p} ↪ L^{q}$ , $\frac{1}{q} = \frac{1}{p} - \frac{k}{n}$	$q = p^{*} = n p / (n - p)$ best
$k p = n$	$W^{k, p} ↪ L^{q}$ all $q < \infty$	Trudinger–Moser exponential
$k p > n$	$W^{k, p} ↪ C^{m, α}$ , $m + α = k - n / p$	if $k - n / p \in N$ and $p < \infty$ : every $α < 1$ with $m = k - n / p - 1$ , not $α = 1$
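The three regimes are easy to mechanize. The following sketch returns the target space of $W^{k, p}$ for finite $p$ on a bounded Lipschitz domain in $R^{n}$ , using exact rational arithmetic so that the borderline $k p = n$ is detected reliably. The function name `sobolev_target` and its return format are hypothetical; in the case where $k - n / p$ is an integer it uses the standard convention of losing one derivative and allowing every $α < 1$ .

```python
from fractions import Fraction
import math

def sobolev_target(k, p, n):
    """Classify the embedding of W^{k,p} (finite p >= 1) in dimension n.

    Returns (label, data); data is q, (m, alpha), (m, None) or None.
    """
    p = Fraction(p)
    if k * p < n:
        q = 1 / (1 / p - Fraction(k, n))       # 1/q = 1/p - k/n
        return ("L^q", q)
    if k * p == n:
        return ("L^q for every finite q", None)
    s = k - Fraction(n) / p                     # s = k - n/p > 0
    m = math.floor(s)
    alpha = s - m
    if alpha == 0:
        # Integer smoothness: lose one derivative, any Hoelder exponent below 1.
        return ("C^{m,alpha} for every alpha < 1", (m - 1, None))
    return ("C^{m,alpha}", (m, alpha))

print(sobolev_target(1, 2, 3))   # H^1 in three dimensions
print(sobolev_target(1, 2, 2))   # borderline case
print(sobolev_target(1, 4, 2))   # Morrey regime
print(sobolev_target(2, 2, 2))   # integer value of k - n/p
```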

Proof Sketch — GNS via the $p = 1$ Identity

The slickest route. For $u \in C_{c}^{\infty} (R^{n})$ :

∣ u (x) ∣ \leq \int_{- \infty}^{x_{i}} ∣ \partial_{i} u (\dots, t, \dots) ∣ d t .

Apply this in each direction, multiply the $n$ inequalities, take the $(n - 1)$ -th root, integrate, use generalized Hölder, get

∥ u ∥_{L^{n / (n - 1)}} \leq \prod_{i = 1}^{n} ∥ \partial_{i} u ∥_{L^{1}}^{1/ n} \leq ∥\nabla u ∥_{L^{1}} .

This is the $p = 1$ case. For general $1 < p < n$ , apply the $p = 1$ inequality to $v = ∣ u ∣^{γ}$ with $γ$ chosen so the $L^{n / (n - 1)}$ exponent matches $p^{*}$ . Standard, e.g., Evans Ch. 5.6.
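For reference, the bookkeeping behind the choice of $γ$ can be written out as follows. This is a sketch of the standard argument (as in Evans, Ch. 5.6), assuming $1 < p < n$ and that the $p = 1$ inequality has been extended by density to the compactly supported $C^{1}$ function $∣ u ∣^{γ}$ , $γ > 1$ .

```latex
\begin{aligned}
\Big(\int |u|^{\gamma \frac{n}{n-1}}\,dx\Big)^{\frac{n-1}{n}}
  &\le \int \big|\nabla |u|^{\gamma}\big|\,dx
   = \gamma \int |u|^{\gamma-1}\,|\nabla u|\,dx \\
  &\le \gamma \Big(\int |u|^{(\gamma-1)\frac{p}{p-1}}\,dx\Big)^{\frac{p-1}{p}}
       \|\nabla u\|_{L^{p}}
   \qquad \text{(H\"older)} \\[4pt]
\gamma\,\tfrac{n}{n-1} = (\gamma-1)\,\tfrac{p}{p-1}
  &\iff \gamma = \tfrac{p(n-1)}{n-p},
   \qquad\text{and then}\quad \gamma\,\tfrac{n}{n-1} = \tfrac{np}{n-p} = p^{*} \\[4pt]
\tfrac{n-1}{n} - \tfrac{p-1}{p} = \tfrac{n-p}{np} = \tfrac{1}{p^{*}}
  &\implies \|u\|_{L^{p^{*}}} \le \tfrac{p(n-1)}{n-p}\,\|\nabla u\|_{L^{p}} .
\end{aligned}
```

The constant $p (n - 1) / (n - p)$ obtained this way is not sharp; it blows up as $p \to n$ , consistent with the failure of the $L^{\infty}$ embedding at the borderline.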

Compactness: Rellich–Kondrachov

Theorem: Rellich–Kondrachov

Let $Ω \subset R^{n}$ be bounded with Lipschitz boundary, $1 \leq p < n$ . Then the embedding

W^{1, p} (Ω) ↪↪ L^{q} (Ω) \text{ is compact for every } 1 \leq q < p^{*} .

At the critical exponent $q = p^{*}$ the embedding is continuous but not compact. For $p \geq n$ the embedding into $L^{q}$ is compact for every $q < \infty$ ; for $p > n$ it is also compact into $C^{0, α} (\overline{Ω})$ for every $α < 1 - n / p$ .

Consequences for Variational Methods

Compactness is the workhorse for:

Existence of minimizers (extract weakly convergent subsequence → strongly convergent in $L^{q}$ for sub-critical $q$ ).
Discrete spectrum of self-adjoint elliptic operators (Laplacian on bounded domain with Dirichlet BC → compact resolvent → countable eigenvalues $λ_{k} \to \infty$ ).
Concentration-compactness (Lions) handles loss at critical exponents.

[!note] Loss of compactness at the critical exponent Failure of $H^{1} ↪↪ L^{2^{*}}$ (with $2^{*} = 2 n / (n - 2)$ ) is the source of the classical critical-exponent phenomena: Yamabe problem, prescribed scalar curvature, Brezis–Nirenberg. Bubbles $u_{ε} (x) = ε^{- (n - 2) /2} U (x / ε)$ converge weakly to $0$ as $ε \to 0$ but not strongly in $L^{2^{*}}$ .
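The scaling behind the bubbles is a one-line change of variables $y = x / ε$ . The computation below is a sketch on all of $R^{n}$ , $n \geq 3$ , assuming $U$ is the Aubin–Talenti profile for $p = 2$ and, in the last line, that $U \in L^{q}$ .

```latex
\begin{aligned}
u_\varepsilon(x) &= \varepsilon^{-\frac{n-2}{2}}\,U(x/\varepsilon),
\qquad
\nabla u_\varepsilon(x) = \varepsilon^{-\frac{n}{2}}\,(\nabla U)(x/\varepsilon), \\[2pt]
\int_{\mathbb{R}^n} |\nabla u_\varepsilon|^{2}\,dx
  &= \varepsilon^{-n}\,\varepsilon^{n}\int_{\mathbb{R}^n} |\nabla U|^{2}\,dy
   = \int_{\mathbb{R}^n} |\nabla U|^{2}\,dy, \\[2pt]
\int_{\mathbb{R}^n} |u_\varepsilon|^{q}\,dx
  &= \varepsilon^{\,n - q\frac{n-2}{2}} \int_{\mathbb{R}^n} |U|^{q}\,dy,
\qquad
n - q\,\tfrac{n-2}{2}
\begin{cases}
= 0, & q = 2^{*} = \tfrac{2n}{n-2},\\[2pt]
> 0, & q < 2^{*}.
\end{cases}
\end{aligned}
```

So the gradient norm and the $L^{2^{*}}$ norm are invariant under the rescaling, while every subcritical $L^{q}$ norm tends to $0$ as $ε \to 0$ : the mass concentrates at a point without ever converging in $L^{2^{*}}$ .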

Poincaré-Type Inequalities

These convert control of $\nabla u$ into control of $u$ — essential for coercivity of the Dirichlet form.

Poincaré Inequality (Zero-Trace Version)

For $u \in W_{0}^{1, p} (Ω)$ , $Ω$ bounded, $1 \leq p < \infty$ :

∥ u ∥_{L^{p} (Ω)} \leq C (Ω, p) ∥\nabla u ∥_{L^{p} (Ω)} .

Best constant for $p = 2$ is $1/ \sqrt{λ_{1}}$ , where $λ_{1}$ is the first Dirichlet eigenvalue of $- Δ$ on $Ω$ (equivalently, $λ_{1} ∥ u ∥_{L^{2}}^{2} \leq ∥\nabla u ∥_{L^{2}}^{2}$ ).

Consequence: on $W_{0}^{1, p}$ , the seminorm $∥\nabla u ∥_{L^{p}}$ is an equivalent norm.
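On $Ω = (0, 1)$ with $p = 2$ the first Dirichlet eigenvalue is $λ_{1} = π^{2}$ , so the optimal constant in $∥ u ∥_{L^{2}} \leq C ∥ u^{'} ∥_{L^{2}}$ is $1/ π$ . The sketch below recovers this from a standard second-order finite-difference Laplacian and checks the inequality on a random grid function with zero boundary values. The discretization and variable names are illustrative assumptions.

```python
import numpy as np

N = 200                                  # number of interior grid points
h = 1.0 / (N + 1)

# Finite-difference Dirichlet Laplacian: (A u)_i = (2 u_i - u_{i-1} - u_{i+1}) / h^2
A = (2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)) / h**2

lam1 = np.linalg.eigvalsh(A)[0]          # smallest eigenvalue, approximates pi^2
C = 1.0 / np.sqrt(lam1)                  # discrete Poincare constant, approximates 1/pi

rng = np.random.default_rng(0)
u = rng.standard_normal(N)               # arbitrary grid function, zero at both ends

norm_u = np.sqrt(h * (u @ u))            # discrete L2 norm of u
norm_du = np.sqrt(h * (u @ A @ u))       # discrete L2 norm of u' (summation by parts)

print(f"lambda_1 = {lam1:.6f}  (pi^2 = {np.pi**2:.6f})")
print(f"||u|| = {norm_u:.4f} <= C ||u'|| = {C * norm_du:.4f}")
```

The inequality holds for every grid function because $u^{T} A u \geq λ_{1} u^{T} u$ is the Rayleigh-quotient characterization of the smallest eigenvalue.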

Poincaré–Wirtinger (Zero-Mean Version)

For $u \in W^{1, p} (Ω)$ , $Ω$ bounded connected Lipschitz:

\Big∥ u - \frac{1}{∣Ω∣} \int_{Ω} u \, d x \Big∥_{L^{p} (Ω)} \leq C (Ω, p) ∥\nabla u ∥_{L^{p} (Ω)} .

Best $p = 2$ constant: $1/ \sqrt{μ_{1}}$ , where $μ_{1}$ is the first nonzero eigenvalue of the Neumann Laplacian (the eigenvalue $0$ has the constants as eigenfunctions).

[!tip] When you need a Poincaré Any time you want coercivity for $\int ∣\nabla u ∣^{p}$ as a norm. If you have Dirichlet BC, use the zero-trace version. If you're working on $W^{1, p} / R$ (e.g., Neumann problems), use Poincaré–Wirtinger.

Fractional Sobolev Spaces $W^{s, p}$

Two main definitions; for $p = 2$ they coincide and equal the Bessel potential space.

Slobodeckij Seminorm (Gagliardo)

For $s \in (0, 1)$ , $1 \leq p < \infty$ :

[u]_{W^{s, p} (Ω)}^{p} := \int_{Ω} \int_{Ω} \frac{∣ u ( x ) - u ( y ) ∣ ^{p}}{∣ x - y ∣ ^{n + s p}} d x d y,

∥ u ∥_{W^{s, p}}^{p} := ∥ u ∥_{L^{p}}^{p} + [u]_{W^{s, p}}^{p} .

For non-integer $s > 1$ , write $s = k + σ$ with $k \in N$ , $σ \in (0, 1)$ , and require $u \in W^{k, p}$ with $D^{α} u \in W^{σ, p}$ for all $∣ α ∣ = k$ .
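A crude way to get a feel for the Gagliardo seminorm is to discretize the double integral directly. The sketch below does this in one dimension with a midpoint rule that simply drops the diagonal cells $x = y$ ; the function name and the quadrature are assumptions, and the rule is only first-order accurate. For $u (x) = x$ on $(0, 1)$ with $s = 1/2$ , $p = 2$ the integrand is identically $1$ , so the exact seminorm is $1$ .

```python
import numpy as np

def gagliardo_seminorm_1d(u, a, b, s, p, N=400):
    """Midpoint approximation of [u]_{W^{s,p}(a,b)} in one dimension (n = 1)."""
    h = (b - a) / N
    x = a + (np.arange(N) + 0.5) * h
    ux = u(x)
    num = np.abs(ux[:, None] - ux[None, :]) ** p     # |u(x) - u(y)|^p
    dist = np.abs(x[:, None] - x[None, :])           # |x - y|
    np.fill_diagonal(dist, np.inf)                   # drop the singular diagonal cells
    integrand = num / dist ** (1 + s * p)            # exponent n + s p with n = 1
    return (h * h * integrand.sum()) ** (1.0 / p)

val = gagliardo_seminorm_1d(lambda x: x, 0.0, 1.0, s=0.5, p=2)
print(f"[u]_(W^(1/2,2)) is approximately {val:.5f}")
```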

Bessel Potential Spaces $H^{s, p}$

Via Fourier transform on $R^{n}$ :

H^{s, p} (R^{n}) := \{u \in \mathcal{S}^{'} (R^{n}) : (1 - Δ)^{s /2} u \in L^{p} (R^{n})\},

i.e. $\widehat{(1 - Δ)^{s /2} u} (ξ) = (1 + ∣ ξ ∣^{2})^{s /2} \hat{u} (ξ)$ .
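For $p = 2$ the multiplier definition turns into a weighted sum of squared Fourier coefficients, which is easy to compute. The sketch below works on the periodic interval $[0, 2 π)$ rather than on $R^{n}$ (an assumption made so that the FFT applies) and uses the hypothetical helper `hs_norm_periodic`. For $u = \sin x$ one gets $∥ u ∥_{H^{s}}^{2} = π \cdot 2^{s}$ , which reduces to $\int \sin^{2} = π$ at $s = 0$ and to $\int (\sin^{2} + \cos^{2}) = 2 π$ at $s = 1$ .

```python
import numpy as np

def hs_norm_periodic(samples, s):
    """H^s norm on the circle [0, 2*pi) from equispaced samples, via the FFT."""
    N = len(samples)
    coeffs = np.fft.fft(samples) / N             # Fourier coefficients c_k
    k = np.fft.fftfreq(N, d=1.0 / N)             # integer frequencies
    weights = (1.0 + k**2) ** s                  # Bessel multiplier (1 + |k|^2)^s
    return np.sqrt(2.0 * np.pi * np.sum(weights * np.abs(coeffs) ** 2))

N = 64
x = 2.0 * np.pi * np.arange(N) / N
u = np.sin(x)

for s in (0.0, 0.5, 1.0):
    print(f"s = {s}: squared norm = {hs_norm_periodic(u, s)**2:.6f}")
```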

Property	$W^{s, p}$ (Slobodeckij)	$H^{s, p}$ (Bessel)
Defined via	Difference quotients	Fourier multipliers
Equals $W^{s, p}$ for $p = 2$	✓	✓
Equals $W^{s, p}$ for $p \neq 2$	✓ (by definition)	only if $s \in N$
Interpolation behavior	Real interp. (Besov-like)	Complex interp.

In fact, $W^{s, p}$ (Slobodeckij) coincides with the Besov space $B_{p, p}^{s}$ , not with $H^{s, p} = F_{p, 2}^{s}$ , which is a Triebel–Lizorkin space. The mismatch for $p \neq 2$ , $s \notin N$ is fundamental.

Trace Spaces Revisited

The trace operator $T : W^{k, p} (Ω) \to W^{k - 1/ p, p} (\partial Ω)$ ( $1 < p < \infty$ ) uses precisely the Slobodeckij version because $\partial Ω$ is a manifold without group structure, where Fourier methods are awkward. The image is fractional for every such $p$ ; for $p = 2$ it is $H^{k - 1/2} (\partial Ω)$ , where the Slobodeckij and Bessel definitions coincide.

Dual Spaces and Negative-Order Sobolev Spaces

$H^{- 1} (Ω)$ — Dual of $H_{0}^{1}$

H^{- 1} (Ω) := (H_{0}^{1} (Ω))^{*} .

Characterization: $f \in H^{- 1}$ iff $f = f_{0} + \sum_{i = 1}^{n} \partial_{i} f_{i}$ for some $f_{0}, \dots, f_{n} \in L^{2} (Ω)$ (in distributional sense). Norm: infimum of $(\sum ∥ f_{i} ∥_{L^{2}}^{2})^{1/2}$ over such decompositions.

This is the natural target space for the Laplacian:

- Δ : H_{0}^{1} (Ω) \to H^{- 1} (Ω) \text{ is an isomorphism for bounded } Ω .

(Lax–Milgram applied to the bilinear form $a (u, v) = \int_{Ω} \nabla u \cdot \nabla v \, d x$ , which is coercive on $H_{0}^{1} (Ω)$ by the Poincaré inequality.)
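A one-dimensional discrete analogue makes the isomorphism tangible: solving $- u^{''} = f$ on $(0, 1)$ with zero boundary values. The sketch below uses second-order finite differences (a choice made here for brevity; a finite element discretization would follow the weak form more literally) with the manufactured solution $u = \sin (π x)$ , $f = π^{2} \sin (π x)$ .

```python
import numpy as np

N = 199                                   # interior grid points
h = 1.0 / (N + 1)
x = h * np.arange(1, N + 1)

# Stiffness-like matrix of the form a(u, v) = integral of u' v' (finite differences)
A = (2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)) / h**2

f = np.pi**2 * np.sin(np.pi * x)          # right-hand side
u = np.linalg.solve(A, f)                 # discrete solution of -u'' = f, u(0) = u(1) = 0

err = np.max(np.abs(u - np.sin(np.pi * x)))
energy = h * (u @ A @ u)                  # a(u, u), discrete Dirichlet form
work = h * (f @ u)                        # <f, u>, the pairing tested against v = u

print(f"max error = {err:.2e}")
print(f"a(u,u) = {energy:.6f}, <f,u> = {work:.6f}")
```

The equality $a (u, u) = ⟨ f, u ⟩$ is the weak formulation tested against $v = u$ ; together with coercivity it yields the stability bound $∥ u ∥_{H_{0}^{1}} \leq C ∥ f ∥_{H^{- 1}}$ .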

Negative Fractional Spaces

For $s > 0$ , $W^{- s, p} (Ω) := (W_{0}^{s, p^{'}} (Ω))^{*}$ where $p^{'} = p / (p - 1)$ . These appear when applying differential operators of order $> k$ to $W^{k, p}$ functions, the result lying in negative-order spaces.

Bounded Variation $B V$ and the $W^{1, 1}$ Subtleties

$B V (Ω)$

B V (Ω) := \{u \in L^{1} (Ω) : D u \in M (Ω; R^{n})\}

where $M$ is the space of finite Radon measures and $D u$ is the distributional gradient. $W^{1, 1} ⊊ B V$ : $B V$ allows the gradient to be a measure, e.g., $1_{E}$ for $E$ a set of finite perimeter has $D 1_{E} = ν_{E} \, H^{n - 1} \llcorner \partial^{*} E$ , where $\partial^{*} E$ is the reduced boundary and $ν_{E}$ is the measure-theoretic inner unit normal (with the outer normal the sign flips).

$B V$ replaces $W^{1, 1}$ in image processing (Rudin–Osher–Fatemi total variation denoising), free-boundary problems, and minimal surface theory. It is not reflexive; compactness comes instead from the weak- $*$ topology, in which bounded sets are sequentially compact (Banach–Alaoglu, since $B V$ is the dual of a separable space).
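The difference between a measure-valued gradient and an $L^{2}$ gradient shows up already for a step function on $(0, 1)$ . In the sketch below (grid and function name are illustrative assumptions) the discrete total variation stays equal to the jump height under refinement, while the discrete Dirichlet energy grows like $1/ h$ , reflecting that $1_{(1/2, 1)}$ lies in $B V$ but not in $H^{1}$ .

```python
import numpy as np

def discrete_seminorms(N):
    """Total variation and squared H^1 seminorm of a unit step on a grid with N cells."""
    h = 1.0 / N
    x = (np.arange(N) + 0.5) * h                # cell midpoints; N even keeps 0.5 off the grid
    u = (x > 0.5).astype(float)                 # indicator of (1/2, 1)
    jumps = np.diff(u)
    tv = np.sum(np.abs(jumps))                  # approximates the total mass of Du
    h1_sq = np.sum(jumps**2) / h                # approximates the integral of |u'|^2
    return tv, h1_sq

for N in (10, 100, 1000):
    tv, h1_sq = discrete_seminorms(N)
    print(f"N = {N:5d}: TV = {tv:.1f}, squared H1 seminorm = {h1_sq:.1f}")
```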

Connections, Comparisons, Applications

Sobolev vs. Hölder vs. Lipschitz Spaces

Space	Norm strength	When useful
$W^{k, p}$	$L^{p}$ of $k$ derivatives	Variational PDE, weak solutions
$C^{k, α}$	Hölder modulus	Regularity theory, Schauder estimates
$W^{k, \infty} = C^{k - 1, 1}$	Bounded $k$ th derivative	Lipschitz / semiconcave regularity
$B V$	Measure-valued gradient	Free discontinuities, $L^{1}$ -coercivity
$H^{s, p}$	Bessel/Fourier	Pseudodifferential, harmonic analysis

Why $p = 2$ Is Privileged

Only $p$ giving Hilbert structure → Riesz representation, Lax–Milgram, spectral theory.
Fourier transform is unitary on $L^{2}$ → exact characterization of $H^{s}$ via $\int (1 + ∣ ξ ∣^{2})^{s} ∣ \hat{u} (ξ) ∣^{2} \, d ξ$ .
Coincidence of all definitions: $W^{s, 2} = H^{s, 2} = B_{2, 2}^{s} = F_{2, 2}^{s}$ .
Self-dual scale: $(H^{s})^{*} = H^{- s}$ .

Connection to Statistical Learning / NN Approximation

Barron spaces and RKHSs sit inside Sobolev spaces — used to bound NN approximation rates.
Curse of dimensionality manifests through Sobolev embeddings: approximating $W^{k, p}$ functions in $L^{\infty}$ requires $k > n / p$ , giving the classical $N^{- k / n}$ rate in the number $N$ of parameters.
Neural ODEs / flow-based models implicitly minimize Sobolev-type norms on the velocity field (e.g., FFJORD).
For Stein discrepancies / score matching, Sobolev regularity of the data density controls convergence.

Connection to Optimal Transport & Functional Inequalities

The log-Sobolev inequality $Ent_{μ} (f^{2}) \leq 2 c \int ∣\nabla f ∣^{2} d μ$ controls hypercontractivity and exponential ergodicity of diffusions.
Bakry–Émery $Γ_{2}$ -calculus uses Sobolev-type spaces on metric measure spaces.
Sobolev-type inequalities ↔ isoperimetric inequalities (Federer–Fleming), via $W^{1, 1}$ and $B V$ .
The Otto–Villani theorem links log-Sobolev → Talagrand $T_{2}$ → Wasserstein concentration.

[!note] Tangent to Angelo's work Sobolev structure underlies most rigorous analysis of continuous-time/space multi-agent systems and continuous strategy spaces. Mechanism-design impossibility results (Myerson–Satterthwaite, Gibbard–Satterthwaite) live in discrete/measurable settings — but their continuous analogues (e.g. matching with continuous types, Hotelling-style games) often require Sobolev-regular type distributions, and welfare functionals' minimizers naturally live in $H^{1}$ -type spaces over the strategy simplex. The Cooperation Gap framework's "approximate mechanism" notion could be re-cast as a closeness condition in a Sobolev-type seminorm over the policy space.

Canonical References

Evans, Partial Differential Equations, Ch. 5 — the standard graduate intro.
Brezis, Functional Analysis, Sobolev Spaces, and PDEs — clean, modern.
Adams & Fournier, Sobolev Spaces — encyclopedic.
Maz'ya, Sobolev Spaces with Applications to Elliptic PDE — deep and technical, capacity theory.
Di Nezza–Palatucci–Valdinoci, Hitchhiker's guide to the fractional Sobolev spaces (2012) — best intro to $W^{s, p}$ .
Triebel, Theory of Function Spaces I–III — for Besov / Triebel–Lizorkin scales.
