Graduate Mathematics — Advanced

Advanced Measure Theory

Q: How is Lebesgue measure constructed rigorously on the real line?

The construction of Lebesgue measure proceeds in three stages. First, the outer measure lambda-star of a set E is defined as the infimum, over all countable covers of E by open intervals, of the total length of the cover. This outer measure is defined on every subset of R but is not countably additive on all of them. Second, a set E is called Lebesgue measurable (in the sense of Caratheodory) if for every set A, the outer measure of A equals the outer measure of the intersection of A with E plus the outer measure of the intersection of A with the complement of E. Third, the restriction of outer measure to the collection of Lebesgue measurable sets gives a complete, sigma-finite, translation-invariant measure. Every Borel set is Lebesgue measurable, and the resulting measure assigns to each interval its length.

Q: What is a signed measure and how does the Hahn decomposition theorem work?

A signed measure on a measurable space (X, M) is a countably additive function nu from M to the extended reals that takes at most one of the values plus infinity and minus infinity and satisfies nu of the empty set equals zero. The Hahn decomposition theorem states that every signed measure nu admits a partition of X into a positive set P and a negative set N, where P and N are disjoint, their union is X, every measurable subset of P has non-negative measure, and every measurable subset of N has non-positive measure. The Jordan decomposition then writes nu as the difference of two non-negative measures, nu-plus and nu-minus, where nu-plus of E is nu of the intersection of E with P and nu-minus of E is the negative of nu of the intersection of E with N. The total variation measure is nu-plus plus nu-minus.

Q: What exactly does the Radon-Nikodym theorem assert and how is it proved?

The Radon-Nikodym theorem states: let mu and nu be sigma-finite measures on (X, M) with nu absolutely continuous with respect to mu, meaning mu(E) equal to zero implies nu(E) equal to zero. Then there exists a non-negative measurable function f such that for every measurable set E, nu(E) equals the integral over E of f with respect to mu. The function f is called the Radon-Nikodym derivative of nu with respect to mu, written d(nu) divided by d(mu), and it is unique up to sets of mu-measure zero. The standard proof uses the Hilbert space structure of L^2: assuming first that mu and nu are finite, consider the functional that sends h to the integral of h with respect to nu, which is bounded on L^2(mu plus nu) by the Cauchy-Schwarz inequality. By the Riesz representation theorem for Hilbert spaces, there is a unique g in L^2(mu plus nu) representing it, and from this one extracts the Radon-Nikodym derivative. The sigma-finite case follows by splitting X into countably many pieces on which both measures are finite.

Q: What is the Riesz-Fischer theorem and why does it prove L^p is complete?

The Riesz-Fischer theorem states that the space L^p(mu) is complete for every exponent p from 1 to infinity (inclusive) and every measure mu. The proof for finite p goes as follows. Take a Cauchy sequence (f sub n) in L^p. Extract a subsequence (f sub n sub k) such that the L^p norm of f sub n sub (k plus 1) minus f sub n sub k is less than 1 divided by 2 to the k. Define g as the pointwise sum over k of the absolute values of these differences. By Minkowski and monotone convergence, g is in L^p, so g is finite almost everywhere. Therefore the telescoping series defining the pointwise limit f converges absolutely almost everywhere. One shows f is in L^p and that f sub n converges to f in the L^p norm using dominated convergence. For L^infinity the argument uses the fact that a countable union of null sets is null: off a single null set the sequence is uniformly Cauchy, so it converges uniformly there to an essentially bounded limit.

Q: What does the Lebesgue differentiation theorem say and what is its significance?

The Lebesgue differentiation theorem states: if f is a locally integrable function on R to the n, then for almost every point x, the limit as the radius r goes to zero of the average of f over the ball of radius r centered at x equals f(x). More precisely, the limit of (1 divided by the Lebesgue measure of B(x,r)) times the integral over B(x,r) of f(y) with respect to y equals f(x) for almost every x. Points where the stronger statement holds, that the average of the absolute value of f(y) minus f(x) over B(x,r) tends to zero, are called Lebesgue points of f, and almost every point is a Lebesgue point. Two major results are tied to the theorem: first, Lebesgue's theorem that every monotone function on R is differentiable almost everywhere, which is proved with the same covering-lemma techniques; second, if F(x) is the integral of f from a to x, then the derivative of F equals f almost everywhere, which is the precise version of the fundamental theorem of calculus for Lebesgue integration.

Q: What is the Riesz representation theorem for measures and how does it connect functionals to measures?

The Riesz representation theorem (in its measure-theoretic form) states: if X is a locally compact Hausdorff space and I is a positive linear functional on the space of continuous compactly supported functions from X to R, then there exists a unique Radon measure mu on X (a Borel measure that is finite on compact sets and regular in the sense below) such that I(f) equals the integral of f with respect to mu for every such f. Regularity means: outer regular on every Borel set E (mu(E) equals the infimum of mu(U) over open sets U containing E) and inner regular on every open set U (mu(U) equals the supremum of mu(K) over compact subsets K of U). When X is sigma-compact, inner regularity extends to every Borel set. This theorem is the bridge between functional analysis and measure theory: it identifies positive linear functionals on function spaces with measures, and is the foundation for the rigorous development of distributions, spectral theory, and harmonic analysis.

Q: How does absolute continuity of a function relate to absolute continuity of its associated measure?

A function F on the interval [a,b] is absolutely continuous in the classical sense if for every epsilon greater than zero there exists delta greater than zero such that for any finite collection of disjoint subintervals whose total length is less than delta, the sum of the absolute values of the increments of F over those subintervals is less than epsilon. A measure nu on the Borel sets of [a,b] is absolutely continuous with respect to Lebesgue measure lambda if lambda(E) equal to zero implies nu(E) equal to zero. The two notions are equivalent in the following precise sense: a function F is absolutely continuous on [a,b] if and only if F(x) can be written as F(a) plus the integral from a to x of some Lebesgue integrable function f, in which case the (signed) measure nu defined by nu([a,x]) equal to F(x) minus F(a) is absolutely continuous with respect to Lebesgue measure and its Radon-Nikodym derivative is f. Functions of bounded variation that are not absolutely continuous correspond to measures with a singular part.

Q: What is Fatou's lemma and why is it useful even when the limit function is not integrable?

Fatou's lemma states: if (f sub n) is a sequence of non-negative measurable functions, then the integral of the liminf of f sub n is at most the liminf of the integrals of f sub n. The inequality can be strict: if f sub n is the indicator function of the interval [n, n plus 1], then each integral is 1 but the liminf of f sub n is 0 everywhere, giving integral of liminf equal to 0 while liminf of integrals equals 1. Fatou's lemma requires no integrability hypothesis on the limit because it works only with non-negative functions, whose integrals are always defined (possibly infinite). It is the key lemma in the standard proof of the dominated convergence theorem, and it is the tool of choice when you suspect a limit function might not be integrable: you can still bound the integral of the limit by the liminf of the sequence of integrals. The reverse Fatou lemma, for sequences bounded above by an integrable function (in particular for non-positive functions), gives the corresponding limsup inequality.

A rigorous treatment of the deeper results in measure theory: generated sigma-algebras, Lebesgue measure via Caratheodory extension, signed measures and Hahn decomposition, the Radon-Nikodym theorem, completeness of L^p spaces via Riesz-Fischer, Fubini-Tonelli for product measures, regularity of Borel measures, the Riesz representation theorem, and the Lebesgue differentiation theorem.

1. Generated Sigma-Algebras and the Borel Hierarchy
2. Construction of Lebesgue Measure via Caratheodory Extension
3. Measurable Functions: Composition, Limits, and Approximation
4. The Lebesgue Integral: Simple Functions to Integrable Functions
5. The Three Convergence Theorems
6. L^p Spaces: Holder, Minkowski, and Riesz-Fischer
7. Signed Measures and Hahn-Jordan Decomposition
8. Absolute Continuity and the Radon-Nikodym Theorem
9. Product Measures and Fubini-Tonelli
10. Regularity of Borel Measures and the Riesz Representation Theorem
11. Differentiation of Measures and the Lebesgue Differentiation Theorem
12. Frequently Asked Questions

1. Generated Sigma-Algebras and the Borel Hierarchy

A sigma-algebra on a set X is a collection of subsets closed under complementation and countable unions, and containing the empty set. The power set of X is always a sigma-algebra. The trivial sigma-algebra consisting only of the empty set and X is the smallest possible. The interesting ones lie in between.

Generated Sigma-Algebras

Given any collection C of subsets of X, the sigma-algebra generated by C is the intersection of all sigma-algebras on X that contain every member of C. This intersection is itself a sigma-algebra — the smallest one containing C. We write it sigma(C) or M(C).

Although the definition is abstract (it does not construct the generated sigma-algebra explicitly), it is nonetheless well-defined because the power set is a sigma-algebra containing C, so the intersection is over a nonempty family. One can also build sigma(C) from the inside by transfinite induction over the countable ordinals, alternately adding complements and countable unions, but in general there is no simpler explicit description of its members.
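
On a finite set the closure process terminates after finitely many steps, so the generated sigma-algebra can be computed directly. The following is a minimal sketch under that assumption; the function name generate_sigma_algebra and the example sets are illustrative only and do not come from the text above.

```python
from itertools import combinations

def generate_sigma_algebra(universe, collection):
    """Smallest family containing `collection` that is closed under
    complement and union. On a finite universe, finite unions suffice."""
    universe = frozenset(universe)
    sets = {frozenset(), universe} | {frozenset(s) for s in collection}
    changed = True
    while changed:
        changed = False
        current = list(sets)
        # Close under complementation.
        for a in current:
            complement = universe - a
            if complement not in sets:
                sets.add(complement)
                changed = True
        # Close under pairwise unions (repeated passes give all finite unions).
        for a, b in combinations(current, 2):
            union = a | b
            if union not in sets:
                sets.add(union)
                changed = True
    return sets

X = {1, 2, 3, 4}
sigma = generate_sigma_algebra(X, [{1}, {1, 2}])
print(len(sigma))
```

In this example the atoms are {1}, {2}, and {3, 4}, so the points 3 and 4 cannot be separated by any set in the generated sigma-algebra.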

The Borel Sigma-Algebra

The Borel sigma-algebra on R, denoted B(R), is sigma(open sets). Equivalently it is generated by any of the following families: all open intervals, all closed intervals, all half-open intervals of the form (a, b], all open rays of the form (negative infinity, a), all closed rays of the form (negative infinity, a], or all closed sets. A set in B(R) is called a Borel set.

The Borel hierarchy classifies Borel sets by complexity. The open sets are called G (from the German Gebiet, meaning region). The closed sets are called F (from the French fermé, meaning closed). Countable intersections of open sets are called G-delta sets. Countable unions of closed sets are called F-sigma sets. The next level gives G-delta-sigma (countable unions of G-delta sets) and F-sigma-delta sets (countable intersections of F-sigma sets), and so on through all countable ordinals. The full Borel sigma-algebra is the union of this hierarchy.

The Lebesgue Sigma-Algebra and Completion

The Borel sigma-algebra is not complete: there exist null sets (sets of Lebesgue outer measure zero) that are not Borel sets. The Lebesgue sigma-algebra L is the completion of B(R) with respect to Lebesgue measure: L consists of all sets of the form B union N where B is a Borel set and N is a subset of a Borel null set. Lebesgue measure is then extended to L by setting the measure of B union N equal to the Lebesgue measure of B. This completion process ensures that every subset of a null set is measurable — a convenient and natural condition.

The Vitali set is the classic example of a non-measurable set. It is constructed using the Axiom of Choice by choosing one representative from each equivalence class of R modulo Q (the rationals). The resulting set cannot be assigned a Lebesgue measure consistent with translation invariance and countable additivity. This shows that B(R) and L are strict subsets of the power set of R, and that the Axiom of Choice is in some tension with measure theory.

Key Fact: Separability of B(R)

The Borel sigma-algebra B(R) is generated by a countable collection — for instance, by all open intervals with rational endpoints. This countability at the generating level is the reason many constructions in real analysis and probability theory work without set-theoretic complications.

2. Construction of Lebesgue Measure via Caratheodory Extension

The Caratheodory extension theorem is the standard machinery for constructing measures. The idea is to begin with a pre-measure defined on a simple collection of sets (such as intervals), extend it to an outer measure on all subsets, and then identify the measurable sets as those that interact cleanly with the outer measure.

Outer Measure

An outer measure on X is a function from the power set of X to the non-negative extended reals satisfying: the outer measure of the empty set is zero; monotonicity (if A is contained in B then the outer measure of A is at most the outer measure of B); and countable subadditivity (the outer measure of a countable union is at most the sum of the outer measures).

For Lebesgue outer measure on R, define the outer measure of E as the infimum of the sum of lengths of a countable open interval cover of E. More precisely, the outer measure of E equals the infimum over all sequences of open intervals (a sub k, b sub k) covering E of the sum of (b sub k minus a sub k) over all k. This outer measure assigns to each interval exactly its length, and is defined on every subset of R.
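
As an illustration of how the infimum over covers behaves, the standard argument that a countable set has outer measure zero covers the k-th point by an interval of length epsilon divided by 2 to the (k plus 1). The sketch below carries this out for finitely many rationals using exact arithmetic; the helper names and the choice of denominators up to 6 are assumptions made for the example.

```python
from fractions import Fraction

def rationals_in_unit_interval(max_denominator):
    """List the distinct rationals p/q in [0, 1] with q <= max_denominator."""
    found = []
    for q in range(1, max_denominator + 1):
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in found:
                found.append(r)
    return found

def shrinking_cover(points, eps):
    """Cover the k-th point (k = 0, 1, ...) by an open interval
    of length eps / 2**(k + 1) centered at that point."""
    cover = []
    for k, x in enumerate(points):
        half_width = eps / 2 ** (k + 2)
        cover.append((x - half_width, x + half_width))
    return cover

points = rationals_in_unit_interval(6)
eps = Fraction(1, 100)
cover = shrinking_cover(points, eps)
total_length = sum(b - a for a, b in cover)
print(len(points), total_length < eps)
```

However many points are listed, the total length stays below eps because the lengths form a geometric series. Since eps is arbitrary, the same construction applied to an enumeration of all rationals shows their outer measure is zero.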

Caratheodory Measurability

A subset E of X is called Caratheodory measurable (with respect to an outer measure lambda-star) if for every set A, the outer measure of A equals the outer measure of the intersection of A with E plus the outer measure of the intersection of A with the complement of E. This condition says E splits every set A optimally — there is no waste in the outer measure.

The Caratheodory theorem states: the collection M of all Caratheodory measurable sets is a sigma-algebra, and the restriction of lambda-star to M is a complete measure. For Lebesgue outer measure, M is exactly the Lebesgue sigma-algebra L, and the restriction is Lebesgue measure lambda.

Properties of Lebesgue Measure

Lebesgue measure lambda on R (or R to the n) satisfies:

Translation invariance: the measure of E plus t equals the measure of E for every t.
Scaling: the measure of t times E equals the absolute value of t to the n times the measure of E in R to the n.
Sigma-finiteness: R is a countable union of bounded intervals, each of finite measure, so Lebesgue measure is sigma-finite.
Completeness: every subset of a null set is measurable, with measure zero.
Uniqueness: Lebesgue measure is the unique complete, translation-invariant measure on the Lebesgue sigma-algebra that assigns length 1 to the unit interval.

Caratheodory Extension Theorem (General Form)

If mu-zero is a pre-measure on an algebra A of subsets of X (meaning mu-zero is countably additive on A for disjoint unions that remain in A), then mu-zero extends to a measure on sigma(A). The extension is unique if mu-zero is sigma-finite. This general form constructs not only Lebesgue measure but also the Lebesgue-Stieltjes measures, that is, the Borel measures on R that are finite on bounded intervals, from their values on half-open intervals.

3. Measurable Functions: Composition, Limits, and Approximation

A function f from a measurable space (X, M) to (Y, N) is measurable if the preimage of every set in N belongs to M. For real-valued functions, N is taken to be the Borel sigma-algebra on R, and f is measurable if and only if the set where f is greater than a is in M for every real a. Equivalently one can use the sets where f is at least a, or f is less than a, or f is at most a.

Closure Properties

Measurable functions are closed under the following operations:

Arithmetic: sums, differences, products, and (where defined) quotients of measurable functions are measurable.
Composition: if f is measurable from (X, M) to (Y, N) and g is measurable from (Y, N) to (Z, P), then g composed with f is measurable from (X, M) to (Z, P). In particular, if f is measurable and g is Borel measurable, then g composed with f is measurable.
Pointwise limits: the pointwise supremum, infimum, limsup, liminf, and limit (where it exists) of a sequence of measurable functions are measurable. This is a major advantage over Riemann-integrable functions, which are not closed under pointwise limits.
Positive and negative parts: the positive part f-plus equal to the maximum of f and zero, and the negative part f-minus equal to the maximum of negative f and zero, are measurable whenever f is.

Simple Function Approximation

A simple function is a finite linear combination of indicator functions of measurable sets. The indicator function of a set A is the function that equals one on A and zero elsewhere. Simple functions are the building blocks of the Lebesgue integral.

Every non-negative measurable function f is the pointwise increasing limit of a sequence of non-negative simple functions. Explicitly, for k ranging from 1 to n times 2 to the n, let f sub n equal (k minus 1) divided by 2 to the n on the set where f is at least (k minus 1) divided by 2 to the n and strictly less than k divided by 2 to the n, and let f sub n equal n on the set where f is at least n. Then f sub n increases to f pointwise everywhere.

For a general measurable function, write f as f-plus minus f-minus and approximate each part separately. The sequence f sub n converges uniformly to f on the set where f is bounded.
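
The dyadic construction above can be evaluated pointwise: the value of f sub n at a point depends only on the value of f there. The following sketch, with the hypothetical helper name simple_approx, computes that value and lets one check monotonicity in n and the error bound of 1 divided by 2 to the n once n exceeds the value of f.

```python
import math

def simple_approx(f_value, n):
    """Value of the n-th dyadic simple approximation at a point
    where the non-negative function f takes the value f_value."""
    if f_value >= n:
        return float(n)              # truncation level: f_n = n where f >= n
    # Otherwise f_value lies in [(k-1)/2^n, k/2^n) with k - 1 = floor(f_value * 2^n).
    return math.floor(f_value * 2 ** n) / 2 ** n

for n in range(1, 8):
    print(n, simple_approx(math.pi, n))
```

The bound on the error is what gives uniform convergence on any set where f is bounded: once n exceeds the bound, the error is at most 1 divided by 2 to the n at every point of that set.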

Almost Everywhere

A property holds almost everywhere (a.e.) with respect to a measure mu if it holds on a set whose complement has measure zero. Most measure-theoretic results hold almost everywhere rather than everywhere. Two functions that agree almost everywhere are identified in L^p spaces.

4. The Lebesgue Integral: Simple Functions to Integrable Functions

The Lebesgue integral is built in three stages: first for simple functions, then for non-negative measurable functions, then for general integrable functions.

Stage One: Simple Functions

Let (X, M, mu) be a measure space. If phi is a non-negative simple function with standard representation — a finite sum of c sub k times the indicator function of A sub k, where the A sub k are disjoint measurable sets and the c sub k are non-negative real numbers — then the Lebesgue integral of phi is defined as the finite sum of c sub k times mu(A sub k), with the convention that zero times infinity equals zero. This is well-defined regardless of the representation chosen.
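
A small sketch of this definition, assuming the simple function is given as a list of pairs (c sub k, mu(A sub k)) for disjoint sets A sub k. The function name integrate_simple is hypothetical. The point of the example is the convention that zero times infinity equals zero, which floating-point arithmetic does not follow on its own.

```python
import math

def integrate_simple(terms):
    """Integral of a non-negative simple function given as a list of
    (coefficient, measure of the set) pairs for disjoint sets."""
    total = 0.0
    for coefficient, measure in terms:
        if coefficient == 0 or measure == 0:
            continue                 # convention: 0 * infinity = 0
        total += coefficient * measure
    return total

# phi = 0 on a set of infinite measure, 2 on a set of measure 1/2.
phi = [(0.0, math.inf), (2.0, 0.5)]
print(integrate_simple(phi))
print(math.isnan(0.0 * math.inf))    # naive multiplication gives nan
```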

Stage Two: Non-Negative Measurable Functions

If f is a non-negative measurable function, its integral with respect to mu is defined as the supremum of the integrals of all non-negative simple functions phi with phi pointwise at most f. This supremum may be infinite. The integral over a measurable set E is defined as the integral of f times the indicator function of E.

Key properties at this stage include linearity (the integral of af plus bg equals a times the integral of f plus b times the integral of g, for non-negative a, b), monotonicity (if f is at most g everywhere then the integral of f is at most the integral of g), and countable additivity (the integral over a countable disjoint union of sets equals the sum of the integrals over the individual sets).

Stage Three: Integrable Functions

A measurable function f (which may take both positive and negative values) is integrable if the integral of the absolute value of f is finite. In this case, write f as f-plus minus f-minus where both f-plus and f-minus are non-negative, and define the integral of f as the integral of f-plus minus the integral of f-minus. Both terms are finite by the integrability hypothesis, so the integral is a well-defined real number.

The collection of all integrable functions on (X, M, mu) is a vector space, and the integral is a linear functional on it. The triangle inequality for integrals states that the absolute value of the integral of f is at most the integral of the absolute value of f. This is the measure-theoretic analogue of the inequality for sums.

Comparison with the Riemann Integral

If f is Riemann integrable on [a, b], then f is Lebesgue integrable and the two integrals agree. The converse is false: the indicator function of the rationals in [0, 1] (the Dirichlet function) is not Riemann integrable but has Lebesgue integral zero, since the rationals form a null set. A bounded function on [a, b] is Riemann integrable if and only if it is continuous almost everywhere — a characterization due to Lebesgue.

5. The Three Convergence Theorems

The power of the Lebesgue integral lies in its superior convergence theorems, which allow interchange of limits and integrals under mild hypotheses. The three fundamental theorems are Fatou's lemma, the monotone convergence theorem, and the dominated convergence theorem.

Fatou's Lemma

Let (f sub n) be a sequence of non-negative measurable functions. Then the integral of the limit inferior of f sub n is at most the limit inferior of the integrals of f sub n. In symbols: the integral of liminf(f sub n) is at most liminf of the integral of f sub n.

The inequality can be strict. If f sub n equals the indicator function of [n, n plus 1] on the real line, then the liminf is identically zero (so the integral of the liminf is zero) but each integral of f sub n equals one (so the liminf of the integrals equals one). Fatou's lemma requires no integrability hypothesis on the individual f sub n or their limit, only non-negativity. This makes it a convenient tool for establishing that a limit function is integrable: if the integrals of the f sub n are bounded, then the liminf of f sub n has finite integral.

Monotone Convergence Theorem

Let (f sub n) be a sequence of non-negative measurable functions that increase pointwise (f sub n is at most f sub n plus one everywhere) to a limit f. Then the limit of the integrals of f sub n equals the integral of f. Equality holds, not just an inequality.

The proof follows almost immediately from Fatou's lemma: since f sub n is at most f for all n, the integrals are at most the integral of f, giving the limsup inequality. Fatou applied to the non-negative sequence gives the liminf inequality in the other direction. Together, the limit exists and equals the integral of f. This argument assumes Fatou's lemma has been established independently; many texts instead prove the monotone convergence theorem first, directly from the definition of the integral, and then deduce Fatou's lemma from it.

The monotone convergence theorem is the foundation of the entire Lebesgue integral: it is used to prove that simple function approximations converge to the correct value, to establish linearity of the integral for non-negative functions, and, in treatments that prove it first, to derive Fatou's lemma and hence the dominated convergence theorem.

Dominated Convergence Theorem

Let (f sub n) be a sequence of measurable functions converging pointwise almost everywhere to f. Suppose there exists an integrable function g (the dominating function) such that the absolute value of f sub n is at most g almost everywhere for all n. Then f is integrable and the limit of the integrals of f sub n equals the integral of f.

The proof applies Fatou's lemma to the non-negative sequences (g minus f sub n) and (g plus f sub n) separately. The conclusion follows by elementary algebra. The dominating function is essential: without it the theorem fails, as shown by f sub n equal to the indicator function of [0, n] divided by n, which converges pointwise to zero but has constant integral 1.
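
The elementary algebra in this proof can be written out as follows. This is a sketch in standard notation, assuming real-valued f sub n and using that f is integrable because its absolute value is at most g almost everywhere.

```latex
\begin{align*}
\int g \, d\mu + \int f \, d\mu
  &= \int \liminf_{n} (g + f_n) \, d\mu
   \le \liminf_{n} \int (g + f_n) \, d\mu
   = \int g \, d\mu + \liminf_{n} \int f_n \, d\mu, \\
\int g \, d\mu - \int f \, d\mu
  &= \int \liminf_{n} (g - f_n) \, d\mu
   \le \liminf_{n} \int (g - f_n) \, d\mu
   = \int g \, d\mu - \limsup_{n} \int f_n \, d\mu .
\end{align*}
% Since \int g \, d\mu is finite, it can be cancelled from both lines:
\[
\limsup_{n} \int f_n \, d\mu \;\le\; \int f \, d\mu \;\le\; \liminf_{n} \int f_n \, d\mu .
\]
```

The finiteness of the integral of g is exactly what allows the cancellation, which is where the domination hypothesis enters.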

The dominated convergence theorem justifies differentiating under the integral sign. If f(x, t) is differentiable in t and the partial derivative with respect to t is dominated by an integrable function of x (uniformly in t), then the derivative of the integral equals the integral of the derivative.

Vitali Convergence Theorem

For L^p spaces (p finite) over a finite measure space, there is a refined result called the Vitali convergence theorem: a sequence (f sub n) in L^p converges in L^p norm to f if and only if the sequence converges in measure to f and the sequence is uniformly integrable (in the L^p sense, meaning the p-th power is equiintegrable). On a general measure space an additional tightness condition is required. Uniform integrability replaces the role of the single dominating function in the dominated convergence theorem.

6. L^p Spaces: Holder, Minkowski, and Riesz-Fischer

For an exponent p from 1 to infinity (inclusive) and a measure space (X, M, mu), the L^p space consists of equivalence classes of measurable functions f for which the integral of the p-th power of the absolute value of f is finite (for finite p), or for which f is essentially bounded (for p equal to infinity). Two functions are equivalent if they agree almost everywhere.

The L^p Norm

The L^p norm of f, written as the norm of f sub p, is defined as the p-th root of the integral of the absolute value of f to the p, for finite p. The L^infinity norm is the essential supremum: the infimum of all M such that the absolute value of f is at most M almost everywhere. These norms make L^p a normed vector space.

Holder's Inequality

Let p and q be exponents from 1 to infinity (inclusive) with 1 divided by p plus 1 divided by q equal to 1 (p and q are conjugate exponents; for p equal to 1, q equals infinity and vice versa). If f is in L^p and g is in L^q, then fg is in L^1 and the integral of the absolute value of fg, and hence also the absolute value of the integral of fg, is at most the L^p norm of f times the L^q norm of g. For p equal to q equal to 2, this is the Cauchy-Schwarz inequality in the L^2 context.

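The proof of Holder's inequality uses Young's inequality: for non-negative reals a and b and finite conjugate exponents p and q (both strictly greater than 1), a times b is at most a to the p divided by p plus b to the q divided by q. Young's inequality follows from the concavity of the logarithm. Applying it pointwise to the normalized functions gives Holder's inequality. The endpoint case, p equal to 1 and q equal to infinity, follows directly from bounding the absolute value of g by its essential supremum.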
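
Written out, the normalization step looks as follows. This is a sketch for finite conjugate exponents, assuming both norms are nonzero (otherwise fg vanishes almost everywhere and there is nothing to prove); the capital letters F and G are notation introduced here.

```latex
\[
F = \frac{|f|}{\|f\|_p}, \qquad G = \frac{|g|}{\|g\|_q},
\qquad \text{so that} \qquad \int F^p \, d\mu = \int G^q \, d\mu = 1 .
\]
% Young's inequality applied pointwise, then integrated:
\[
\int F G \, d\mu
  \;\le\; \frac{1}{p} \int F^p \, d\mu + \frac{1}{q} \int G^q \, d\mu
  \;=\; \frac{1}{p} + \frac{1}{q} \;=\; 1 .
\]
% Multiplying through by the two norms:
\[
\int |f g| \, d\mu \;\le\; \|f\|_p \, \|g\|_q .
\]
```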

Minkowski's Inequality

For every exponent p from 1 to infinity (inclusive) and f, g in L^p, the L^p norm of f plus g is at most the L^p norm of f plus the L^p norm of g. This is the triangle inequality for the L^p norm. The proof (for finite p greater than 1) applies Holder's inequality to the expansion of the integral of the absolute value of f plus g to the p, bounding it using the L^p norms of f and g separately. For p equal to 1 and for p equal to infinity, the triangle inequality follows directly from the triangle inequality for real numbers.

The Riesz-Fischer Theorem: Completeness of L^p

The Riesz-Fischer theorem states that every Cauchy sequence in L^p (for every exponent p from 1 to infinity, inclusive) converges to an element of L^p in the L^p norm. That is, L^p is a Banach space: a complete normed vector space.

The standard proof for finite p is as follows. Given a Cauchy sequence (f sub n), extract a rapidly converging subsequence (f sub n sub k) with the L^p norm of f sub n sub (k plus 1) minus f sub n sub k less than 1 divided by 2 to the k for each k. Define g as the sum over k of the absolute values of f sub n sub (k plus 1) minus f sub n sub k. By Minkowski and the monotone convergence theorem, the L^p norm of g is finite, so g is finite almost everywhere. Therefore the telescoping series converges absolutely almost everywhere to a limit f. Since the absolute value of f is at most g plus the absolute value of f sub n sub 1, f is in L^p. Since the absolute value of f minus f sub n sub k is at most g and tends to zero almost everywhere, dominated convergence (with dominating function g to the p) shows that the subsequence converges to f in L^p. Since the original sequence is Cauchy and a subsequence converges, the full sequence converges to the same limit.
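
The two estimates that drive this argument can be displayed explicitly. This is a sketch in standard notation; the partial sums g sub K are notation introduced here.

```latex
\[
g_K = \sum_{k=1}^{K} \bigl| f_{n_{k+1}} - f_{n_k} \bigr| ,
\qquad
\| g_K \|_p \;\le\; \sum_{k=1}^{K} \bigl\| f_{n_{k+1}} - f_{n_k} \bigr\|_p
  \;<\; \sum_{k=1}^{K} 2^{-k} \;<\; 1 .
\]
% Monotone convergence, since g_K^p increases pointwise to g^p:
\[
\int g^p \, d\mu \;=\; \lim_{K \to \infty} \int g_K^p \, d\mu \;\le\; 1 .
\]
% Rate of convergence of the subsequence, from the tail of the same series:
\[
\| f - f_{n_k} \|_p
  \;\le\; \sum_{j \ge k} \bigl\| f_{n_{j+1}} - f_{n_j} \bigr\|_p
  \;\le\; \sum_{j \ge k} 2^{-j} \;=\; 2^{\,1-k} .
\]
```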

L^2 as a Hilbert Space

The space L^2(mu) is a Hilbert space with inner product given by the integral of f times the complex conjugate of g (for real-valued functions, simply the integral of f times g). The L^2 norm is the square root of the inner product of f with itself. Orthonormal bases exist: for L^2([0, 1] with Lebesgue measure), the functions e to the 2 pi i n x for integer n form a complete orthonormal set, and every L^2 function has a convergent Fourier series in L^2. The Parseval identity states that the L^2 norm squared equals the sum of the squared absolute values of the Fourier coefficients.

Duality of L^p Spaces

For p strictly between 1 and infinity, the dual space of L^p (the space of bounded linear functionals on L^p) is isometrically isomorphic to L^q where p and q are conjugate exponents. Every bounded linear functional I on L^p has the form I(f) equal to the integral of fg for a unique g in L^q, and the norm of I as a functional equals the L^q norm of g. For p equal to 1 and mu sigma-finite, the dual is L^infinity; the dual of L^infinity, by contrast, is in general strictly larger than L^1.

7. Signed Measures and Hahn-Jordan Decomposition

A signed measure on (X, M) is a countably additive function from M to the extended reals satisfying nu of the empty set equals zero. At most one of plus infinity and minus infinity may be in the range (to avoid indeterminate forms of the type infinity minus infinity). Signed measures arise naturally as differences of two measures, as indefinite integrals of signed functions, and in the statement of the Radon-Nikodym theorem.

Positive and Negative Sets

A set P is positive (for nu) if nu(E) is non-negative for every measurable subset E of P. A set N is negative if nu(E) is non-positive for every measurable subset E of N. A null set is one that is both positive and negative — every measurable subset has measure zero.

Hahn Decomposition Theorem

Every signed measure nu on (X, M) admits a Hahn decomposition: there exist disjoint measurable sets P and N with P union N equal to X such that P is positive and N is negative for nu. The pair (P, N) is called a Hahn decomposition. It is not unique, but it is essentially unique: if (P', N') is another Hahn decomposition, then the symmetric difference of P and P' (equivalently, of N and N') is a null set for nu.

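The proof, assuming without loss of generality that nu never takes the value minus infinity, proceeds as follows. Let m be the infimum of nu(B) over all negative sets B, choose negative sets B sub j with nu(B sub j) tending to m, and let N be their union. Then N is negative and nu(N) equals m, so m is finite. Set P equal to the complement of N. If P contained a set of strictly negative measure, a sequential exhaustion argument would produce a negative subset of it with strictly negative measure, and adjoining that subset to N would give a negative set of measure below m, a contradiction. Hence P is positive.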

Jordan Decomposition

Given a Hahn decomposition (P, N), define nu-plus(E) as nu(E intersected with P) and nu-minus(E) as negative nu(E intersected with N). Then nu-plus and nu-minus are non-negative measures, at least one of which is finite, and nu equals nu-plus minus nu-minus. This is the Jordan decomposition of nu. It is the unique way to write nu as a difference of two mutually singular non-negative measures, and it is minimal: if nu equals mu-1 minus mu-2 for non-negative measures mu-1 and mu-2, then nu-plus is at most mu-1 and nu-minus is at most mu-2, with equality if and only if mu-1 and mu-2 are mutually singular.

The total variation measure of nu is defined as the absolute value of nu, equal to nu-plus plus nu-minus. The total variation norm of nu is the total variation of X, written as the absolute value of nu of X. The collection of all signed measures of finite total variation on (X, M) forms a Banach space under the total variation norm.
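
On a finite set, where a signed measure is determined by the signed mass it gives each point, the Hahn and Jordan decompositions can be read off directly. The sketch below assumes that setting; the function names and the example weights are hypothetical.

```python
def hahn_decomposition(weights):
    """Split a finite set into a positive set P and a negative set N.
    Points of mass zero are placed in P; putting them in N would be
    equally valid, which is the non-uniqueness up to null sets."""
    P = {x for x, w in weights.items() if w >= 0}
    N = set(weights) - P
    return P, N

def nu(weights, E):
    """Signed measure of a subset E: the sum of its point masses."""
    return sum(weights[x] for x in E)

def jordan_decomposition(weights, E):
    """Return (nu_plus(E), nu_minus(E)) computed from the Hahn decomposition."""
    P, N = hahn_decomposition(weights)
    E = set(E)
    return nu(weights, E & P), -nu(weights, E & N)

weights = {"a": 3, "b": -2, "c": 0, "d": -1}
P, N = hahn_decomposition(weights)
plus, minus = jordan_decomposition(weights, weights.keys())
total_variation = plus + minus
print(sorted(P), sorted(N), plus, minus, total_variation)
```

Here nu of the whole set is zero while the total variation is 6, which illustrates why the total variation rather than nu itself is used to define the norm.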

Mutual Singularity

Two measures mu and nu are mutually singular (written mu is perpendicular to nu) if there exists a measurable set E such that mu(E) equals zero and nu of the complement of E equals zero. In other words, the two measures live on disjoint sets. In the Jordan decomposition, nu-plus and nu-minus are mutually singular: nu-plus lives on P and nu-minus lives on N. Mutual singularity is one of the two extreme cases in the Lebesgue decomposition theorem.

Lebesgue Decomposition Theorem

If nu and mu are sigma-finite measures on (X, M), then there is a unique decomposition nu equal to nu-AC plus nu-S, where nu-AC is absolutely continuous with respect to mu and nu-S is mutually singular with mu. The absolutely continuous part has a Radon-Nikodym derivative with respect to mu, while the singular part is concentrated on a mu-null set. For Borel measures on R that are finite on bounded intervals, a measure can be decomposed into absolutely continuous (with density with respect to Lebesgue measure), discrete (point masses), and singular continuous parts; the last is concentrated on a set of Lebesgue measure zero but has no point masses, as with the Cantor measure on the Cantor set.

8. Absolute Continuity and the Radon-Nikodym Theorem

Absolute continuity of measures is the key hypothesis under which one measure can be expressed as an integral with respect to another. A measure nu is absolutely continuous with respect to mu (written nu is much less than mu, or nu is AC with respect to mu) if mu(E) equal to zero implies nu(E) equal to zero for every measurable E.

Radon-Nikodym Theorem

Let mu and nu be sigma-finite measures on (X, M) with nu absolutely continuous with respect to mu. Then there exists a non-negative measurable function f (the Radon-Nikodym derivative) such that for every measurable set E, nu(E) equals the integral over E of f with respect to mu. The function f is unique up to mu-almost-everywhere equality.

The function f is also called the density of nu with respect to mu. The notation d(nu) divided by d(mu) is used by analogy with the ordinary derivative. If nu is a probability measure that is absolutely continuous with respect to Lebesgue measure, its density is the probability density function in the sense of classical probability.

Proof via Hilbert Space Methods

The elegant modern proof uses the Hilbert space L^2(mu plus nu). Assume first that mu and nu are finite. Consider the functional L that sends a function h in L^2(mu plus nu) to the integral of h with respect to nu. Since nu is at most mu plus nu and mu plus nu is finite, the Cauchy-Schwarz inequality bounds the absolute value of L(h) by the square root of (mu plus nu)(X) times the L^2(mu plus nu) norm of h, so L is bounded. By the Riesz representation theorem for Hilbert spaces, there is a unique function g in L^2(mu plus nu) with L(h) equal to the integral of gh with respect to mu plus nu for all h.

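Analysis of g shows that 0 is at most g and g is at most 1 almost everywhere with respect to mu plus nu. Define f as g divided by 1 minus g on the set where g is strictly less than 1, and zero elsewhere (the set where g equals 1 has mu-measure zero, hence nu-measure zero by absolute continuity). Then nu(E) equals the integral over E of f with respect to mu. The Hilbert space argument applies directly when mu and nu are finite; the sigma-finite case follows by partitioning X into countably many sets on which both measures are finite and patching the resulting densities together. Sigma-finiteness cannot be dropped: Lebesgue measure on [0, 1] is absolutely continuous with respect to counting measure but has no density with respect to it.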
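The step from g to f can be written out as follows. This is a sketch under the assumption that mu and nu are finite; the test functions h chosen below are one common choice and not the only one.

```latex
% Representation from the Riesz theorem, for all h in L^2(\mu+\nu):
\[
  \int h \, d\nu \;=\; \int h\, g \, d(\mu + \nu)
  \quad\Longrightarrow\quad
  \int h\,(1 - g)\, d\nu \;=\; \int h\, g \, d\mu .
\]
% Taking h = \mathbf{1}_{\{g = 1\}} makes the left side vanish, so
\[
  \mu(\{g = 1\}) \;=\; \int_{\{g=1\}} g \, d\mu \;=\; 0,
  \qquad\text{hence } \nu(\{g = 1\}) = 0 \text{ since } \nu \ll \mu .
\]
% Taking h = \mathbf{1}_E \,(1 + g + \dots + g^{n}), which is bounded:
\[
  \int_E \bigl(1 - g^{\,n+1}\bigr)\, d\nu
  \;=\;
  \int_E g\,\bigl(1 + g + \dots + g^{n}\bigr)\, d\mu .
\]
% Let n \to \infty and apply monotone convergence on both sides
% (0 \le g < 1 almost everywhere for both measures):
\[
  \nu(E) \;=\; \int_E \frac{g}{1 - g}\, d\mu \;=\; \int_E f \, d\mu .
\]
```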

Chain Rule and Applications

Radon-Nikodym derivatives satisfy a chain rule: if nu is absolutely continuous with respect to mu, and lambda is absolutely continuous with respect to nu, then lambda is absolutely continuous with respect to mu, and d(lambda) divided by d(mu) equals d(lambda) divided by d(nu) times d(nu) divided by d(mu), with equality holding mu-almost everywhere.
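The chain rule can be checked in one line, using the fact that integrating a non-negative function against nu is the same as integrating it times d(nu) divided by d(mu) against mu. The sketch assumes all three measures are non-negative and sigma-finite.

```latex
% For every measurable set E:
\[
  \lambda(E)
  \;=\; \int_E \frac{d\lambda}{d\nu}\, d\nu
  \;=\; \int_E \frac{d\lambda}{d\nu}\,\frac{d\nu}{d\mu}\, d\mu .
\]
% Uniqueness of the Radon-Nikodym derivative then gives
\[
  \frac{d\lambda}{d\mu}
  \;=\; \frac{d\lambda}{d\nu}\,\frac{d\nu}{d\mu}
  \qquad \mu\text{-almost everywhere.}
\]
```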

In probability theory, conditional expectation is a Radon-Nikodym derivative. If (Omega, F, P) is a probability space and G is a sub-sigma-algebra of F, the conditional expectation of an integrable random variable X given G is the Radon-Nikodym derivative of the measure Q defined by Q(A) equal to the integral over A of X dP (for A in G) with respect to the restriction of P to G. When X takes both signs, Q is a finite signed measure, and the theorem is applied to its positive and negative parts separately.
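The simplest case makes the definition concrete. Assume, for illustration only, that G is generated by a finite partition A_1, ..., A_m of Omega with every P(A_i) strictly positive; this special case is an assumption made here and is not part of the general statement.

```latex
% Defining property of the conditional expectation (general case):
\[
  \int_A \mathbb{E}[X \mid \mathcal{G}] \, dP \;=\; \int_A X \, dP
  \qquad \text{for every } A \in \mathcal{G}.
\]
% Assumed special case: \mathcal{G} = \sigma(A_1, \dots, A_m), a finite partition
% with P(A_i) > 0. A \mathcal{G}-measurable function is constant on each A_i, so
\[
  \mathbb{E}[X \mid \mathcal{G}]
  \;=\; \sum_{i=1}^{m} \left( \frac{1}{P(A_i)} \int_{A_i} X \, dP \right) \mathbf{1}_{A_i}.
\]
```

On each cell of the partition, the conditional expectation is just the average of X over that cell.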

Absolute Continuity of Functions vs. Measures

A function F on [a, b] is absolutely continuous if and only if its distributional derivative is a function in L^1([a, b]) — equivalently, F is the integral of an L^1 function. This characterizes the functions to which the fundamental theorem of calculus applies: F'(x) exists almost everywhere, F' is in L^1, and F(b) minus F(a) equals the integral from a to b of F'. The class of absolutely continuous functions is strictly between the class of Lipschitz functions and the class of functions of bounded variation.

9. Product Measures and Fubini-Tonelli

The product of two measure spaces (X, M, mu) and (Y, N, nu) is constructed as follows. The product sigma-algebra M tensor N is the sigma-algebra on X times Y generated by all rectangles A times B with A in M and B in N. The product measure mu tensor nu is the unique measure on M tensor N satisfying (mu tensor nu) of A times B equal to mu(A) times nu(B) for all rectangles. Its existence and uniqueness (under sigma-finiteness) follow from the Caratheodory extension theorem.

Sections of Functions and Sets

For a function f on X times Y and a point x in X, the x-section of f is the function on Y defined by f sub x of y equal to f(x, y). Similarly for y-sections. For a measurable set E in M tensor N, every x-section and y-section is measurable (in N and M respectively). When nu is sigma-finite, the function from X to [0, infinity] defined by nu(E sub x) is measurable in M, and similarly for the y-sections when mu is sigma-finite. These section measurability results are the technical heart of Fubini-Tonelli.

Tonelli's Theorem

If (X, M, mu) and (Y, N, nu) are sigma-finite measure spaces and f is a non-negative M tensor N-measurable function, then: the function from x to the integral of f sub x with respect to nu is M-measurable; the function from y to the integral of f(x, y) with respect to mu(x) is N-measurable; and the double integral of f with respect to mu tensor nu equals the iterated integral obtained by integrating first over Y then over X, and also equals the iterated integral obtained in the opposite order. No integrability hypothesis is required — the equality holds with all terms possibly infinite.

Fubini's Theorem

If (X, M, mu) and (Y, N, nu) are sigma-finite and f is integrable on (X times Y, M tensor N, mu tensor nu), then for mu-almost every x, the section f sub x is nu-integrable; the function x to the integral of f sub x d(nu), defined for mu-almost every x, is mu-integrable; and the double integral equals both iterated integrals. The hypothesis of integrability on the product is essential.

The practical strategy for applying these theorems: given a function f of possibly mixed sign, first apply Tonelli to the absolute value of f to check whether f is integrable on the product (if one iterated integral of the absolute value is finite, f is in L^1 of the product). If so, apply Fubini to switch the order of integration.

Counterexample: Fubini Fails Without Integrability

On the unit square [0, 1] times [0, 1], define f(x, y) equal to (x squared minus y squared) divided by (x squared plus y squared) to the second power at points other than the origin, and zero at the origin. The integral of f(x, y) dy from 0 to 1 is 1 divided by (1 plus x squared), so the outer integral over x from 0 to 1 gives pi divided by 4. But the integral of f(x, y) dx from 0 to 1 is negative 1 divided by (1 plus y squared), so the outer integral over y gives negative pi divided by 4. The two iterated integrals disagree because the integral of the absolute value of f over the product is infinite — the Fubini hypothesis fails.
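The two inner integrals can be verified by exhibiting explicit antiderivatives, and the failure of integrability can be seen by integrating over the triangle where f is positive. The computation below is a sketch of that check.

```latex
% Antiderivatives in each variable (valid away from the origin):
\[
  \frac{\partial}{\partial y}\,\frac{y}{x^2 + y^2}
  \;=\; \frac{x^2 - y^2}{(x^2 + y^2)^2},
  \qquad
  \frac{\partial}{\partial x}\,\frac{-x}{x^2 + y^2}
  \;=\; \frac{x^2 - y^2}{(x^2 + y^2)^2}.
\]
% Integrating in y first, then x:
\[
  \int_0^1 \!\! \int_0^1 f(x,y)\, dy\, dx
  \;=\; \int_0^1 \frac{1}{1 + x^2}\, dx
  \;=\; \frac{\pi}{4}.
\]
% Integrating in x first, then y:
\[
  \int_0^1 \!\! \int_0^1 f(x,y)\, dx\, dy
  \;=\; \int_0^1 \frac{-1}{1 + y^2}\, dy
  \;=\; -\frac{\pi}{4}.
\]
% Non-integrability: f > 0 on the triangle 0 < y < x, and by Tonelli
\[
  \iint_{[0,1]^2} |f|
  \;\ge\; \int_0^1 \!\! \int_0^x f(x,y)\, dy\, dx
  \;=\; \int_0^1 \frac{x}{x^2 + x^2}\, dx
  \;=\; \int_0^1 \frac{1}{2x}\, dx
  \;=\; \infty .
\]
```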

10. Regularity of Borel Measures and the Riesz Representation Theorem

On topological spaces, the Borel sigma-algebra is generated by the open sets. A Borel measure is a measure on the Borel sigma-algebra. Regularity conditions describe how well a Borel measure is approximated by compact and open sets, connecting the measure-theoretic and topological structures.

Outer and Inner Regularity

A Borel measure mu on a locally compact Hausdorff space X is called outer regular on a Borel set E if mu(E) equals the infimum of mu(U) over all open sets U containing E. It is called inner regular (or tight) on E if mu(E) equals the supremum of mu(K) over all compact sets K contained in E. A Borel measure that is finite on compact sets, outer regular on all Borel sets, and inner regular on all open sets is called a Radon measure. On second countable locally compact Hausdorff spaces (such as R to the n), every Borel measure that is finite on compact sets is a Radon measure.

Riesz Representation Theorem

Let X be a locally compact Hausdorff space and let I be a positive linear functional on the space of continuous, compactly supported real-valued functions on X. Then there exists a unique Radon measure mu on the Borel sigma-algebra of X such that I(f) equals the integral of f with respect to mu for every continuous compactly supported function f.

The proof constructs the measure using I: define the outer measure of an open set U as the supremum of I(f) over all continuous compactly supported functions f with values in [0, 1] and support contained in U. The outer measure of a general set E is the infimum of the outer measures of open sets containing E. The Caratheodory argument then yields the desired measure.

The Riesz representation theorem has several important corollaries: the dual of the space of continuous functions vanishing at infinity on a locally compact Hausdorff space is the space of finite signed Radon measures under the total variation norm (the positive case extends to signed measures by Jordan decomposition); spectral measures in functional analysis and quantum mechanics arise as Radon measures via the spectral theorem; and the construction of Haar measure on locally compact groups proceeds via a Riesz-type argument applied to the space of continuous compactly supported functions on the group.

Regularity of Lebesgue Measure

Lebesgue measure on R to the n is a Radon measure. Every Lebesgue measurable set E satisfies: the measure of E equals the infimum of the measures of open sets containing E (outer regularity), and equals the supremum of the measures of compact sets contained in E (inner regularity). This regularity is what makes Lebesgue measure well-behaved for approximation arguments — you can always approximate a measurable set from the outside by open sets and from the inside by compact sets.

11. Differentiation of Measures and the Lebesgue Differentiation Theorem

A central question in analysis is: given a measure nu on R to the n, in what sense can we differentiate nu with respect to Lebesgue measure lambda? The theory of differentiation of measures answers this and yields the Lebesgue differentiation theorem as a special case.

The Hardy-Littlewood Maximal Function

The Hardy-Littlewood maximal function Mf of a locally integrable function f on R to the n is defined at each point x as the supremum over all radii r greater than zero of the average of the absolute value of f over the open ball of radius r centered at x. That is, Mf(x) equals the supremum over r of (1 divided by the Lebesgue measure of the ball of radius r) times the integral of the absolute value of f over the ball of radius r centered at x.

The Hardy-Littlewood maximal inequality states: for any f in L^1 and any t greater than zero, the Lebesgue measure of the set where Mf exceeds t is at most C divided by t times the L^1 norm of f, where C depends only on the dimension n. This weak-type bound is the key tool for proving the Lebesgue differentiation theorem.
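In symbols, the definition and the inequality read as below. The value 3 to the n for the constant is the one produced by the usual Vitali covering argument; it is one admissible constant, not the sharp one. The one-dimensional example is added here for illustration.

```latex
% Definition, with \lambda Lebesgue measure and B(x,r) the open ball:
\[
  Mf(x) \;=\; \sup_{r > 0} \frac{1}{\lambda(B(x,r))} \int_{B(x,r)} |f| \, d\lambda .
\]
% Weak-type (1,1) inequality; C = 3^n is admissible (Vitali covering lemma):
\[
  \lambda\bigl(\{x : Mf(x) > t\}\bigr) \;\le\; \frac{3^{n}}{t}\, \|f\|_{L^1}
  \qquad (t > 0).
\]
% Assumed example in dimension n = 1: f = \mathbf{1}_{[0,1]}. For x > 1 the
% supremum is attained at r = x, giving
\[
  Mf(x) \;=\; \frac{1}{2x} \qquad (x > 1),
\]
% which is not integrable near infinity.
```

The example shows why only a weak-type bound is available: Mf can fail to be in L^1 even when f is bounded with compact support.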

Lebesgue Differentiation Theorem

If f is a locally integrable function on R to the n, then for almost every x (with respect to Lebesgue measure), the limit as r approaches zero of (1 divided by the Lebesgue measure of the ball of radius r centered at x) times the integral over the ball of radius r centered at x of f(y) d(lambda)(y) equals f(x).

The proof uses the Hardy-Littlewood maximal inequality. First the result is verified for continuous compactly supported functions (where it is trivial). Then for general f in L^1 local, approximate f by continuous functions and use the maximal inequality to show the approximation error at almost every point goes to zero.
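The approximation step can be sketched as follows. The notation A_r f for the ball average and the auxiliary function g are introduced here for the sketch; f is taken in L^1, which is harmless because the statement is local.

```latex
% Notation (assumed for this sketch): A_r f(x) is the average of f over B(x,r).
% Pick g continuous with compact support and \|f - g\|_{L^1} < \varepsilon. Then
\[
  |A_r f(x) - f(x)|
  \;\le\; |A_r (f - g)(x)| \;+\; |A_r g(x) - g(x)| \;+\; |g(x) - f(x)| .
\]
% The middle term tends to 0 as r \to 0 by continuity of g, so
\[
  \limsup_{r \to 0} |A_r f(x) - f(x)|
  \;\le\; M(f - g)(x) \;+\; |f(x) - g(x)| .
\]
% Maximal inequality for the first term, Chebyshev for the second:
\[
  \lambda\Bigl(\bigl\{ \limsup_{r \to 0} |A_r f - f| > 2t \bigr\}\Bigr)
  \;\le\; \frac{C}{t}\,\varepsilon \;+\; \frac{1}{t}\,\varepsilon .
\]
% Letting \varepsilon \to 0 shows the set has measure zero for every t > 0.
```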

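A point x at which the average over the ball of radius r centered at x of the absolute value of f(y) minus f(x) tends to zero as r approaches zero is called a Lebesgue point of f. At such a point the limit in the theorem exists and equals f(x). The stronger form of the theorem asserts that almost every point is a Lebesgue point.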

Differentiation of Monotone Functions

A fundamental consequence of the differentiation theory is that every monotone increasing function F on [a, b] is differentiable almost everywhere. The derivative F' is non-negative and integrable on [a, b], and the integral from a to b of F' is at most F(b) minus F(a), with equality if and only if F is absolutely continuous.

Functions of bounded variation are differences of monotone functions, so they too are differentiable almost everywhere. The derivative of a function of bounded variation need not recover the function via integration — this fails exactly when the measure associated to the function has a singular part. The Cantor function is the canonical example: it is monotone increasing, continuous, has derivative zero almost everywhere, yet increases from 0 to 1 on [0, 1]. The associated measure (the Cantor measure) is singular with respect to Lebesgue measure.

Fundamental Theorem of Calculus for Lebesgue Integration

The complete version of the fundamental theorem of calculus in the Lebesgue setting has two parts. The first part says: if f is in L^1([a, b]) and F(x) equals F(a) plus the integral from a to x of f(t) dt, then F is absolutely continuous on [a, b], F is differentiable almost everywhere, and F' equals f almost everywhere. The second part says: a function F is absolutely continuous on [a, b] if and only if it is of the form F(a) plus the integral from a to x of some L^1 function, in which case the derivative of F (which exists a.e.) is that L^1 function.

Differentiation of General Measures

More generally, if nu is a locally finite Borel measure on R to the n, define the symmetric derivative of nu with respect to Lebesgue measure at x as the limit (when it exists) of nu of the ball of radius r at x divided by lambda of the ball of radius r at x as r approaches zero. If nu is absolutely continuous with respect to Lebesgue measure with density f, the Lebesgue differentiation theorem says the symmetric derivative equals f almost everywhere. If nu is singular with respect to Lebesgue measure, the symmetric derivative equals zero Lebesgue-almost everywhere; this needs a separate covering argument based on the regularity of nu, not just the Lebesgue differentiation theorem. Combining the two cases through the Lebesgue decomposition, the symmetric derivative of a general nu equals the density of its absolutely continuous part almost everywhere.
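A point mass gives the simplest singular example. The choice of the unit point mass at the origin, written delta-zero, and the notation D nu for the symmetric derivative are assumptions made here for illustration.

```latex
% Symmetric derivative (notation D\nu assumed for this sketch):
\[
  D\nu(x) \;=\; \lim_{r \to 0} \frac{\nu(B(x,r))}{\lambda(B(x,r))} .
\]
% Assumed example: \nu = \delta_0. For x \ne 0 and r < |x| the ball misses the origin:
\[
  \frac{\delta_0(B(x,r))}{\lambda(B(x,r))} \;=\; 0
  \quad (r < |x|)
  \qquad\Longrightarrow\qquad
  D\delta_0(x) = 0 \ \text{ for all } x \neq 0 .
\]
% At the origin the ratio blows up:
\[
  \frac{\delta_0(B(0,r))}{\lambda(B(0,r))} \;=\; \frac{1}{\lambda(B(0,r))}
  \;\longrightarrow\; \infty \qquad (r \to 0).
\]
```

So the derivative is zero Lebesgue-almost everywhere, and the mass of the measure is invisible to it except on a Lebesgue-null set.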

12. Frequently Asked Questions

What is a generated sigma-algebra and how is it constructed?

Given any collection C of subsets of a set X, the sigma-algebra generated by C, written sigma(C), is the smallest sigma-algebra on X that contains every set in C. It is constructed as the intersection of all sigma-algebras on X that contain C — since the power set always qualifies, this intersection is over a nonempty family and is itself a sigma-algebra. The Borel sigma-algebra B(R) is generated by the open intervals. The Lebesgue sigma-algebra is the completion of B(R) with respect to Lebesgue measure, obtained by adding all subsets of Borel null sets.

How is Lebesgue measure constructed rigorously on the real line?

In three stages. First, the Lebesgue outer measure of a set E is the infimum over all countable covers of E by open intervals of the total length of the cover. This outer measure is defined on every subset of R. Second, a set E is Caratheodory measurable if for every set A, the outer measure of A equals the outer measure of A intersected with E plus the outer measure of A intersected with the complement of E. Third, the restriction of outer measure to the Caratheodory measurable sets is Lebesgue measure — a complete, translation-invariant, sigma-finite measure assigning length to every interval.

What is a signed measure and how does the Hahn decomposition theorem work?

A signed measure nu on (X, M) is a countably additive function from M to the extended reals, with nu of the empty set equal to zero, and at most one of the values plus infinity and minus infinity in its range. The Hahn decomposition theorem asserts that X can be partitioned into a positive set P (every measurable subset has non-negative nu-measure) and a negative set N (every measurable subset has non-positive nu-measure). From the Hahn decomposition, the Jordan decomposition writes nu as nu-plus minus nu-minus, where nu-plus(E) equals nu of E intersected with P and nu-minus(E) equals the negative of nu of E intersected with N. The total variation of nu is the measure nu-plus plus nu-minus.

What exactly does the Radon-Nikodym theorem assert and how is it proved?

The Radon-Nikodym theorem: if mu and nu are sigma-finite measures with nu absolutely continuous with respect to mu (meaning mu(E) equal to zero implies nu(E) equal to zero), then there exists a non-negative measurable function f such that nu(E) equals the integral over E of f d(mu) for every measurable E. The function f is the Radon-Nikodym derivative, written d(nu) divided by d(mu), and is unique mu-almost everywhere. The modern proof, for finite measures, uses the Hilbert space L^2(mu plus nu): the functional sending h to the integral of h d(nu) is bounded, so the Riesz theorem gives a representing element g, from which f is extracted by the formula f equals g divided by 1 minus g. The sigma-finite case follows by splitting X into countably many pieces of finite measure.

What is the Riesz-Fischer theorem and why does it prove L^p is complete?

The Riesz-Fischer theorem states that L^p(mu) is complete (a Banach space) for every 1 at most p at most infinity. The proof for finite p: given a Cauchy sequence, extract a subsequence (f sub n sub k) with L^p norm of consecutive differences less than 1 over 2 to the k. The sum g of the absolute values of these differences has finite L^p norm by Minkowski and monotone convergence, so g is finite a.e. The pointwise telescoping series then converges absolutely a.e. to a limit f. Since the absolute value of f is at most g plus the absolute value of f sub n sub 1, f is in L^p; since the absolute value of f minus f sub n sub k is at most g, dominated convergence (with dominator g to the p) gives that f sub n sub k converges to f in L^p. Since the original sequence is Cauchy, it converges to f in L^p as well. For p equal to infinity, a Cauchy sequence converges uniformly off a null set, which gives the limit directly.
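The estimates behind the finite-p argument can be laid out explicitly. The partial sums g_K are notation introduced here for the sketch.

```latex
% Subsequence chosen so that \|f_{n_{k+1}} - f_{n_k}\|_p < 2^{-k}.
% Partial sums of absolute differences (notation g_K assumed for this sketch):
\[
  g_K \;=\; \sum_{k=1}^{K} \bigl| f_{n_{k+1}} - f_{n_k} \bigr| ,
  \qquad
  \|g_K\|_p \;\le\; \sum_{k=1}^{K} 2^{-k} \;<\; 1
  \quad \text{(Minkowski).}
\]
% Monotone convergence applied to g_K^p \uparrow g^p:
\[
  \|g\|_p \;\le\; 1,
  \qquad \text{so } g < \infty \text{ almost everywhere.}
\]
% Telescoping limit and its tail:
\[
  f \;=\; f_{n_1} + \sum_{k=1}^{\infty} \bigl( f_{n_{k+1}} - f_{n_k} \bigr),
  \qquad
  |f - f_{n_k}| \;\le\; \sum_{j \ge k} \bigl| f_{n_{j+1}} - f_{n_j} \bigr| \;\le\; g .
\]
% Dominated convergence with dominator g^p:
\[
  \|f - f_{n_k}\|_p \;\longrightarrow\; 0 \qquad (k \to \infty).
\]
```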

What hypotheses are required for Fubini and Tonelli, and what can go wrong without them?

Tonelli's theorem requires only that f be non-negative and measurable on the product space — no integrability needed — and concludes that the double integral and both iterated integrals are equal (all possibly infinite). Fubini's theorem requires that f be integrable on the product (integral of absolute value of f with respect to the product measure is finite), and concludes equality of the double and iterated integrals. Without integrability, the iterated integrals can both exist but disagree. The canonical counterexample is f(x, y) equal to (x squared minus y squared) divided by (x squared plus y squared) to the second power on the unit square: the two iterated integrals are pi divided by 4 and negative pi divided by 4 respectively, because f is not absolutely integrable on the product.

What does the Lebesgue differentiation theorem say and what is its significance?

For any locally integrable function f on R to the n, at almost every point x, the limit as r goes to zero of the average of f over the ball of radius r centered at x equals f(x). In fact almost every point is a Lebesgue point of f, meaning that the average over the ball of the absolute value of f(y) minus f(x) tends to zero. The theorem has two major consequences: every monotone function on R is differentiable almost everywhere; and the derivative of the indefinite integral of an L^1 function equals the function almost everywhere, which is the measure-theoretic fundamental theorem of calculus. The proof uses the Hardy-Littlewood maximal inequality as its main tool.

What is the Riesz representation theorem for measures and how does it connect functionals to measures?

On a locally compact Hausdorff space X, every positive linear functional I on the space of continuous compactly supported functions corresponds to a unique Radon measure mu satisfying I(f) equals the integral of f d(mu) for all such f. The theorem identifies positive linear functionals with Radon measures. Bounded linear functionals on the continuous functions vanishing at infinity correspond (by Jordan decomposition) to finite signed Radon measures, and complex-valued bounded functionals correspond to complex Radon measures. This bridge between functional analysis and measure theory underlies the spectral theorem for self-adjoint operators, the definition of distributions, and the construction of Haar measure on locally compact groups.

How does absolute continuity of a function relate to absolute continuity of its associated measure?

A function F on [a, b] is absolutely continuous in the classical sense if for every epsilon greater than zero there exists delta greater than zero such that for any finite collection of disjoint subintervals of total length less than delta, the sum of the oscillations of F on those subintervals is less than epsilon. A Borel measure nu on [a, b] is absolutely continuous with respect to Lebesgue measure if every set of Lebesgue measure zero has nu-measure zero. The two notions match: F is absolutely continuous on [a, b] if and only if F(x) equals F(a) plus the integral from a to x of some L^1 function f. When F is also nondecreasing, the measure nu with nu([a, x]) equal to F(x) minus F(a) is absolutely continuous with respect to Lebesgue measure with Radon-Nikodym derivative f; for general F the same holds with nu a finite signed measure.

What is Fatou's lemma and why is it useful even when the limit function is not integrable?

Fatou's lemma: for any sequence (f sub n) of non-negative measurable functions, the integral of the limit inferior of f sub n is at most the limit inferior of the integrals of f sub n. No integrability hypothesis is needed. The inequality can be strict: if f sub n is the indicator function of [n, n plus 1], each integral is 1 but the liminf of f sub n is identically zero. The lemma is the foundation from which the monotone convergence theorem and dominated convergence theorem are derived. It is useful for establishing lower bounds on integrals of limits, and for showing that limit functions lie in L^p even when no single dominating function is available — if the liminf of the integrals of the p-th powers of f sub n is finite, then the liminf function is in L^p.
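Both the strict-inequality example and the L^p application can be written in symbols. The sketch assumes p is finite and positive and that the functions are non-negative, so that taking p-th powers commutes with the limit inferior.

```latex
% Fatou's lemma for non-negative measurable f_n:
\[
  \int \liminf_{n \to \infty} f_n \, d\mu
  \;\le\; \liminf_{n \to \infty} \int f_n \, d\mu .
\]
% Strict inequality: f_n = \mathbf{1}_{[n,\,n+1]} on \mathbb{R} with Lebesgue measure.
\[
  \int \liminf_{n \to \infty} \mathbf{1}_{[n,\,n+1]} \, d\lambda \;=\; 0
  \;<\; 1 \;=\; \liminf_{n \to \infty} \int \mathbf{1}_{[n,\,n+1]} \, d\lambda .
\]
% L^p application (assumes 0 < p < \infty and f_n \ge 0, so that
% (\liminf f_n)^p = \liminf f_n^p because t \mapsto t^p is continuous and increasing):
\[
  \int \Bigl( \liminf_{n \to \infty} f_n \Bigr)^{p} d\mu
  \;\le\; \liminf_{n \to \infty} \int f_n^{\,p} \, d\mu .
\]
```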
