Martingales: concentration inequalities & sequential analysis (2018)
Course syllabus
Time: Tue, Thu 10:30-11:50am
Location: Wean 4625
Participants This course is intended for advanced PhD students with strong mathematical background.
Prerequisites The first mini is a prerequisite for the second mini. There are no formal prerequisites for the first
mini, but non-PhD students must email the instructor for permission to enroll in the course. Students are expected
to have completed at least one intermediate statistics course (like 10-705 at CMU), and preferably an advanced
statistics course. Students must be familiar with
- basic concentration inequalities (such as Markov, Chebysheff, Hoeffding)
- basics of martingales (filtrations, stopping times, Doob’s maximal inequalities and optional stopping theorems)
- basics of testing and estimation (probability ratio tests, risk, type-1 error, power)
- basics of convex analysis (strict convexity, Legendre-Fenchel transform)
36-771: Martingales 1 (Concentration inequalities)
Martingales are a central topic in statistics, but are even more relevant today due to modern applications to sequential learning and decision making problems. The first mini will present a unified derivation of a wide-variety of new and old concentration inequalities for martingales. We
will prove inequalities for scalars and matrices, that hold under a wide variety of nonparametric assumptions.
Note to those who have already taken Advanced Probability or Advanced Statistics, you will also learn about
(a) self-normalized exponential concentration inequalities for heavy-tailed distributions like those with only two
moments, and even for those with no moments (like the Cauchy), (b) concentration of continuous time processes
like Brownian motions (and possibly, Poisson processes, Levy processes), (c) concentration of martingales in smooth
Banach spaces (useful for vector and matrix concentration).
36-772: Martingales 2 (Sequential analysis)
The second mini will focus on deriving guarantees for a variety
of important problems in sequential analysis using the tools developed in the first mini, as well as new tools such
as uniform nonasymptotic versions of the law of the iterated logarithm for scalars and matrices. Applications
include sequential analogs of the t-test that are valid without a Gaussian assumption, best-arm identification in
multi-armed bandits, average treatment effect estimation in sequential clinical trials, sequential covariance matrix estimation, and other such problems.
Lecture notes
L0: Classical asymptotic and nonasymptotic theorems
L1: Filtrations and stopping times
L2: Martingales, Doob's and Ville's maximal inequalities
L3: The canonical supermartingale assumption, and sub-psi processes
L4: The mother theorem
L5: Proof of the mother theorem
L6: Matrix exponentials, Lieb's inequality, proof of connector lemma
L7: Sufficient conditions for Assumption 1
L8: Improving Cramer-Chernoff's and Freedman's inequalities (and matrix extensions by Tropp), Hermitian dilation
L9,10: Improving Pinelis' inequalities in Banach spaces
L11: Improving concentration of continuous-time martingales (Poisson, Levy processes, etc)
L12: Improving de la Pena's (and Bercu/Delyon/Rio's) self-normalized inequalities
L13: Proving Ville's inequality and summary of mini
L14: Confidence sequences
L15: The inadequacy of linear boundaries
L16: Stitching for subGamma/subGaussian LIL boundaries
L17: Universality of subGamma boundaries
L18: Conjugate mixture boundaries
L19: Discrete mixtures and inverted stitching
L20: Relationship to sequential hypothesis testing, always valid p-values
L21: Best-arm identification in multi-armed bandits (no notes)
L22: Sequential average treatment effect estimation (see paper)
L23: Sequential covariance matrix estimation, matrix LIL (see paper)
L24: Extensions to continuous time and Banach spaces, and summary of mini (see paper)
L25: Project presentations
References:
There is no single textbook we will follow. Some related textbooks include:
- Large deviations techniques and applications, by Dembo and Zeitouni
- Stopped random walks: limit theorems and applications, by Gut
- Self-normalized processes, by de la Pena, Lai and Shao
- Concentration inequalities for sums and martingales, by Bercu, Delyon and Rio
- Concentration inequalities: a nonasymptotic theory of independence, by Boucheron, Lugosi and Massart
Two relevant papers are: Exponential line crossing inequalities and
Uniform, nonparametric, nonasymptotic confidence sequences.
The one and only homework for Mini 1
The one and only homework for Mini 2
Some interesting martingale history