Martingales: concentration inequalities & sequential analysis (2018)

Course syllabus

Time: Tue, Thu 10:30-11:50am
Location: Wean 4625

Participants This course is intended for advanced PhD students with strong mathematical background.
Prerequisites The first mini is a prerequisite for the second mini. There are no formal prerequisites for the first mini, but non-PhD students must email the instructor for permission to enroll in the course. Students are expected to have completed at least one intermediate statistics course (like 10-705 at CMU), and preferably an advanced statistics course. Students must be familiar with

basic concentration inequalities (such as Markov, Chebysheff, Hoeffding)
basics of martingales (filtrations, stopping times, Doob’s maximal inequalities and optional stopping theorems)
basics of testing and estimation (probability ratio tests, risk, type-1 error, power)
basics of convex analysis (strict convexity, Legendre-Fenchel transform)

36-771: Martingales 1 (Concentration inequalities)
Martingales are a central topic in statistics, but are even more relevant today due to modern applications to sequential learning and decision making problems. The first mini will present a unified derivation of a wide-variety of new and old concentration inequalities for martingales. We will prove inequalities for scalars and matrices, that hold under a wide variety of nonparametric assumptions. Note to those who have already taken Advanced Probability or Advanced Statistics, you will also learn about (a) self-normalized exponential concentration inequalities for heavy-tailed distributions like those with only two moments, and even for those with no moments (like the Cauchy), (b) concentration of continuous time processes like Brownian motions (and possibly, Poisson processes, Levy processes), (c) concentration of martingales in smooth Banach spaces (useful for vector and matrix concentration).

36-772: Martingales 2 (Sequential analysis)
The second mini will focus on deriving guarantees for a variety of important problems in sequential analysis using the tools developed in the first mini, as well as new tools such as uniform nonasymptotic versions of the law of the iterated logarithm for scalars and matrices. Applications include sequential analogs of the t-test that are valid without a Gaussian assumption, best-arm identification in multi-armed bandits, average treatment effect estimation in sequential clinical trials, sequential covariance matrix estimation, and other such problems.

Lecture notes

L0: Classical asymptotic and nonasymptotic theorems
L1: Filtrations and stopping times
L2: Martingales, Doob's and Ville's maximal inequalities
L3: The canonical supermartingale assumption, and sub-psi processes
L4: The mother theorem
L5: Proof of the mother theorem
L6: Matrix exponentials, Lieb's inequality, proof of connector lemma
L7: Sufficient conditions for Assumption 1
L8: Improving Cramer-Chernoff's and Freedman's inequalities (and matrix extensions by Tropp), Hermitian dilation
L9,10: Improving Pinelis' inequalities in Banach spaces
L11: Improving concentration of continuous-time martingales (Poisson, Levy processes, etc)
L12: Improving de la Pena's (and Bercu/Delyon/Rio's) self-normalized inequalities
L13: Proving Ville's inequality and summary of mini

L14: Confidence sequences
L15: The inadequacy of linear boundaries
L16: Stitching for subGamma/subGaussian LIL boundaries
L17: Universality of subGamma boundaries
L18: Conjugate mixture boundaries
L19: Discrete mixtures and inverted stitching
L20: Relationship to sequential hypothesis testing, always valid p-values
L21: Best-arm identification in multi-armed bandits (no notes)
L22: Sequential average treatment effect estimation (see paper)
L23: Sequential covariance matrix estimation, matrix LIL (see paper)
L24: Extensions to continuous time and Banach spaces, and summary of mini (see paper)
L25: Project presentations

References: There is no single textbook we will follow. Some related textbooks include:

Large deviations techniques and applications, by Dembo and Zeitouni
Stopped random walks: limit theorems and applications, by Gut
Self-normalized processes, by de la Pena, Lai and Shao
Concentration inequalities for sums and martingales, by Bercu, Delyon and Rio
Concentration inequalities: a nonasymptotic theory of independence, by Boucheron, Lugosi and Massart

Two relevant papers are: Exponential line crossing inequalities and Uniform, nonparametric, nonasymptotic confidence sequences.

The one and only homework for Mini 1
The one and only homework for Mini 2

Some interesting martingale history