Deviation Inequalities II: From the Chernoff Bounds to the Bounded-Difference Inequality

36-465/665, Spring 2021

18 February 2021 (Lecture 6)

\[ \newcommand{\Prob}[1]{\mathbb{P}\left( #1 \right)} \newcommand{\Expect}[1]{\mathbb{E}\left[ #1 \right]} \newcommand{\Var}[1]{\mathrm{Var}\left[ #1 \right]} \newcommand{\Cov}[1]{\mathrm{Cov}\left[ #1 \right]} \newcommand{\Risk}{r} \newcommand{\EmpRisk}{\hat{r}} \newcommand{\Loss}{\ell} \newcommand{\OptimalStrategy}{\sigma} \DeclareMathOperator*{\argmin}{argmin} \newcommand{\ModelClass}{S} \newcommand{\OptimalModel}{s^*} \DeclareMathOperator{\tr}{tr} \newcommand{\Indicator}[1]{\mathbb{1}\left\{ #1 \right\}} \newcommand{\myexp}[1]{\exp{\left( #1 \right)}} \]

Last time

This time

Sums of INID variables

Hoeffding’s inequality

Hoeffding’s bound: if \(Z \in [a,b]\) with \(\Expect{Z}=\mu\), then \(M_{Z}(t) \leq e^{t\mu} e^{t^2(b-a)^2/8}\)
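As a quick numerical sanity check (not part of the slides), the sketch below compares the exact MGF of a Bernoulli variable to Hoeffding’s bound; the success probability \(p\) and the grid of \(t\) values are arbitrary illustrative choices.

```python
# Illustrative check of Hoeffding's MGF bound (an assumption-labelled sketch):
# for Z ~ Bernoulli(p), bounded in [a, b] = [0, 1] with mean mu = p, the exact
# MGF is (1 - p) + p * e^t, and the bound is exp(t*mu) * exp(t^2 * (b-a)^2 / 8).
import numpy as np

p = 0.3                      # illustrative mean of the Bernoulli variable
a, b = 0.0, 1.0              # support bounds
t_grid = np.linspace(-5, 5, 101)

mgf_exact = (1 - p) + p * np.exp(t_grid)
mgf_bound = np.exp(t_grid * p) * np.exp(t_grid**2 * (b - a)**2 / 8)

assert np.all(mgf_exact <= mgf_bound + 1e-12)
print("max ratio exact/bound:", np.max(mgf_exact / mgf_bound))
```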

Hoeffding’s inequality (2)

Hoeffding’s inequality (3)

Example: log loss

Example: log loss (2)

What about functions that aren’t sums or averages?

The Efron-Stein Inequality

Suppose \(X_1, \ldots, X_n\) are independent but not necessarily identically distributed, and \(Z=f(X_1, \ldots, X_n)\). Let \(Y_1, \ldots, Y_n\) be independent copies of the \(X_i\) (each \(Y_i\) has the same distribution as \(X_i\) and is independent of everything else), and define \(Z_i=f(X_1, \ldots, X_{i-1}, Y_i, X_{i+1}, \ldots, X_n)\). Then \[ \Var{Z} \leq \frac{1}{2}\sum_{i=1}^{n}{\Expect{(Z-Z_i)^2}} \]
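A Monte Carlo sketch may make the inequality concrete; the choice \(Z=\max_i X_i\) with \(X_i \sim \mathrm{Uniform}(0,1)\), the sample sizes, and the seed are illustrative assumptions, not from the lecture.

```python
# Monte Carlo illustration of the Efron-Stein inequality (illustrative sketch):
# take Z = f(X_1, ..., X_n) = max_i X_i with X_i ~ Uniform(0, 1), draw
# independent copies Y_i, and compare Var[Z] to (1/2) * sum_i E[(Z - Z_i)^2],
# where Z_i replaces the i-th coordinate by its copy.
import numpy as np

rng = np.random.default_rng(465)
n, n_sim = 10, 200_000

X = rng.uniform(size=(n_sim, n))
Y = rng.uniform(size=(n_sim, n))
Z = X.max(axis=1)

es_sum = 0.0
for i in range(n):
    X_i = X.copy()
    X_i[:, i] = Y[:, i]          # swap in the independent copy of coordinate i
    Z_i = X_i.max(axis=1)
    es_sum += np.mean((Z - Z_i) ** 2)

print("Var[Z]            :", Z.var())
print("Efron-Stein bound :", 0.5 * es_sum)
```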

The bounded difference property

Bounded difference + Efron-Stein

Hoeffding + bounded differences = McDiarmid

The bounded difference or McDiarmid inequality

If \(f\) has the bounded difference property with constants \(d_1, \ldots, d_n\), then \[ \Prob{|Z-\Expect{Z}|\geq\epsilon} \leq 2\myexp{-\frac{2\epsilon^2}{\sum_{i=1}^{n}{d_i^2}}} \]
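To see how the bound behaves, here is a hedged simulation for the simplest case, the sample mean of \(n\) variables in \([0,1]\), so that every \(d_i = 1/n\) and \(\sum_i d_i^2 = 1/n\); the distribution, \(n\), \(\epsilon\), and seed are illustrative choices.

```python
# Compare the bounded-difference (McDiarmid) bound to observed deviation
# frequencies for the sample mean of n variables in [0, 1] (illustrative sketch):
# each coordinate can change the mean by at most 1/n, so the bound is
# 2 * exp(-2 * n * eps^2).
import numpy as np

rng = np.random.default_rng(665)
n, n_sim, eps = 100, 100_000, 0.1

X = rng.uniform(size=(n_sim, n))   # X_i ~ Uniform(0, 1), bounded in [0, 1]
Z = X.mean(axis=1)                 # Z = f(X_1, ..., X_n), with E[Z] = 0.5

freq = np.mean(np.abs(Z - 0.5) >= eps)
bound = 2 * np.exp(-2 * n * eps**2)

print("observed P(|Z - E[Z]| >= eps):", freq)
print("McDiarmid bound              :", bound)
```

Because the bound uses only boundedness and independence, not the actual distribution, it is typically much looser than the observed deviation frequency.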

Making the bounded-differences inequality useful

Example: Empirical vs. true distribution

Example: Empirical vs. true distribution (2)

Summing up

Going forward: