Low-Regret Learning I

36-465/665, Spring 2021

27 April 2021 (Lecture 23)

\[ \newcommand{\Prob}[1]{\mathbb{P}\left( #1 \right)} \newcommand{\Expect}[1]{\mathbb{E}\left[ #1 \right]} \newcommand{\Var}[1]{\mathrm{Var}\left[ #1 \right]} \newcommand{\Cov}[1]{\mathrm{Cov}\left[ #1 \right]} \newcommand{\Risk}{r} \newcommand{\EmpRisk}{\hat{\Risk}} \newcommand{\Loss}{\ell} \newcommand{\OptimalStrategy}{\sigma} \DeclareMathOperator*{\argmin}{argmin} \newcommand{\ModelClass}{S} \newcommand{\OptimalModel}{s^*} \DeclareMathOperator{\tr}{tr} \newcommand{\Indicator}[1]{\mathbb{1}\left\{ #1 \right\}} \newcommand{\myexp}[1]{\exp{\left( #1 \right)}} \newcommand{\eqdist}{\stackrel{d}{=}} \newcommand{\Rademacher}{\mathcal{R}} \newcommand{\EmpRademacher}{\hat{\Rademacher}} \newcommand{\Growth}{\Pi} \newcommand{\VCD}{\mathrm{VCdim}} \newcommand{\OptDomain}{\Theta} \newcommand{\OptDim}{p} \newcommand{\optimand}{\theta} \newcommand{\altoptimand}{\optimand^{\prime}} \newcommand{\ObjFunc}{M} \newcommand{\outputoptimand}{\optimand_{\mathrm{out}}} \newcommand{\Hessian}{\mathbf{h}} \newcommand{\Penalty}{\Omega} \newcommand{\Lagrangian}{\mathcal{L}} \newcommand{\HoldoutRisk}{\tilde{\Risk}} \DeclareMathOperator{\sgn}{sgn} \newcommand{\Margin}{M} \newcommand{\CumLoss}{L} \newcommand{\EnsembleAction}{\overline{a}} \newcommand{\CumEnsembleLoss}{\overline{\CumLoss}} \newcommand{\Regret}{R} \]

Previously

Low regret learning

Risk vs. Regret

The game (“prediction with expert advice”)

Multiplicative weight training

Regret of the ensemble (vs. the best expert)

Sub-linear regret

A regret bound

If \(\Loss(y,a) \in [0,1]\) and is convex in \(a\), and we use multiplicative weight training with learning rate \(\beta\) over \(q\) experts, then \[ \Regret_n \leq \frac{n\beta}{8} + \frac{\log{q}}{\beta} \] and, with the right choice of \(\beta\), \[ \Regret_n \leq \sqrt{\frac{n}{2}\log{q}} \]
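As a concrete illustration (not from the lecture), here is a minimal sketch of multiplicative weight training in Python. It assumes squared loss, which is convex in the action and stays in \([0,1]\) when outcomes and predictions do; the function name and interface are my own.

```python
import numpy as np

def multiplicative_weights(expert_preds, y, beta):
    """Multiplicative-weight training for prediction with expert advice.

    expert_preds: (n, q) array, expert i's action s_i(t) at each round t
    y: (n,) array of outcomes, assumed in [0, 1]
    beta: learning rate
    Uses squared loss, which is convex in the action and lies in [0, 1]
    when actions and outcomes do. Returns the per-round ensemble losses
    and each expert's cumulative loss L_{i,n}.
    """
    n, q = expert_preds.shape
    w = np.ones(q)                       # initial weights w_{i,0} = 1
    ensemble_losses = np.empty(n)
    cum_expert_loss = np.zeros(q)
    for t in range(n):
        u = w / w.sum()                  # normalized weights u_{i,t-1}
        a_bar = u @ expert_preds[t]      # ensemble action = weighted average
        ensemble_losses[t] = (y[t] - a_bar) ** 2
        losses = (y[t] - expert_preds[t]) ** 2
        cum_expert_loss += losses
        w = w * np.exp(-beta * losses)   # multiplicative update
    return ensemble_losses, cum_expert_loss
```

With \(\beta = \sqrt{8\log{q}/n}\), the resulting regret (ensemble cumulative loss minus the best expert's) should always come in under \(\sqrt{(n/2)\log{q}}\).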

Proving the regret bound

Lower bound

\[\begin{eqnarray} \log{\frac{W_n}{W_0}} & = & \log{\sum_{i=1}^{q}{w_{i,n}}} - \log{\sum_{i=1}^{q}{w_{i,0}}}\\ & = & \log{\sum_{i=1}^{q}{\myexp{-\beta \CumLoss_{i,n}}}} - \log{q}\\ & \geq & \log{\max_{i \in 1:q}{\myexp{-\beta \CumLoss_{i,n}}}} - \log{q}\\ & = & -\beta\min_{i \in 1:q}{\CumLoss_{i,n}} - \log{q} \end{eqnarray}\]

Proving the regret bound (2)

Upper bound

\[\begin{eqnarray} \frac{W_n}{W_0} & = & \frac{W_n}{W_{n-1}}\frac{W_{n-1}}{W_{n-2}} \ldots \frac{W_1}{W_0}\\ \log{\frac{W_n}{W_0}} & = & \sum_{t=1}^{n}{\log{\frac{W_{t}}{W_{t-1}}}}\\ \log{\frac{W_{t}}{W_{t-1}}} & = & \log{\frac{\sum_{i=1}^{q}{w_{i,t}}}{\sum_{j=1}^{q}{w_{j,t-1}}}}\\ & = & \log{\frac{\sum_{i=1}^{q}{w_{i,t-1}\myexp{-\beta \Loss(y_t, s_i(t))}}}{\sum_{j=1}^{q}{w_{j,t-1}}}}\\ & = & \log{\sum_{i=1}^{q}{u_{i,t-1} \myexp{-\beta \Loss(y_t, s_i(t))}}} \end{eqnarray}\]

where \(u_{i,t-1} \equiv w_{i,t-1}/\sum_{j=1}^{q}{w_{j,t-1}}\) are the normalized weights, so the last line is a log of an expectation over experts

Proving the regret bound (3)

Upper bound

By Hoeffding's lemma, for any random variable \(X \in [0,1]\), \(\log{\Expect{\myexp{-\beta X}}} \leq -\beta \Expect{X} + \frac{\beta^2}{8}\). Apply this with \(X = \Loss(y_t, s_i(t))\) and expectations taken under the weights \(u_{i,t-1}\); then, because \(\Loss\) is convex in \(a\) and \(\EnsembleAction_t = \sum_{i=1}^{q}{u_{i,t-1} s_i(t)}\), Jensen's inequality gives \(\Loss(y_t, \EnsembleAction_t) \leq \sum_{i=1}^{q}{u_{i,t-1} \Loss(y_t, s_i(t))}\), so

\[\begin{eqnarray} \log{\sum_{i=1}^{q}{u_{i,t-1} \myexp{-\beta \Loss(y_t, s_i(t))}}} & \leq & -\beta \sum_{i=1}^{q}{u_{i,t-1} \Loss(y_t, s_i(t))} + \frac{\beta^2}{8}\\ & \leq & -\beta \Loss(y_t, \EnsembleAction_t) + \frac{\beta^2}{8} \end{eqnarray}\]

Proving the regret bound (4)

Upper bound

\[\begin{eqnarray} \log{\frac{W_n}{W_0}} & = & \sum_{t=1}^{n}{\log{\frac{W_{t}}{W_{t-1}}}}\\ & \leq & \sum_{t=1}^{n}{\left(-\beta \Loss(y_t, \EnsembleAction_t) + \frac{\beta^2}{8}\right)}\\ & = & -\beta\CumEnsembleLoss_n + \frac{n \beta^2}{8} \end{eqnarray}\]

Combine upper and lower bounds

\[\begin{eqnarray} -\beta\min_{i \in 1:q}{\CumLoss_{i,n}} - \log{q} & \leq & \log{\frac{W_n}{W_0}} \leq -\beta\CumEnsembleLoss_n + \frac{n \beta^2}{8}\\ \beta\left(\CumEnsembleLoss_n - \min_{i \in 1:q}{\CumLoss_{i,n}}\right) & \leq & \frac{n \beta^2}{8} + \log{q}\\ \Regret_n & \leq & \frac{n \beta}{8} + \frac{\log{q}}{\beta} ~ \Box \end{eqnarray}\]
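The sandwich above can be checked numerically. The following small simulation (my own, not from the lecture) runs multiplicative weights with absolute-error loss, which is convex and bounded in \([0,1]\) here, and verifies both sides of the bound on \(\log{(W_n/W_0)}\) as well as the final regret bound.

```python
import numpy as np

rng = np.random.default_rng(0)
n, q = 500, 10
beta = np.sqrt(8 * np.log(q) / n)   # the optimal learning rate

# Experts make arbitrary predictions in [0,1] of a binary sequence;
# loss |y - a| is convex in a and bounded in [0, 1].
preds = rng.random((n, q))
y = rng.integers(0, 2, size=n).astype(float)

w = np.ones(q)                      # w_{i,0} = 1, so W_0 = q
ens_loss = np.zeros(n)
cum_loss = np.zeros(q)
for t in range(n):
    u = w / w.sum()                 # normalized weights
    a_bar = u @ preds[t]            # ensemble action
    ens_loss[t] = abs(y[t] - a_bar)
    losses = np.abs(y[t] - preds[t])
    cum_loss += losses
    w *= np.exp(-beta * losses)     # multiplicative update

log_ratio = np.log(w.sum()) - np.log(q)           # log(W_n / W_0)
lower = -beta * cum_loss.min() - np.log(q)        # lower bound on log-ratio
upper = -beta * ens_loss.sum() + n * beta**2 / 8  # upper bound on log-ratio
regret = ens_loss.sum() - cum_loss.min()

assert lower <= log_ratio <= upper
assert regret <= np.sqrt(n / 2 * np.log(q))
```

The two assertions mirror the two halves of the proof: the log of the total weight is squeezed between a function of the best expert's loss and a function of the ensemble's loss, and combining them yields the regret bound.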

Tricks, modifications

Unbounded horizons, changing learning rates

Summing up