10-702 Statistical Machine Learning

Instructor: Larry Wasserman
Time: MW 1:30-2:50
Place: Wean Hall 4623

TA: Jingrui He
Office hours: Thursdays 10-12
Place: Wean Hall 8102

TA: Robin Sabhnani
Office hours: Thursdays 6:30-8:30pm
Place: Newell Simon Hall 3122

Course secretary: Sharon Cavlovich
Office: Wean Hall 5315

Introduction to Statistical Learning Theory (Bousquet, Boucheron and Lugosi) Here

Midterm: Monday during class time in the usual classroom

Practice Test Here

Course description

Statistical Machine Learning is a second graduate-level course in machine learning, assuming students have taken Machine Learning (10-701) and Intermediate Statistics (36-705). The term "statistical" in the title reflects the emphasis on statistical analysis and methodology, which is the predominant approach in modern machine learning.

The course combines methodology with theoretical foundations. It is intended for students who want to practice the "art" of designing good learning algorithms, and also understand the "science" of analyzing an algorithm's statistical properties and performance guarantees. Theorems are presented together with practical aspects of methodology and intuition to help students develop tools for selecting appropriate methods and approaches to problems in their own research. The course includes topics in statistical theory that are now becoming important for researchers in machine learning, including consistency, minimax estimation, and concentration of measure.

Prerequisites

Machine Learning 10-701 and Intermediate Statistics 36-705, or Probability and Statistics 36-725 and 36-726.

The syllabus includes information about assignments, exams and grading.

Zoubin's Lectures

Lecture 1 Lecture 2 Lecture 3 Lecture 4 Lecture 5

Lecture Notes

There is no required text for the course. Lecture notes will be regularly distributed (but not posted on the web). These are draft chapters and sections from a book in progress.
Comments, corrections, and other input on the drafts are highly encouraged.
Secondary References:
Chris Bishop, Pattern Recognition and Machine Learning, Springer, 2006.
Trevor Hastie, Robert Tibshirani, Jerome Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer-Verlag, 2001.

Assignments

Assignments are due on Fridays at 3:00 p.m. Hand in each assignment at the course secretary's office in Wean Hall 4609.

Code

The R language and code

Data

Papers

Topics

Topics will be chosen from the following basic outline, as announced in class.

Course Calendar
Week of:     Mon                           Wed                           Friday
January 14   Stat Theory Review            Stat Theory Review
January 21   No Class (MLK day)            Convexity/Optimization        Homework 1
January 28   Linear Models                 Model Selection
February 4   Model Selection               Linear Classification         Homework 2
February 11  Mixtures                      Graphical Models              Project Proposal
February 18  Graphical Models              Nonparametric Regression      Homework 3
February 25  Nonparametric Classification  Nonparametric Classification
March 3      EXAM                          No Class
March 10     No Class                      No Class
March 17     Advanced Theory               Advanced Theory               Homework 4
March 24     Advanced Theory               Dimension Reduction
March 31     Dimension Reduction           The Bootstrap                 Progress report
April 7      Kernel Methods                Kernel Methods                Homework 5
April 14     Nonparametric Bayes           Nonparametric Bayes
April 21     Computation                   Computation
April 28     Computation                   Last Class                    Submit Project
May 5                                                                    Homework 6