Instructor:
Alessandro Rinaldo (email: arinaldo at cmu dot edu)
Course Description and Objectives:
This course is the first in a sequence of two minis intended to introduce Ph.D. students in Statistics and Machine Learning to some of the mathematical tools used in the analysis of high-dimensional statistical models. In big-data problems, the statistical complexity of a single datum often grows with the sample size. In these instances, the classical statistical modeling approach, which is predicated on the assumption that the number of model parameters remains fixed as the sample size grows unbounded, is no longer applicable. In contrast, high-dimensional statistical models allow the number of model parameters to change with the sample size, so that, as more data are gathered, models of increasing complexity and predictive power can be used. Formalizing statistical tasks and evaluating the theoretical performance of statistical procedures in high-dimensional settings requires a different suite of techniques and theories than traditional statistics. The learning objectives of this course are two-fold. The first goal is to present various concentration inequality techniques for deriving finite-sample upper bounds on the performance of high-dimensional algorithms. The second goal is to exemplify the use of such techniques on problems drawn from the current literature, including high-dimensional regression and compressed sensing, matrix completion, high-dimensional graphical modeling, community detection, and network analysis.
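As a small illustration (not drawn from the course materials) of the kind of finite-sample guarantee a concentration inequality provides, consider Hoeffding's inequality: if X_1, ..., X_n are independent random variables taking values in [a, b] with common mean \mu, then for every t > 0,

\[
\mathbb{P}\!\left( \left| \frac{1}{n}\sum_{i=1}^{n} X_i - \mu \right| \ge t \right) \le 2\exp\!\left( -\frac{2 n t^2}{(b-a)^2} \right).
\]

The bound holds for every finite n, which is precisely the type of non-asymptotic statement used throughout the course to analyze high-dimensional procedures.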
Syllabus:
Course work, class logistics and grading criteria are detailed in the syllabus. |
Lectures:
Tuesday and Thursday, 1:30pm - 2:50pm, PHA18A. |
Office hour:
By appointment. |
Schedule:
Lecture schedule, scribe notes, and homework assignments are available here.
References:
References and reading material are available here. (The list will be regularly updated and expanded as the class progresses.)