36-462/36-662, Data Mining

Cosma Shalizi

Lecture 1, 26 August 2019 — Welcome to the course

Welcome!

Agenda for today

What is data mining?

What are we going to learn about?

So many things!

Information retrieval

Dimension reduction

Information measures

Nearest neighbors

Clustering and classifiers

Prediction and decision trees

Recommendation engines

Waste, fraud and abuse

Where did this come from?

Where did this really come from?

What will you need to know?

Course mechanics

Class meetings

In-class exercises

Reading

Reading: Textbook

Principles of Data Mining

Homework

Grading

Time expectations

Cheating, collaboration & plagiarism

Homework format

Switch to RStudio

Specifically, welcome.Rmd

Some lessons from the demo

Next time: Information retrieval