Why statisticians learn to program
- Independence: otherwise, you rely on someone else giving you exactly the right tool
- Honesty: otherwise, you end up distorting your problem to match the tools you have
- Clarity: often, turning your ideas into something a machine can do refines your thinking
- Fun: these were the best of times (the worst of times)
Cool example: catching an intruder
How this class will work
- Professor Ryan Tibshirani
- TAs Shannon Gallagher, Kwhangho Kim, Kevin Lin, Alan Mishler, Zixu Wang
- No programming knowledge assumed
- Some statistics knowledge assumed
- Focus almost entirely on R
- Class will be cumulative, so keep up with the material and assignments!
- Each class period: 1 or 2 “mini-lectures” (10-20 minutes), then “lab” (30-40 minutes)
- Lab work due at 11:59pm on each class day
- Labs graded \(\approx\) half on attendance, half on completion
- Homework due at 6pm on each Sunday
- Final group project (2-3 weeks)
- Grading breakdown:
  - Labs: 30%
  - Homework: 50%
  - Final project: 20%
- Class website: http://www.stat.cmu.edu/~ryantibs/statcomp-F16/
- Piazza group: for announcements and discussions
- Blackboard: used to collect submissions, and keep track of grades
R, R Studio, R Markdown
- R is a programming language for statistical computing
- R Studio is an integrated development environment for R programming
- R Markdown is a markup language for combining R code with text
All 3 are free, and all 3 will be used extensively in this course
(survey, demo)
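To give a flavor of what "statistical computing" in R looks like, here is a minimal sketch. The data values and variable names are made up for illustration and are not from the course materials; only base R functions are used.

```r
# A small made-up sample of measurements (illustrative values only)
x <- c(2.1, 3.4, 1.9, 5.0, 4.2)

n     <- length(x)    # sample size
x_bar <- mean(x)      # sample mean
s     <- sd(x)        # sample standard deviation (divides by n - 1)
se    <- s / sqrt(n)  # standard error of the mean

# Print a short summary
cat("n =", n, " mean =", x_bar, " se =", round(se, 3), "\n")
```

In an R Markdown document, the same lines would sit inside a code chunk, with the surrounding explanation written as ordinary text.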
Read the syllabus
It’s on the course website
Things you should have done by the end of this week
- Install R Studio on your laptop
- Get comfortable with R Studio, including knitting Rmd documents into HTML files
- Get comfortable with navigating R help files
- Get comfortable with installing R packages
- Complete the “Interactive R Online” introductory course by 6pm on Sunday
Note: you should do the labs, as usual (but they won’t be graded this week)
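The help-file and package items on the checklist above might look roughly like this at the R console. This is a sketch, not the course's official instructions: `ensure_package` is a hypothetical helper written for this example, and `median` is just a convenient function to look up.

```r
# Open the help page for a function whose name you know (same as ?median)
help("median")

# Inspect a function's arguments without opening the full help page
args(median)

# Run the examples listed at the bottom of the help page
example("median")

# Hypothetical helper: install a package from CRAN only if it is missing
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # Report (invisibly) whether the package is now available
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# "stats" ships with R, so this call downloads nothing
ensure_package("stats")
```

For a package that is not yet installed, `ensure_package("knitr")` would download it from CRAN first. Knitting can also be done from the console rather than with the Knit button, for example `rmarkdown::render("my_lab.Rmd")`, where `my_lab.Rmd` is a placeholder file name.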