General information

Course overview

Computational data analysis is an essential part of modern statistics. Competent statisticians must be able not just to run existing programs, but to understand the principles on which they work. They must also be able to read, modify, and write code, so that they can assemble the computational tools needed to solve their data analysis problems, rather than distorting problems to fit tools provided by others. This class is an introduction to statistically oriented programming, targeted at statistics majors, and does not assume an extensive programming background.

Students will learn the core ideas of programming (data structures, functions, iteration, debugging, logical design, and abstraction) through writing code to assist in statistical analyses. Students will learn how to write maintainable code, as well as how to debug and test code for correctness. They will learn how to set up and run stochastic simulations, how to fit basic statistical models and assess the results, and how to work with and filter large data sets. Since code is an important form of communication among scientists, students will also learn how to comment and organize code.

The class will be taught in the R programming language.

Course website

The course website is http://www.stat.cmu.edu/~ryantibs/statcomp-F19/. The course schedule, lecture notes, labs, supplementary materials, etc., will be posted there.

Prerequisites

This is an introduction to programming for statistics students. Prior exposure to statistical thinking, to data analysis, and to basic probability concepts is essential. Previous programming experience is not assumed. Formally, the prerequisites are “Computing at Carnegie Mellon”, 36-202 or 36-208, and 36-225.

Course mechanics

This class will be run in a flipped format. Instead of having regular lectures Monday, Wednesday, and Friday (our scheduled class times), the week will be structured as follows.

  • Monday, it is up to you to digest the lecture slides, which will be released on the course website. You have different options for doing so:
    • you can come to the usual class period (11:30am - 12:20pm in Baker 136A), where the Professor will give the lecture; or
    • you can come to the recitation section on Tuesday (time and location given on the class website) held by the TAs; or
    • you can go through the slides on your own.
  • There will be a short quiz, consisting of true/false and multiple choice questions about the lecture material, due 11:59pm on Tuesday night, which will be taken on Canvas.
  • Wednesday and Friday, it is up to you to work on the lab, which will be released on the course website. You have different options for doing so:
    • you can come to the usual class period (11:30am - 12:20pm in Baker 136A), where the Professor and some TAs will be available for help; or
    • you can go to one of the office hours (times and locations given on the class website) held by the TAs; or
    • you can do the lab on your own.
  • The lab will be due 11:59pm on Sunday (the end of the week). The submission will be done on Canvas, following instructions given on the lab.
  • Each week, you will need to do an in-person check-in. This can be done by coming to lecture on Monday, recitation on Tuesday, lab on Wednesday or Friday, or any of the office hours. You can just show up and say, “Hi, it’s [NAME], all is good” (you don’t need to stay). This check-in will be worth a small number of points on the week’s lab.

Grading

Grades will be calculated as follows:

  • Labs: 60%
  • Quizzes: 20%
  • Exams: 20%

Here are the cutoffs for letter grades, based on total percentages:

  • A: 90% or higher
  • B: 80% to 89%
  • C: 70% to 79%
  • D: 60% to 69%
  • R: 59% or lower, on a case by case basis

The Professor may adjust these cutoffs, but only in the direction that favors the students. For example, the cutoff for an “A” may end up being adjusted to be lower than 90%, but not higher.
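As a rough illustration of how the weights and cutoffs above combine, the following R sketch computes a total percentage and maps it to a letter grade. The function name course_grade and the example scores are hypothetical. The sketch assumes each component is already a percentage between 0 and 100 and that the unadjusted cutoffs apply. How totals that fall between bands (say, between 89% and 90%) are treated is not specified above, so the sketch simply compares against the lower bound of each band.

```r
# Hypothetical helper: combine component percentages using the stated weights
# (labs 60%, quizzes 20%, exams 20%) and apply the unadjusted cutoffs.
course_grade <- function(labs, quizzes, exams) {
  total <- 0.60 * labs + 0.20 * quizzes + 0.20 * exams
  # Assumption: a total is compared against the lower bound of each band.
  letter <- if (total >= 90) {
    "A"
  } else if (total >= 80) {
    "B"
  } else if (total >= 70) {
    "C"
  } else if (total >= 60) {
    "D"
  } else {
    "R"
  }
  list(total = total, letter = letter)
}

# Made-up scores, for illustration only.
g <- course_grade(labs = 92, quizzes = 85, exams = 78)
g$total    # 87.8
g$letter   # "B"
```

Because labs carry 60% of the weight, a ten-point change in the lab average moves the total by six points, while the same change in quizzes or exams moves it by two.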

R and RStudio

R is a free, open-source programming language for statistical computing. All of our work in this class will be done using R. You will need regular, reliable access to a computer running an up-to-date version of R. If this is a problem, then let the Professor or TAs know right away.

RStudio is a free, open-source R programming environment. It contains a built-in code editor and many features that make working with R easier, and it works the same way across different operating systems. Most importantly, it integrates R Markdown seamlessly. You will use RStudio for the labs and final.

Getting help

Labs

Coming to labs is the best way to get help. You will be able to ask questions of the Professor and TAs for the entire time.

Office hours

Office hours will be held by the Professor and TAs, and the times will be spread out over the week. The times and locations can be found on the course website.

Piazza

Piazza will be used for questions and discussion on the class contents. Class announcements will also be made through Piazza. The link for the Piazza group is given on the course website.

Piazza can be a very successful medium for helpful, class-wide discussions, but without rules, discussions can also quickly get out of hand. Here are the rules for our Piazza group:

  1. Be considerate to others (respectful language, no sarcasm).
  2. Before posting a question, check that it (or a related question) has not already been posted. If it has, then use the existing thread for further questions or discussion.
  3. For questions about the labs, “What is wrong with this code?” is not an acceptable question. Code that is part of your solution cannot be posted to Piazza.
  4. Along with your posted question, explain step-by-step what you tried to answer your own question (without posting your solution code).
  5. Private questions on Piazza (an option for questions that only the Professor and TAs can see) are discouraged, since they cannot be answered in a reliable or timely manner.

Rule #2 above is highlighted because it is important and, in our experience, it is usually the first rule to be forgotten. Read Piazza first, then post! Duplicated posts can snowball, and then Piazza can quickly become ineffective.

Content deemed inappropriate—by the above rules and otherwise—will be taken down by the Professor or TAs.

Email

Email will be used for questions on class administration (class policies, exceptional circumstances, etc.), rather than class contents. Please direct such inquiries to the Head TA. The subject line of all emails should begin with “[36-350]”. The Professor will be available for issues that cannot be resolved first with the Head TA.

Evaluation

Quizzes

Quizzes will be short (about 8-10 questions) and consist of true/false and multiple choice questions. They will be completed on Canvas, due 11:59pm on Tuesday each week, with the links given on the course website. Quizzes are meant to be a relatively easy recap of the week’s lecture materials. After you submit the quiz, you will immediately see your score and the correct answers. The system allows you to retake the quiz, in which case your final quiz score is the average of your two quiz scores. So even in the worst case (all questions wrong the first time), you can still earn half credit on any given quiz by getting all questions right the second time.
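The retake arithmetic can be sketched in a few lines of R. The function name final_quiz_score is hypothetical, scores are taken to be percentages, and the sketch assumes that the first score stands unchanged when the quiz is not retaken.

```r
# Hypothetical helper: final quiz score under the retake policy.
# `first` is the first-attempt score; `retake` is NULL if there is no retake.
final_quiz_score <- function(first, retake = NULL) {
  if (is.null(retake)) {
    return(first)          # assumption: no retake means the first score stands
  }
  mean(c(first, retake))   # with a retake, the two scores are averaged
}

final_quiz_score(70)        # no retake: 70
final_quiz_score(0, 100)    # all wrong, then all right: 50
final_quiz_score(60, 100)   # 80
```

Note that under averaging a retake can also lower a score: a first attempt of 90 followed by a retake of 70 would average to 80.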

Labs

Labs will be completed in R Markdown format (file extension Rmd). They will involve writing a combination of code and written prose, and the R Markdown format is crucial since it allows for a combination of the two. Labs will be turned in through Canvas, due 11:59pm on Sunday each week, and they must be submitted only in HTML format, the result of calling “Knit HTML” from RStudio on your R Markdown document. Be careful that you do this, because work submitted in any other format will receive a grade of 0, without exception.

Note also: all code used to produce your results must be shown in your HTML file (e.g., do not use echo=FALSE or include=FALSE as options anywhere).
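One possible way to guard against accidentally hiding code is to state the chunk defaults explicitly near the top of the R Markdown file. This is a suggestion rather than a course requirement, and it assumes the knitr package, which RStudio uses to knit R Markdown documents. Both options shown are already knitr’s defaults, so the call only makes them explicit.

```r
# Make the knitr defaults explicit so every chunk's code and output appear
# in the knitted HTML. Options set on an individual chunk still override
# these defaults, so chunk headers need checking as well.
knitr::opts_chunk$set(echo = TRUE, include = TRUE)
```

RStudio’s new-document template typically marks its own setup chunk with include=FALSE, so that option would need to be removed from the chunk header to comply with the rule above.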

Students may choose to collaborate with friends on the labs, but must indicate with whom they collaborated. Also, be sure to carefully read the collaboration policy below.

Exam

There will be a final in-class exam. It will be mostly similar in format to the quizzes (true/false and multiple choice questions), and will be comprehensive.

Late work

In general, no late work will be accepted. Instead, your lowest lab score and lowest quiz score will be dropped at the end of the semester. In truly exceptional situations (such as family emergencies or illness), the Head TA can make exceptions and allow late work (labs or quizzes).
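To see what dropping the lowest score does to an average, here is a minimal R sketch. The function name drop_lowest_mean and the scores are made up, and the sketch assumes every lab (or quiz) carries equal weight, which is not stated above.

```r
# Hypothetical helper: average a set of scores after dropping the single lowest.
# Assumes all scores are equally weighted.
drop_lowest_mean <- function(scores) {
  stopifnot(length(scores) >= 2)
  lowest <- which.min(scores)   # index of the first minimum
  mean(scores[-lowest])
}

lab_scores <- c(95, 80, 0, 88, 91)   # made-up scores; the 0 is a missed lab
mean(lab_scores)                     # 70.8 without the drop
drop_lowest_mean(lab_scores)         # 88.5 with the lowest score dropped
```

Only one score is removed, even when the minimum is tied, so a second missed lab would still count in full.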

Collaboration, copying, and plagiarism

You are encouraged to discuss course material with your classmates. All work you turn in, however, must be your own. This includes both written explanations, and code. Copying from other students, books, websites, or solutions from previous versions of the class, (1) does nothing to help you learn how to program, (2) is easy for us to detect, and (3) has serious negative consequences for you, as outlined in the university’s policy on cheating and plagiarism. If, after reading the policy, you are unclear on what is acceptable, please ask the Professor.

Accommodations for students with disabilities

If you have a disability and are registered with the Office of Disability Resources, please use their online system to notify us of your accommodations and discuss with us your needs as early in the semester as possible. We will work with you to ensure that accommodations are provided as appropriate. If you suspect that you may have a disability and would benefit from accommodations but are not yet registered with the Office of Disability Resources, consider contacting them at access@andrew.cmu.edu.

Take care of yourself

Take care of yourself. Do your best to maintain a healthy lifestyle this semester by eating well, exercising, avoiding drugs and alcohol, getting enough sleep and taking some time to relax. This will help you achieve your goals and cope with stress.

All of us benefit from support during times of struggle. You are not alone. Asking for support sooner rather than later is often helpful.

If you or anyone you know experiences any academic stress, difficult life events, or feelings like anxiety or depression, we strongly encourage you to seek support. Counseling and Psychological Services (CaPS) is here to help: call 412-268-2922 and visit their website at http://www.cmu.edu/counseling/. Consider reaching out to a friend, faculty or family member you trust for help getting connected to the support that can help.

If you or someone you know is feeling suicidal or in danger of self-harm, call someone immediately, day or night:

If the situation is life threatening, call the police:

If you have questions about this, then please let us know.