Subject: final papers
From: Bronwyn Woods
Date: 5/11/2012 3:59 PM
To: Brian Junker

Attached are the final papers with my comments, as well as evaluations of the papers according to the rubric.

Team C: They seem to have done appropriate things, and I think they understood the analysis they were trying to do, but the paper is generally sloppy: it still contains "(reference needed)" placeholders, blanks left as "___", and sections unchanged from the draft saying things like "we will send a follow-up but we haven't done it yet". There are also a noticeable number of typos. Other than that, the biggest weakness is that the results section is swamped by dozens of (probably unnecessary) plots, cut-and-paste regression output, pages of output from post-hoc tests, etc.

Team D: The paper is actually relatively strong in the sense that they really responded to the comments on the draft. Their discussion of previous work, the questions they wanted to answer, their conclusions, etc. are decent. However, I'm pretty sure the analysis they did is fairly off from a statistical point of view (mostly the right ideas, but the wrong calculations or something). You may want to take a look at their Results section just to make sure I'm not way off base.

TeamDEval.txt

Section 1: Introduction 20/20
Good introduction to the topic, to the general results you found, and to how they fit into the previous work you looked at.

Section 2: Methods 19/20
Overall, a good summary of your methods. Were there differences between your first and second batches? If there weren't, it's unclear why you report them separately. If there were, you should discuss that.

Section 3: Results 8/20
You present appropriate types of analyses for your data, and you generally state the conclusions you draw clearly. However, it looks like there are some errors in your understanding of the statistical methods. Weighting strata is appropriate when estimating a value over the whole population (for instance, the mean political-belief score across all of Carnegie Mellon) but not when looking at the means of individual strata. Also, the p-values from the t tests you did seem to disagree with the 95% confidence intervals you compute. Non-overlapping intervals would imply a significant difference, and while overlapping intervals do not strictly rule one out, all of your intervals overlap so heavily that the small p-values you report look implausible. I suspect there are errors in your calculations somewhere along the line. Doing the calculations in R would indeed have helped avoid such problems. Finally, you mention that you excluded "inappropriate" responses. You state that your questions were multiple choice, so how could a respondent answer inappropriately?

Section 4: Discussion 20/20
Your discussion nicely sums up the conclusions you drew from your data.

List of References 10/10

Appendices 10/10

TOTAL 87/100
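(An aside on Team D's Section 3, in case it helps when you look at their Results: below is a minimal R sketch, with made-up data, of the two points above. t.test() reports a p-value and a confidence interval for the difference in means that always agree with each other, and stratum weights only enter when estimating a population-wide value.)

    # Toy data standing in for two strata -- purely illustrative
    set.seed(1)
    stratum1 <- rnorm(40, mean = 3.0, sd = 1)
    stratum2 <- rnorm(40, mean = 3.6, sd = 1)

    # t.test() returns a p-value and a 95% CI for the DIFFERENCE in means;
    # the two always agree: p < 0.05 exactly when the CI excludes 0.
    tt <- t.test(stratum1, stratum2)
    tt$p.value
    tt$conf.int

    # Stratum weights only matter for the population-wide estimate:
    w <- c(0.55, 0.45)                          # hypothetical stratum shares
    sum(w * c(mean(stratum1), mean(stratum2)))  # weighted population mean
    # The individual stratum means themselves need no weighting.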
TeamCEval.txt

Section 1: Introduction 20/20
The introduction does a good job of introducing the question and motivation. Good summary of the literature you referenced.

Section 2: Methods 17/20
This section contains the appropriate content, and it is clear you thought about your survey methods. How did the MOE and strata-size calculations you did in section 2.2 affect your decisions about how many people to contact initially? Some of the reporting of the number of people contacted, the follow-up procedure, and the response rate is sloppy. The numbers don't match up between sections 2.3 and 2.5, and it's unclear why you only sent follow-up emails to a subset of the people who hadn't responded. There were 986 people who hadn't responded, but you only sent 864 reminders?

Section 3: Results 15/20
It is clear that you were thorough in analyzing your data. I think you identified some interesting subtleties, despite having a smaller sample size than you would have liked and (I think?) a fair bit of incomplete data. However, the questions you are trying to answer and the interesting results get swamped by the large number of exploratory plots and cut-and-paste regression/ANOVA/etc. output. The plots often lack descriptive axis labels (the reader doesn't know what a "3" in "IdealRate" means, for instance), making them hard to interpret. Section 3.5 (the output from your Tukey HSD) belongs in an appendix rather than the main text, and the same is probably true for the raw R regression output.

Section 4: Discussion 19/20
This is a good discussion of some of the strengths and potential weaknesses of your survey. There are a couple of blanks that you neglected to fill in with the proper information.

List of References 8/10
One reference is left as "reference needed" (section 1.1) and one footnote is retained (section 1.2).

Appendices 10/10

TOTAL 89/100

Attachments:
TeamDEval.txt 1.8 KB
TeamCEval.txt 2.1 KB
Team C Final.pdf 764 KB
Team D Final.pdf 682 KB
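P.S. On the axis-label point in Team C's Section 3: what I had in mind is something like the R sketch below. The variable and label names are hypothetical; I'm assuming "IdealRate" is a coded 1-5 scale.

    # Hypothetical example: map a coded survey variable to readable labels
    # before plotting (assumes dat$IdealRate is coded 1-5; labels invented)
    ideal <- factor(dat$IdealRate, levels = 1:5,
                    labels = c("Much lower", "Lower", "About right",
                               "Higher", "Much higher"))
    plot(ideal,
         xlab = "Ideal rate relative to current",
         ylab = "Number of respondents")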