Name:
Andrew ID:
Collaborated with:

This lab is to be completed in class. You can collaborate with your classmates, but you must identify their names above, and you must submit your own lab as an Rmd file on Blackboard, by 11:59pm on the day of the lab.

There are Homework 10 questions dispersed throughout. These must be written up in a separate Rmd document, together with all Homework 10 questions from other labs. Your homework writeup must start as this one does: by listing your name, Andrew ID, and who you collaborated with. You must submit your own homework as a knit HTML file on Blackboard, by 11:59pm on Tuesday November 15. This document contains 22 of the 45 total points for Homework 10.

Split-apply-combine practice with the strikes data

Hw10 Bonus. Using the map() function from the maps package, draw a map of the world, with the countries in the strikes.df data frame colored according to their average unemployment rate. For the color palette, use terrain.colors(). For all countries not found in the strikes.df data frame, color them in gray.
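As a rough sketch of the coloring step only, here is one way to turn a vector of averages into colors from terrain.colors(), with gray as the default for anything without data. The region names and values below are hypothetical toy inputs, not taken from strikes.df, and the choice of 10 color bins is an assumption. A vector like toy.cols could then be supplied as the col argument to map() with fill = TRUE, once its entries are lined up with the region names that map() uses.

```r
# Hypothetical averages for a few regions (toy values, not from strikes.df)
toy.avg <- c(A = 2.5, B = 7.0, C = 4.0)

# Hypothetical full list of regions to draw; D and E have no data
toy.regions <- c("A", "B", "C", "D", "E")

# Palette with an (assumed) 10 colors
n.cols <- 10
toy.palette <- terrain.colors(n.cols)

# Bin each average into one of n.cols equal-width intervals;
# labels = FALSE returns the integer bin index instead of a factor
toy.bins <- cut(toy.avg, breaks = n.cols, labels = FALSE)

# Start with gray everywhere, then overwrite the regions that have data
toy.cols <- rep("gray", length(toy.regions))
names(toy.cols) <- toy.regions
toy.cols[names(toy.avg)] <- toy.palette[toy.bins]

toy.cols
```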

Hw10 Q1 (8 points). Using split() and sapply(), compute the average unemployment rate, inflation rate, and strike volume for each year in the strikes.df data set. The output should be a matrix of dimension 3 x 35. Show the columns for 1960, 1977, 1980, and 1985. Then, display the average unemployment rate by year and the average inflation rate by year, in the same plot. Label the axes and title the plot appropriately. Include an informative legend.
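As a reminder of the split-apply-combine pattern with split() and sapply(), here is a minimal sketch on a small made-up data frame. The data frame toy.df and its columns (year, unemp, infl) are hypothetical stand-ins, so you will need to adapt the names to strikes.df.

```r
# Toy data frame (hypothetical; stands in for strikes.df)
toy.df <- data.frame(
  year  = c(1951, 1951, 1952, 1952, 1953, 1953),
  unemp = c(1, 3, 2, 4, 5, 7),
  infl  = c(10, 20, 30, 50, 40, 60)
)

# Split: a list with one data frame per year
toy.by.year <- split(toy.df, f = toy.df$year)

# Apply and combine: colMeans() returns a named vector for each piece,
# and sapply() binds these vectors into a matrix, one column per year
toy.avgs <- sapply(toy.by.year,
                   function(df) colMeans(df[, c("unemp", "infl")]))

dim(toy.avgs)  # 2 rows (variables) by 3 columns (years)

# The column names are the years, stored as character strings
toy.avgs[, c("1951", "1953")]
```

Note that the columns are indexed by character strings such as "1951", not by the numbers themselves, since split() names the list elements after the levels of the splitting variable.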

Hw10 Q2 (4 points). Show how to compute the average inflation rate for each country pre and post 1975, from strikes.df, using a single call to daply(), i.e., without using any auxiliary columns in strikes.df, like yearPre1975 and countryPre1975, which you created earlier. You will need to have gone through the “Plyr: d\*ply()” mini-lecture to do this question, so you might want to come back to this one after class on Wednesday or Friday. (Hint: recall the function I().) Check that the results are the same as those you computed above, with split() and sapply().
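To illustrate the idea behind the hint, here is a small sketch of splitting on a logical condition computed on the fly inside daply(), rather than on a stored column. It assumes the plyr package is installed, and the data frame toy.df with columns grp, yr, and val is hypothetical.

```r
library(plyr)

# Toy data frame (hypothetical): two groups observed over four time points
toy.df <- data.frame(
  grp = rep(c("a", "b"), each = 4),
  yr  = rep(1:4, times = 2),
  val = c(1, 3, 5, 7, 2, 4, 6, 8)
)

# Split on grp and on the condition yr <= 2 at once; I() wraps the
# condition so that it is used as a splitting variable as-is
toy.means <- daply(toy.df, .(grp, I(yr <= 2)),
                   function(df) mean(df$val))

dim(toy.means)  # one row per group, one column per value of the condition
toy.means
```

The result is an array with one dimension per splitting variable, which is why no auxiliary column is needed.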

Linear regressions over the strikes data

Hw10 Q3 (10 points). Modify your code for computing the coefficients from regressing strike.volume onto left.parliament, unemployment, and inflation, separately for each country in the strikes.df data frame, so that instead of just reporting the coefficients, you also report their standard errors. (Hint: you will need to figure out how to extract the standard errors from the call to summary() on the object returned by lm(). Look at the solution to one of the bonus questions on Hw9.) The output should be a matrix of dimension 8 x 18 (1 row for the intercept, 3 rows for the coefficients of left.parliament, unemployment, inflation, and 4 rows for their standard errors). Display the columns for Belgium, Canada, UK, and USA.
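One possible way to pull out standard errors is sketched below on simulated data. The data frame toy.df, its columns (grp, x1, x2, y), and the helper toy.coef.se() are all hypothetical, and this is only one of several ways to get at the coefficient table.

```r
# Simulated toy data (hypothetical): three groups, two predictors
set.seed(1)
toy.df <- data.frame(
  grp = rep(c("a", "b", "c"), each = 10),
  x1  = rnorm(30),
  x2  = rnorm(30)
)
toy.df$y <- 1 + 2 * toy.df$x1 - toy.df$x2 + rnorm(30)

# Hypothetical helper: fit a regression on one piece of the data and
# return the estimates followed by their standard errors
toy.coef.se <- function(df) {
  fit <- lm(y ~ x1 + x2, data = df)
  # The coefficient table from summary() has one row per term, with
  # columns "Estimate", "Std. Error", "t value", "Pr(>|t|)"
  coef.table <- coef(summary(fit))
  est <- coef.table[, "Estimate"]
  se  <- coef.table[, "Std. Error"]
  names(se) <- paste0(names(se), ".se")
  c(est, se)
}

# One column per group: 3 estimates stacked on top of 3 standard errors
toy.out <- sapply(split(toy.df, f = toy.df$grp), toy.coef.se)
dim(toy.out)
```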

Finally, reproduce your plot from the last question of the coefficients of left.parliament, from the countrywise regressions of strike.volume onto left.parliament, unemployment, inflation. But now, on top of each point (each denoting the coefficient value of left.parliament for a different country), draw a vertical line segment through this point, extending from the coefficient value minus one standard error to the coefficient value plus one standard error. (Hint: segments().) Make sure that these line segments do not extend past the y limits on your plot. For how many countries do their line segments (from the coefficient value minus one standard error to the coefficient value plus one standard error) not intersect the 0 line? Which ones are they?
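Here is a minimal sketch of drawing plus-or-minus one standard error bars with segments(), using made-up numbers. The vectors toy.est and toy.se are hypothetical. Setting ylim from the segment endpoints, rather than from the points alone, is one way to keep the segments inside the plot.

```r
# Hypothetical estimates and standard errors for three groups
toy.est <- c(a = 0.5, b = -1.2, c = 2.0)
toy.se  <- c(a = 0.3, b = 0.4, c = 2.5)

# Endpoints of each vertical segment
lower <- toy.est - toy.se
upper <- toy.est + toy.se
x <- seq_along(toy.est)

# Choose y limits wide enough to contain every segment
plot(x, toy.est, ylim = range(lower, upper), pch = 19, xaxt = "n",
     xlab = "Group", ylab = "Estimate",
     main = "Estimates with +/- 1 standard error")
axis(1, at = x, labels = names(toy.est))

# One vertical segment per point, from lower to upper
segments(x0 = x, y0 = lower, x1 = x, y1 = upper)

# Reference line at 0
abline(h = 0, lty = 2)

# A segment misses 0 when it lies entirely above or entirely below it
excludes.zero <- lower > 0 | upper < 0
names(toy.est)[excludes.zero]
```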