Name:
Andrew ID:
Collaborated with:

This lab is to be completed in class. You can collaborate with your classmates, but you must identify their names above, and you must submit your own lab as an Rmd file on Blackboard, by 11:59pm on the day of the lab.

There are Homework 5 questions dispersed throughout. These must be written up in a separate Rmd document, together with all Homework 5 questions from other labs. Your homework writeup must start the same way as this one: by listing your name, Andrew ID, and who you collaborated with. You must submit your own homework as a knit HTML file on Blackboard, by 6pm on Sunday October 9. This document contains 14 of the 45 total points for Homework 5.

Generating a random word table

Write a function called random.wordtab() to generate a random word table according to the above recipe; essentially, this just function-izes your code from Hw5 Q6. The inputs should be nchar, the total number of characters generated in the initial random generation of characters (letters or strings), with a default of 1000; and seed, an integer to use in set.seed(), with a default of NULL, meaning that no seed should be set. The return value should be a list, with the following named elements: wordtab, the random word table that was generated; number.unique.words, the number of unique words in the word table; max.word.count, the largest word count in the word table; and max.word.length, the largest character count of any word in the word table (i.e., the length of the longest word). Run your function with the default inputs and display the results.
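The interface described above (an optional seed and a named list as the return value) can be sketched as follows. This is only an illustration under stated assumptions: the function name random.wordtab.sketch() is hypothetical, and the character-generation step is a stand-in (uniform draws from the lowercase letters plus a space), since the lab's actual recipe is not reproduced here. Your own Hw5 Q6 code should take its place.

```r
# Hypothetical sketch of the interface only; the generation step is a stand-in,
# NOT necessarily the lab's recipe.
random.wordtab.sketch <- function(nchar = 1000, seed = NULL) {
  # Set the seed only when one is supplied; NULL leaves the RNG state alone
  if (!is.null(seed)) set.seed(seed)

  # ASSUMPTION: characters are drawn uniformly from a-z plus a space,
  # and words are the runs of letters between spaces
  chars <- sample(c(letters, " "), size = nchar, replace = TRUE)
  words <- strsplit(paste(chars, collapse = ""), split = " ", fixed = TRUE)[[1]]
  words <- words[words != ""]  # drop empty strings from consecutive spaces
  wordtab <- table(words)

  # base::nchar is written out in full because the argument is also named nchar
  list(wordtab = wordtab,
       number.unique.words = length(wordtab),
       max.word.count = max(wordtab),
       max.word.length = max(base::nchar(names(wordtab))))
}

res <- random.wordtab.sketch()
res$number.unique.words
res$max.word.count
res$max.word.length
```

Passing the same integer as seed in two calls should give identical word tables, which is a quick way to check that the seed handling works.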

Hw5 Q7 (2 points). Call your function random.wordtab() with nchar set to 1e7 (and seed remaining NULL). Report the number of unique words, the max word count, and the max word length. Then save the word table (not the whole returned list, just the word table), using saveRDS(), to a file called “<myandrewID>_wordtab.rds”, where <myandrewID> is your Andrew ID. Submit this file along with your knitted HTML file when you submit the homework.
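As a small illustration of saving just one element of a list with saveRDS(), here is a sketch using a toy list in place of the real return value. The toy data, the temporary directory, and the literal "myandrewID" in the file name are all placeholders chosen for this example.

```r
# Toy stand-in for the list returned by random.wordtab() (placeholder data)
toy.res <- list(wordtab = table(c("ab", "ab", "xyz")),
                number.unique.words = 2)

# ASSUMPTION: "myandrewID" is a placeholder, and tempdir() is used only to
# keep this example from writing into your working directory
out.file <- file.path(tempdir(), "myandrewID_wordtab.rds")

# Save only the word table, not the whole list
saveRDS(toy.res$wordtab, file = out.file)

# Read it back to confirm that the file holds a table
reloaded <- readRDS(out.file)
class(reloaded)
```

Reading the file back with readRDS() before submitting is a cheap check that the saved object is the table itself rather than the full list.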

Simulating the effect of a drug on tumor reduction

Hw5 Q8 (6 points). Now suppose your drug company told you they only had enough money to enlist 20 subjects in each of the drug / no drug groups in their clinical trial. They then asked you the following question: how large would mu.drug, the mean proportion of tumor reduction in the drug group, have to be in order to have probability 0.95 of a successful drug trial? Run a simulation, much like your simulation in the last problem, to answer this question. Specifically, for each value of the input mu.drug between 1.5 and 5, in increments of 0.1, run your function sim.drug.effect(), with n=20, a total of 1000 times. As before, for each of these 1000 runs, record the average difference in tumor reduction; then count the number of successes, i.e., the number of times (out of 1000) that this difference exceeds 100. (Hint: use a double for() loop again.) Plot the number of successes versus the value of mu.drug, and label the axes appropriately. What is the smallest value of mu.drug for which the number of successes exceeds 950?
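The shape of the double for() loop described above can be sketched as follows. Because sim.drug.effect() is defined earlier in the lab and not reproduced here, the sketch uses a hypothetical stand-in, sim.drug.effect.standin(), whose model (exponential tumor reductions in percent, with a no-drug mean of 1) and return value (the average difference itself) are assumptions made only so that the loop runs. Substitute your own function and adapt how the difference is extracted.

```r
# HYPOTHETICAL stand-in for the lab's sim.drug.effect(); the real function's
# model and return value may differ. Assumed here: exponential tumor
# reductions, scaled to percent, with the no-drug mean fixed at 1.
sim.drug.effect.standin <- function(n, mu.drug, mu.nodrug = 1) {
  drug <- 100 * rexp(n, rate = 1 / mu.drug)
  nodrug <- 100 * rexp(n, rate = 1 / mu.nodrug)
  mean(drug) - mean(nodrug)  # average difference in tumor reduction
}

set.seed(0)  # only so that this sketch is reproducible
mu.drug.vals <- seq(1.5, 5, by = 0.1)
n.reps <- 1000
n.successes <- numeric(length(mu.drug.vals))

# Outer loop over values of mu.drug, inner loop over repeated trials
for (i in seq_along(mu.drug.vals)) {
  diffs <- numeric(n.reps)
  for (j in 1:n.reps) {
    diffs[j] <- sim.drug.effect.standin(n = 20, mu.drug = mu.drug.vals[i])
  }
  n.successes[i] <- sum(diffs > 100)  # a success means the difference exceeds 100
}

plot(mu.drug.vals, n.successes, type = "b",
     xlab = "mu.drug", ylab = "Number of successes (out of 1000)")
abline(h = 950, lty = 2)  # reference line at the 950-success threshold
```

Preallocating n.successes and diffs with numeric() avoids growing vectors inside the loops, which matters once the inner loop runs tens of thousands of times.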

Hw5 Q9 (6 points). It turns out that the drug company can actually control mu.drug, the mean proportion of tumor reduction among the drug subjects, by adjusting the dose concentration of some secret special chemical. But there is no free lunch: the higher the concentration of this secret chemical, the more likely a subject is to have liver failure. In particular, suppose that each patient who is on the drug dies with probability mu.drug/100. The FDA has a policy that if one or more subjects die in a clinical trial, then the trial is shut down. In this case, the trial is clearly not counted as a success (even if the average difference in tumor reduction percentage between surviving members of the two groups was huge).

As in the last question, suppose that the drug company only has enough money to enlist n=20 people in each of the drug / no drug groups in their clinical trial. Adapt your simulations from the last question to incorporate the fact that patients in the drug group can die from liver failure, as described. (Hint: you can do this with only a careful but minor modification to the code. After computing the average difference in tumor reduction between the drug / no drug groups from a simulation run, add some code to flip a coin 20 times with the “right” probability for heads, using rbinom(). If the number of heads is greater than or equal to 1, then reset the average difference in tumor reduction to be 0, because this trial cannot be counted as a success, according to the FDA rules.) As before, plot the number of successes (out of 1000) as a function of mu.drug. Is there any hope here, i.e., is there a value of mu.drug for which we have at least 950 successes?
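The coin-flipping step in the hint can be sketched in isolation as follows. The helper name apply.fda.rule() and the example inputs are hypothetical. The sketch only shows how a single rbinom() call counts deaths among the 20 drug subjects and how the recorded difference is reset when at least one death occurs.

```r
# Hypothetical helper: given the average difference from one simulation run,
# return 0 if any of the n drug subjects dies, else return it unchanged
apply.fda.rule <- function(avg.diff, mu.drug, n = 20) {
  # One draw from Binomial(n, mu.drug/100) is the number of "heads" (deaths)
  # in n coin flips, each with probability mu.drug/100
  n.deaths <- rbinom(1, size = n, prob = mu.drug / 100)
  if (n.deaths >= 1) 0 else avg.diff
}

set.seed(0)  # only so that this sketch is reproducible
# Example inputs (made up): an average difference of 150 at mu.drug = 2
apply.fda.rule(avg.diff = 150, mu.drug = 2)
```

Since a reset value of 0 never exceeds 100, a trial with any death is automatically excluded from the success count without further changes to the counting code.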