The file neurological.dat
concerns patients suffering
from a mild neurological disorder. Patients were treated
with one of two drugs or a placebo, and the number recovering
was recorded.
The data consist of five columns: sex, cured (1) or not cured (0), and the number of patients on placebo, on drug A, and on drug B.

> neuro <- read.table("neurological.dat")
> neuro
  V1 V2 V3 V4 V5
1  F  1 40  5 26
2  F  0 43  7 32
3  M  1 11 48 52
4  M  0  6 20 20

This is not a good format in S-PLUS. It would be much better to have "number cured" and "number not cured" in the same row, with each of the treatments in a different row. Manipulating the data into such a form would be complicated, and since the data set is so small we may as well just do it by hand (if you have read the handout on objects, you know how to do this).
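
One way the by-hand rebuild might look is sketched below. It is written in R-compatible syntax on the assumption that S-PLUS accepts the same calls; the column names are chosen to match the table that follows, and the counts are copied from the read.table() output above.

```r
# Rebuild the table by hand: one row per sex/treatment combination.
# Cured comes from the rows with V2 = 1, NotCured from the rows with V2 = 0.
neuro <- data.frame(
  Sex       = c("F", "F", "F", "M", "M", "M"),
  Treatment = rep(c("Placebo", "Drug A", "Drug B"), times = 2),
  Cured     = c(40, 5, 26, 11, 48, 52),
  NotCured  = c(43, 7, 32, 6, 20, 20)
)
```

Typing the counts in directly is only sensible because there are twelve numbers; a larger file would call for reshaping in code.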

> neuro
  Sex Treatment Cured NotCured
1   F   Placebo    40       43
2   F    Drug A     5        7
3   F    Drug B    26       32
4   M   Placebo    11        6
5   M    Drug A    48       20
6   M    Drug B    52       20

Also, we should make sure all of these variables are in the right format:

> neuro$Sex <- as.factor(neuro$Sex)
> neuro$Treatment <- as.factor(neuro$Treatment)
> neuro$Cured <- as.numeric(neuro$Cured)
> neuro$NotCured <- as.numeric(neuro$NotCured)

For logistic regression on grouped data, the response is a pair of variables: the number of successes and the number of failures. Bind the two together with cbind and use the result as the response in the model formula.
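
As a quick illustration of what cbind produces, the sketch below binds two count vectors into a two-column matrix. The vector names n.cured and n.notcured are hypothetical stand-ins for the Cured and NotCured columns, and the syntax is R-compatible on the assumption that S-PLUS behaves the same way.

```r
# Hypothetical standalone vectors holding the counts from the table above.
n.cured    <- c(40, 5, 26, 11, 48, 52)
n.notcured <- c(43, 7, 32, 6, 20, 20)

# cbind() joins the vectors column-wise into a 6 x 2 matrix:
# column 1 = successes (cured), column 2 = failures (not cured).
response <- cbind(n.cured, n.notcured)
dim(response)
```

Each row of the matrix is one sex/treatment group, so the binomial model sees six grouped observations rather than 310 individual patients.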

> attach(neuro)
> anova(glm(cbind(Cured,NotCured) ~ Sex + Treatment, family=binomial))
Analysis of Deviance Table

Binomial model

Response: cbind(Cured, NotCured)

Terms added sequentially (first to last)
          Df Deviance Resid. Df Resid. Dev
     NULL                     5   19.71457
      Sex  1 19.07467         4    0.63990
Treatment  2  0.04528         2    0.59462

It appears that the two drugs were really no different from the placebo: adding Treatment reduces the deviance by only 0.05 on 2 degrees of freedom. The big effect was from sex (a deviance drop of 19.07 on 1 degree of freedom), and looking at the data above, men were much more likely to recover (about 71%, 111 of 157) than women (about 46%, 71 of 153).
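
The recovery rates and the strength of each term can be checked with a few lines of arithmetic. The sketch below is R-compatible and assumed, not verified, to run unchanged in S-PLUS. The variable names are illustrative, the counts come from the reshaped table, and the deviance values are copied from the analysis of deviance table. Comparing a deviance drop to a chi-squared distribution is the usual large-sample approximation.

```r
# Counts from the reshaped table, in row order F, F, F, M, M, M.
sex      <- c("F", "F", "F", "M", "M", "M")
cured    <- c(40, 5, 26, 11, 48, 52)
notcured <- c(43, 7, 32, 6, 20, 20)

# Total cured and total treated within each sex.
cured.by.sex <- tapply(cured, sex, sum)
total.by.sex <- tapply(cured + notcured, sex, sum)

# Observed recovery proportions: 71/153 for F and 111/157 for M.
rate.by.sex <- cured.by.sex / total.by.sex
round(rate.by.sex, 2)

# Approximate chi-squared p-values for the deviance drops in the table.
p.sex       <- 1 - pchisq(19.07467, 1)  # Sex: 1 degree of freedom
p.treatment <- 1 - pchisq(0.04528, 2)   # Treatment: 2 degrees of freedom
```

The p-value for Sex is far below 0.001, while the one for Treatment is about 0.98, which matches the reading of the table: sex matters, treatment does not.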