36-402, Sec. A, Spring 2019
7 March 2019
\[ \newcommand{\Expect}[1]{\mathbb{E}\left[ #1 \right]} \newcommand{\Prob}[1]{\mathbb{P}\left[ #1 \right]} \newcommand{\Var}[1]{\mathbb{V}\left[ #1 \right]} \]
\[ \frac{e^u}{1+e^u} = \frac{1}{1+e^{-u}} \]
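A quick numerical check of this identity (multiply the numerator and denominator on the left by \(e^{-u}\)); R's built-in plogis() computes exactly this function:

```r
# The two forms of the logistic function agree, and both match plogis()
u <- seq(from = -5, to = 5, by = 0.5)
all.equal(exp(u)/(1 + exp(u)), 1/(1 + exp(-u)))  # TRUE
all.equal(plogis(u), 1/(1 + exp(-u)))            # TRUE
```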
ch <- read.csv("http://www.stat.cmu.edu/~cshalizi/uADA/19/exams/1/ch.csv")
ch <- ch[, -1] # First column is just an index
ch <- na.omit(ch) # Not rec'd for exam but simplifies a few steps
ch.logistic <- glm(start ~ exports + fractionalization * dominance, data = ch,
    family = "binomial")
coefficients(ch.logistic)
## (Intercept) exports
## -3.0397185828 0.2824995978
## fractionalization dominance
## 0.0001504523 0.5834949415
## fractionalization:dominance
## -0.0002603135
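For instance, the intercept alone gives the predicted probability when all the predictors are zero, via the logistic transformation above:

```r
# Predicted probability when exports, fractionalization and dominance are all 0:
plogis(-3.0397185828)  # about 0.046
```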
The in-sample misclassification rate, thresholding the fitted probabilities at 0.5:

mean(ifelse(fitted(ch.logistic) < 0.5, 0, 1) != ch$start)

## [1] 0.06686047

Is this good or bad? Compare the error rate of the constant prediction that the event never happens:

mean(ch$start)

## [1] 0.06686047
EXERCISE: Why does the model have the same error rate as the constant?
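To see the mechanics, here is a small simulation (all names and numbers here are invented stand-ins): when every fitted probability is below 1/2, thresholding at 1/2 predicts 0 for every case, so the classifier's error rate is exactly the base rate of the event, the same as the constant predictor's.

```r
set.seed(36402)
n <- 5000
events <- rbinom(n, size = 1, prob = 0.067)       # rare event, like ch$start
fitted.probs <- runif(n, min = 0.02, max = 0.15)  # stand-in for fitted(ch.logistic)
predictions <- ifelse(fitted.probs >= 0.5, 1, 0)  # all zero: every prob < 0.5
mean(predictions != events)  # equals mean(events) exactly
mean(events)
```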
frequency.vs.probability <- function(p.lower, p.upper = p.lower + 0.005, model,
    events) {
    # Fitted probabilities for every case in the data
    fitted.probs <- fitted(model)
    # Which cases have fitted probabilities in the window [p.lower, p.upper)?
    indices <- (fitted.probs >= p.lower) & (fitted.probs < p.upper)
    matching.probs <- fitted.probs[indices]
    # Average predicted probability within the window
    ave.prob <- mean(matching.probs)
    # Observed relative frequency of the event within the window
    frequency <- mean(events[indices])
    # 'Law of total variance': Var[Y]=E[Var[Y|X]] + Var[E[Y|X]]
    total.var <- mean(matching.probs * (1 - matching.probs)) + var(matching.probs)
    # Standard error of the observed frequency, if the model is right
    se <- sqrt(total.var/sum(indices))
    return(c(frequency = frequency, ave.prob = ave.prob, se = se))
}
Now apply the function a bunch (why these numbers?)
f.vs.p <- sapply(seq(from = 0.04, to = 0.12, by = 0.005), frequency.vs.probability,
model = ch.logistic, events = ch$start)
This is “turned on its side” from a data frame, so let’s fix that by transposing and converting. (Can you find a more elegant way to do this?)

f.vs.p <- data.frame(t(f.vs.p))
plot(frequency ~ ave.prob, data = f.vs.p, xlim = c(0, 1), ylim = c(0, 1),
    xlab = "Predicted probabilities", ylab = "Observed frequencies")
rug(fitted(ch.logistic), col = "grey")
abline(0, 1, col = "grey")
segments(x0 = f.vs.p$ave.prob, y0 = f.vs.p$ave.prob - 1.96 * f.vs.p$se,
    y1 = f.vs.p$ave.prob + 1.96 * f.vs.p$se)
\[ g(p) = \log{\frac{p}{1-p}} = \alpha + \sum_{j=1}^{d}{f_j(x_j)} \]
The \(f_j\) are still partial response functions, but now for the log-odds of \(Y=1\), not for \(Y\) itself; the sum runs over the \(d\) predictor variables
We can use IWLS to estimate the \(f_j\): at each iteration, just fit an additive model to the working responses \(z_i\), instead of a linear model
IWLS thus recovers the whole predictor function \(\eta(x)\), without our ever having to transform the response values themselves
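To make the idea concrete, here is IWLS written out by hand for plain logistic regression on simulated data (all names and the data-generating numbers below are invented for illustration). Replacing lm() at each step with an additive-model fitter, e.g. a spline smoother, gives the additive version described above.

```r
# Hand-rolled IWLS for logistic regression on simulated data
set.seed(402)
n <- 1000
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-1 + 2 * x))
eta <- rep(0, n)  # initial guess for the predictor function at each point
for (iteration in 1:30) {
    p.hat <- plogis(eta)                           # current fitted probabilities
    z <- eta + (y - p.hat)/(p.hat * (1 - p.hat))   # working responses
    w <- p.hat * (1 - p.hat)                       # working weights
    last.fit <- lm(z ~ x, weights = w)             # swap in an additive fitter here
    eta <- fitted(last.fit)
}
coefficients(last.fit)
coefficients(glm(y ~ x, family = "binomial"))  # the two should agree
```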