Statistical Computing, 36-350
Wednesday October 14, 2015
The original name for glitches and unexpected defects: dates back to at least Edison in 1876, but better story from Grace Hopper in 1947:
Debugging is (largely) a process of differential diagnosis:
Step 0: make if happen again
Step 1: figure out if it’s a pervasive/big problem
Step 2: find out where things are going wrong
traceback()
: where did an error message come from?cat()
, print()
: present intermediate outputswarning()
: message from the code when it’s finished runningstop()
, stopifnot()
: terminate the run if something’s wrong[[
… ]]
vs [
… ]
==
vs =
identical
, all.equal
)Recall the Gamma distribution estimator from a previous lab session. Suppose this was the code:
gamma.est = function(input) {
meaninput = mean(input)
varinput = var(input)
scale = varinput/meaninput
shape = meaninput/scale
output = list(shape=shape,scale=scale)
return(output)
}
First, make sure it works! Test this function on its own. A simple test:
input = rgamma(10000, shape=0.1, scale=10)
gamma.est(input)
## $shape
## [1] 0.1006058
##
## $scale
## [1] 10.34395
traceback()
Literally: traces back through all the function calls leading to the last error
Start your attention at the first of these functions which you wrote (not that it can’t be someone else at fault!)
Often the most useful bit is somewhere in the middle (there may be many low-level functions called, like mean()
)
Now, the jackknife estimator for the parameter standard error:
gamma.jackknife = function(dat) {
n = length(dat)
jack.estimates = sapply(1:n, function(i){
return(gamma.est(dat[-i]))})
sd.of.ests = apply(jack.estimates,1,sd)
return(sd.of.ests*sqrt((n-1)^2/n))
}
What happens?
library(MASS)
#gamma.jackknife(cats$Hwt)
If you uncommented the above line, you would have seen this:
> gamma.jackknife(cats$Hwt[1:3])
Error: is.atomic(x) is not TRUE
Then, if we called traceback()
:
> traceback()
6: stop(sprintf(ngettext(length(r), "%s is not TRUE", "%s are not all TRUE"),
ch), call. = FALSE, domain = NA)
5: stopifnot(is.atomic(x))
4: var(if (is.vector(x)) x else as.double(x), na.rm = na.rm)
3: FUN(newX[, i], ...)
2: apply(jack.estimates, 1, sd) at #5
1: gamma.jackknife(cats$Hwt)
print()
forces values to the screen; stick it in a bunch of places to see if things are you expect
gamma.jackknife.2 = function(dat) {
print("Spot #1"); print(head(dat))
n = length(dat)
jack.estimates = sapply(1:n, function(i){
return(gamma.est(dat[-i]))})
print("Spot #2"); print(jack.estimates[,1])
sd.of.ests = apply(jack.estimates,1,sd)
print("Spot #3"); print(sd.of.ests,"\n")
return(sd.of.ests*sqrt((n-1)^2/n))
}
#gamma.jackknife.2(cats$Hwt)
If you uncommented the above line, you would have seen this:
> gamma.jackknife.2(cats$Hwt)
[1] "Spot #1"
[1] 7.0 7.4 9.5 7.2 7.3 7.6
[1] "Spot #2"
$shape
[1] 19.32514
$scale
[1] 0.5514032
Error: is.atomic(x) is not TRUE
The problem is that gamma.est
gives a list, and so we get a weird list structure, instead of a plain array
Solution: re-write gamma.est
to give a vector, or alternatively, wrap unlist
around its output
gamma.est.2 = function(input) {
meaninput = mean(input)
varinput = var(input)
scale = varinput/meaninput
shape = meaninput/scale
return(c(shape,scale))
}
gamma.jackknife.3 = function(dat) {
n = length(dat)
jack.estimates = sapply(1:n, function(i){
return(gamma.est.2(dat[-i]))})
sd.of.ests = apply(jack.estimates,1,sd)
return(sd.of.ests*sqrt((n-1)^2/n))
}
gamma.jackknife.3(cats$Hwt)
## [1] 2.74062279 0.07829436
warning()
warning()
allows you to print warnings to the console, can be helpful to others who might be using your code, and could encounter errors
quadratic.solver = function(a,b,c) {
determinant = b^2 - 4*a*c
if (determinant < 0) {
warning("Equation has complex roots")
determinant = as.complex(determinant)
}
return(c((-b+sqrt(determinant))/2*a, (-b-sqrt(determinant))/2*a))
}
quadratic.solver(1,0,-1)
## [1] 1 -1
quadratic.solver(1,0,1)
## Warning in quadratic.solver(1, 0, 1): Equation has complex roots
## [1] 0+1i 0-1i
stop()
and stopifnot()
These halt the function execution when we see something not expected. Could prevent bugs downstream, if users pass arguments that aren’t as we expect
We’ve seen examples of these before; they’re smart to put in your code
Start small: if you find encounter a bug in a long piece of code, break it down, and start chunking up small bits and running them, to see if you can figure out what’s wrong
Step through the function: it’s not fancy, but sometimes it’s actually helpful to just step through each line of a function manually. Note: for this, you’re going to have to define all of its arguments appropriately
The browser()
, recover()
and debug()
functions modify how R executes other functions. Let you view and modify the environment of the target function, and step through it. You don’t need to know them for this class, but they can be very helpful
Ordinarily, errors just lead to crashing or the like
R has an error handling system which allows your function to catch, and recover from, errors in functions. See try()
, tryCatch()
tryCatch({
gamma.sds = gamma.jackknife(cats$Hwt)
}, error = function(err) {
gamma.sds = NA
cat(paste("Encountered the following error:\n",
err$message,
"\nContinuing on without jackknife estimates")) })
## Encountered the following error:
## is.atomic(x) is not TRUE
## Continuing on without jackknife estimates
You don’t need to know these for this class, but just note that they can be critically important, so that your code doesn’t completely crash an burn from small errors!