Some repeat advice

Why should we practice debugging? It’s not as fun as writing cool new code, but …

The truth: you’re going to have to debug your own code eventually, because you’re not perfect (none of us are!) and so you can’t write perfect code
Debugging is frustrating and time-consuming, but essential
Writing code that makes it easier to debug later is worth it, even if it takes a bit more time (lots of our design ideas support this)
Simple things you can do to help: use lots of comments, use meaningful variable names!

In the next several slides, we list some common issues that give rise to bugs

Common issues: syntax

Parentheses mismatches
[[ … ]] vs [ … ]
== vs =
Exact equality of floating-point numbers (e.g., 0.1 + 0.2 == 0.3 is FALSE)
Vectors vs single values: code works for one value but not multiple ones, unexpected recycling
Elementwise comparison of structures (use identical(), all.equal())
Silent type conversions
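
Several of the pitfalls listed above can be reproduced in a few lines. The following is a minimal sketch; all variable names (lst, x, a, b, u, v, w) are made up for illustration.

```r
# [[ ... ]] vs [ ... ]: single brackets keep the list, double brackets extract
lst = list(a = 1:3)
class(lst["a"])             # "list"
class(lst[["a"]])           # "integer"

# Floating-point numbers: exact equality can fail after arithmetic
x = 0.1 + 0.2
x == 0.3                    # FALSE with standard double-precision arithmetic
isTRUE(all.equal(x, 0.3))   # TRUE: equal up to a small tolerance

# Unexpected recycling: the shorter vector is silently repeated
a = 1:6
b = c(10, 20)
a + b                       # 11 22 13 24 15 26, with no warning

# Elementwise vs. whole-object comparison
u = c(1, 2, 3)
v = c(1, 2, 4)
u == v                      # TRUE TRUE FALSE (one result per element)
identical(u, v)             # FALSE (a single answer for the whole object)

# Silent type conversion: one string turns the whole vector into character
w = c(1, "2", 3)
class(w)                    # "character"
```

Note that u == v inside an if() condition is itself a common source of bugs, since if() expects a single TRUE or FALSE.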

Common issues: logic

Confusing variable names
Confusing function names
Giving unnamed arguments in the wrong order!
R expression does not match the math you mean (left something out, added something)
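
To illustrate the point about unnamed arguments, here is a small sketch built around a hypothetical helper, power(), that is not part of the original slides. Swapping the arguments raises no error; it just gives a different answer.

```r
# Hypothetical helper: raise `base` to the power `exponent`
power = function(base, exponent) {
  base^exponent
}

power(2, 3)                     # 8
power(3, 2)                     # 9: runs without error, but is it what you meant?
power(exponent = 3, base = 2)   # 8: named arguments make the order irrelevant
```

Naming the arguments in the call costs a few keystrokes and removes this whole class of bug.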

Common issues: scope and global variables

Relying on a global variable which doesn’t have the right value (or only has the right value in one situation)
Assuming that changing a variable inside the function will change it elsewhere
Confusing variables within a function and those from where the function was called
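
The scope issues above can be seen in a short sketch. The names total, add.one, scale.factor, and rescale are hypothetical, chosen only for this illustration.

```r
total = 10                      # global variable

# Assigning inside a function creates a local variable; the global is untouched
add.one = function() {
  total = total + 1             # reads the global 10, then defines a local `total`
  total
}

add.one()                       # 11
total                           # still 10

# Relying on a global: the result depends on whatever value it currently has
scale.factor = 2
rescale = function(x) x * scale.factor
rescale(5)                      # 10
scale.factor = 100
rescale(5)                      # 500: same call, different answer
```

Passing scale.factor in as an argument would make the dependence explicit and the function easier to debug.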

Beyond stone knives and bear skins?

Actual quote from Stack Overflow:

I’ve been a software developer for over twenty years … I’ve never had a problem I could not debug using some careful thought, and well-placed debugging print statements. Many people say that my techniques are primitive, and using a real debugger in an IDE is much better. Yet from my observation, IDE users don’t appear to debug faster or more successfully than I can, using my stone knives and bear skins.

Specialized tools for debugging

R provides you with many debugging tools. Why should we use them, and move past our handy cat() or print() statements?

Let’s see what our primitive hunter wrote on Stack Overflow, after receiving a bunch of comments in response to his quote:

Sweet! … Very illuminating. Debuggers can help me do ad hoc inspection or alteration of variables, code, or any other aspect of the runtime environment, whereas manual debugging requires me to stop, edit, and re-execute.

`browser()`

One of the simplest but most powerful built-in debugging tools: browser(). Place a call to browser() at any point in your function that you want to debug. As in:

my.fun = function(arg1, arg2, arg3) {
  # Some initial code 
  browser()
  # Some final code
}

Then redefine the function in the console, and run it. Once execution gets to the line with browser(), you’ll enter an interactive debug mode

Things to do while browsing

While in the interactive debug mode granted to you by browser(), you can type any normal R code into the console, and it will be executed within the function environment. So you can, e.g., investigate the values of variables defined in the function

You can also type:

“n” (or simply return) to execute the next command
“s” to step into the next function
“f” to finish the current loop or function
“c” to continue execution normally
“Q” to stop the function and return to the console

(To print any variables named n, s, f, c, or Q, defined in the function environment, use print(n), print(s), etc.)
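
As a concrete sketch of a browsing session, consider the hypothetical function avg.log() below. The interactive() guard and the NaN condition are assumptions added for this illustration: they make browser() fire only in a console session, and only when something looks wrong.

```r
# Hypothetical function: average of the logs of x
avg.log = function(x) {
  logs = log(x)
  # Pause only in an interactive session, and only if a log came out as NaN
  if (interactive() && any(is.nan(logs))) browser()
  mean(logs)
}

# In the console, try: avg.log(c(1, -2, 3))
# At the browser prompt you could then type, for example:
#   x          to inspect the input
#   logs       to see which entry is NaN
#   n          to execute the next line
#   c          to continue to the end of the function
```

With well-behaved input such as avg.log(c(1, 10, 100)) the condition is FALSE, so the function runs straight through without pausing.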

Browsing in RStudio

In the “Console” panel, you have buttons to click that do the same thing as “n”, “s”, “f”, “c”, “Q”; you can see the locally defined variables in the “Environment” panel, and the traceback in the “Traceback” panel

Knitting and debugging

As with calls to traceback(), cat(), and print() that are used for debugging, you should only run browser() in the console, never in an Rmd code chunk that is supposed to be evaluated when knitting

But, to keep track of your debugging code (that you’ll run in the console), you can still use code chunks in Rmd; you just have to specify eval=FALSE

# As an example, here's a code chunk that we can keep around in this Rmd doc,
# but that will never be evaluated (because eval=FALSE) when the Rmd file is
# knitted. Take a look at it!
big.mat = matrix(rnorm(1000)^3, 1000, 1000)
big.mat
# Note that the output of big.mat is not printed to the console, and also
# that big.mat was never actually created! (This code was not evaluated)

Debugging with R Tools