36-350
20 October 2014
apply, sapply, etc., to avoid iteration
surface
Reading: Sections 7.5, 7.11 and 7.13 of Matloff
Recommended (optional) reading: Chapter 3 of Chambers
In R, functions are objects, just like everything else
This means that they can be passed to functions as arguments
and returned by functions as outputs as well
We often want to do very similar things to many different functions
The procedure is the same, only the function we're working with changes
\( \therefore \) Write one function to do the job, and pass the function as an argument
Because R treats a function like any other object, we can do this simply: invoke the function by its argument name in the body
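A minimal sketch of the idea (apply.twice is a made-up name, just for illustration):
apply.twice <- function(f, x) {
  # f is whatever function gets passed in; we call it by its argument name
  return(f(f(x)))
}
apply.twice(sqrt, 16)   # sqrt(sqrt(16)) = 2
apply.twice(log, exp(exp(3)))   # 3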
We have already seen examples:
apply(), sapply(), etc.: Take this function and use it on all of these objects
nlm(): Take this function and try to make it small, starting from here
ks.test(): Compare these data to this cumulative distribution function
curve(): Evaluate this function over that range, and plot the results
sample
function (x, size, replace = FALSE, prob = NULL)
{
if (length(x) == 1L && is.numeric(x) && x >= 1) {
if (missing(size))
size <- x
sample.int(x, size, replace, prob)
}
else {
if (missing(size))
size <- length(x)
x[sample.int(length(x), size, replace, prob)]
}
}
<bytecode: 0x102d004a0>
<environment: namespace:base>
class(sin)
[1] "function"
class(sample)
[1] "function"
resample <- function(x) { sample(x, size=length(x), replace=TRUE) }
class(resample)
[1] "function"
function returns a function object
body(foo)
formals(foo): names are argument names, values are expressions for defaults (if any)
environment(foo)
typeof(resample)
[1] "closure"
typeof(sample)
[1] "closure"
typeof(sin)
[1] "builtin"
Why closure for written-in-R functions? Because expressions are “closed” by referring to the parent environment
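A tiny illustration (make.adder is a made-up example): the inner function keeps using n from the environment where it was defined:
make.adder <- function(n) {
  function(x) { x + n }   # n is not an argument; it comes from make.adder's frame
}
add3 <- make.adder(3)
add3(10)   # 13
typeof(add3)   # "closure"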
There's also a second class of built-in functions, called primitives
Since function() returns an object of class function, we can pass anonymous functions as arguments:
sapply((-2):2,function(log.ratio){exp(log.ratio)/(1+exp(log.ratio))})
[1] 0.1192 0.2689 0.5000 0.7311 0.8808
Anonymous functions are especially handy as arguments to apply and sapply
Many problems in stats. come down to optimization
So do lots of problems in econ., physics, CS, bio, …
Lots of optimization problems require the gradient of the objective function
Gradient of \( f \) at \( x \): \[ \nabla f(x) = \left[\left.\frac{\partial f}{\partial x_1}\right|_{x} \ldots \left.\frac{\partial f}{\partial x_p}\right|_{x}\right] \]
find the partial derivative of f with respect to each component of x
return the vector of partial derivatives
It makes no sense to re-write this every time we change \( f \)!
\( \therefore \) write code to calculate the gradient of an arbitrary function
We could write our own, but there are lots of tricky issues
Fortunately, someone has already done this
From the package numDeriv
grad(func, x, ...) # Plus other arguments
func is a function which returns a single floating-point value
x is a vector of arguments to func
If x is a vector and func(x) is also a vector, then it's assumed func is vectorized and we get a vector of derivatives
... get passed along to func
require("numDeriv")
just_a_phase <- runif(n=1,min=-pi,max=pi)
all.equal(grad(func=cos,x=just_a_phase),-sin(just_a_phase))
[1] TRUE
phases <- runif(n=10,min=-pi,max=pi)
all.equal(grad(func=cos,x=phases),-sin(phases))
[1] TRUE
grad(func=function(x){x[1]^2+x[2]^3}, x=c(1,-1))
[1] 2 3
Note: grad is perfectly happy with func being an anonymous function!
Now we can use this as a piece of a larger machine:
gradient.descent <- function(f,x,max.iterations,step.scale,
stopping.deriv,...) {
for (iteration in 1:max.iterations) {
gradient <- grad(f,x,...)
if(all(abs(gradient) < stopping.deriv)) { break }
x <- x - step.scale*gradient
}
fit <- list(argmin=x,final.gradient=gradient,final.value=f(x,...),
iterations=iteration)
return(fit)
}
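A quick sanity check, under made-up settings (requires numDeriv, as above); the minimum of this quadratic is at (1,2):
quadratic <- function(x) { sum((x-c(1,2))^2) }
fit <- gradient.descent(f=quadratic, x=c(0,0), max.iterations=1000,
  step.scale=0.1, stopping.deriv=0.01)
fit$argmin   # should be close to c(1,2)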
f could be the mean squared error of a regression, the \( \psi \) error of a regression, a (negative log) likelihood, the cost of a production plan, …
f takes values for all names which aren't its arguments from the environment where it was defined, not the one where it is called (e.g., not from inside grad or gradient.descent)
If f and g are both complicated, avoid debugging g(f) as a block; divide the work by writing a very simple f.dummy to debug/test g, and debug/test the real f separately
Functions can be return values like anything else
make.noneuclidean <- function(ratio.to.diameter=pi) {
circumference <- function(d) { return(ratio.to.diameter*d) }
return(circumference)
}
try(circumference(10))
kings.i <- make.noneuclidean(3)
try(kings.i(10))
[1] 30
formals(kings.i)
$d
body(kings.i)
{
return(ratio.to.diameter * d)
}
environment(kings.i)
<environment: 0x100b588d8>
try(circumference(10))
Create a linear predictor, based on sample values of two variables
make.linear.predictor <- function(x,y) {
linear.fit <- lm(y~x)
predictor <- function(x) {
return(predict(object=linear.fit,newdata=data.frame(x=x)))
}
return(predictor)
}
The predictor function persists and works, even when the data we used to create it is gone
library(MASS); data(cats)
vet_predictor <- make.linear.predictor(x=cats$Bwt,y=cats$Hwt)
rm(cats) # Data set goes away
vet_predictor(3.5) # My cat's body mass in kilograms
1
13.76
nabla <- function(f,...) {
require("numDeriv")
g <- function(x,...) { grad(func=f,x=x,...) }
return(g)
}
Exercise: Write a test case!
You learned to use curve in the first week (because you did all of the assigned reading, including section 2.3.3 of the textbook)
A call to curve looks like this:
curve(expr, from = a, to = b, ...)
expr is some expression involving a variable called x, which is swept from the value a to the value b
... are other plot-control arguments
curve feeds the expression a vector x and expects a numeric vector back, e.g.
curve(x^2 * sin(x))
is fine
Our own functions work in curve too:
psi <- function(x,c=1) {ifelse(abs(x)>c,2*c*abs(x)-c^2,x^2)}
curve(psi(x,c=10),from=-20,to=20)
Try this! Also try
curve(psi(x=10,c=x),from=-20,to=20)
and explain it to yourself
If the expression doesn't take a vector of x values to a vector of results, curve becomes unhappy:
mse <- function(y0,a,Y=gmp$pcgmp,N=gmp$pop) {
  mean((Y - y0*(N^a))^2)
}
> curve(mse(a=x,y0=6611),from=0.10,to=0.15)
Error in curve(mse(a = x, y0 = 6611), from = 0.1, to = 0.15) :
'expr' did not evaluate to an object of length 'n'
In addition: Warning message:
In N^a : longer object length is not a multiple of shorter object length
How do we solve this?
One solution: sapply
sapply(seq(from=0.10,to=0.15,by=0.01),mse,y0=6611)
[1] 154701953 102322974 68755654 64529166 104079527 207057513
mse(6611,0.10)
[1] 154701953
mse.plottable <- function(a,...){ return(sapply(a,mse,...)) }
mse.plottable(seq(from=0.10,to=0.15,by=0.01),y0=6611)
[1] 154701953 102322974 68755654 64529166 104079527 207057513
curve(mse.plottable(a=x,y0=6611),from=0.10,to=0.20,xlab="a",ylab="MSE")
curve(mse.plottable(a=x,y0=5100),add=TRUE,col="blue")
Vectorize() returns a new, vectorized function:
mse.vec <- Vectorize(mse, vectorize.args=c("y0","a"))
mse.vec(a=seq(from=0.10,to=0.15,by=0.01),y0=6611)
[1] 154701953 102322974 68755654 64529166 104079527 207057513
mse.vec(a=1/8,y0=c(5000,6000,7000))
[1] 134617132 74693733 63732256
curve(mse.vec(a=x,y0=6611),from=0.10,to=0.20,xlab="a",ylab="MSE")
curve(mse.vec(a=x,y0=5100),add=TRUE,col="blue")
curve takes an expression and, as a side-effect, plots a 1-D curve by sweeping over x
Suppose we want something like that but sweeping over two variables
Built-in plotting function contour:
contour(x,y,z, [[other stuff]])
x and y are vectors of coordinates, z is a matrix of the corresponding shape (see help(contour) for graphical options)
Strategy: surface should make x and y sequences, evaluate the expression at each combination to get z, and then call contour
surface.1 <- function(f,from.x=0,to.x=1,from.y=0,to.y=1,n.x=101,
n.y=101,...) {
x.seq <- seq(from=from.x,to=to.x,length.out=n.x)
y.seq <- seq(from=from.y,to=to.y,length.out=n.y)
plot.grid <- expand.grid(x=x.seq,y=y.seq)
z.values <- apply(plot.grid,1,f)
z.matrix <- matrix(z.values,nrow=n.x)
contour(x=x.seq,y=y.seq,z=z.matrix,...)
invisible(list(x=x.seq,y=y.seq,z=z.matrix))
}
surface.1(function(p){return(sum(p^3))},from.x=-1,from.y=-1)
curve doesn't require us to write a function every time; what's its trick?
Expressions are just another class of R object, so they can be created and manipulated
One manipulation is evaluation
eval(expr,envir) evaluates the expression expr in the environment envir, which can be a data frame or even just a list
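For example (the expression and values here are arbitrary):
eval(expr=quote(x^2 + y), envir=list(x=2, y=3))   # 7
eval(quote(x^2 + y), envir=data.frame(x=c(1,2), y=c(10,20)))   # c(11, 24)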
When we type something like x^2+y^2 as an argument to surface.1, R tries to evaluate it prematurely
substitute returns the unevaluated expression
curve first uses substitute(expr) and then eval(expr,envir), having made the right envir
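A stripped-down sketch of the same trick (show.values is a made-up helper):
show.values <- function(expr, x) {
  unevaluated <- substitute(expr)   # capture the expression without evaluating it
  eval(unevaluated, envir=list(x=x))
}
show.values(x^2 * sin(x), x=1:3)   # evaluates the expression at x = 1, 2, 3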
surface.2 <- function(expr,from.x=0,to.x=1,from.y=0,to.y=1,n.x=101,
n.y=101,...) {
x.seq <- seq(from=from.x,to=to.x,length.out=n.x)
y.seq <- seq(from=from.y,to=to.y,length.out=n.y)
plot.grid <- expand.grid(x=x.seq,y=y.seq)
unevaluated.expression <- substitute(expr)
z.values <- eval(unevaluated.expression,envir=plot.grid)
z.matrix <- matrix(z.values,nrow=n.x)
contour(x=x.seq,y=y.seq,z=z.matrix,...)
invisible(list(x=x.seq,y=y.seq,z=z.matrix))
}
surface.2(abs(x^3)+abs(y^3),from.x=-1,from.y=-1)
Evaluating a function at every combination of two arguments is a really common task
There is a function to do it for us: outer
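For instance, with a throwaway function:
outer(X=1:3, Y=1:2, FUN=function(x,y) { 10*x + y })
# a 3x2 matrix whose [i,j] entry is FUN(X[i], Y[j])
Note that outer calls FUN once on the expanded coordinate vectors, so FUN needs to be vectorized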
surface.3 <- function(expr,from.x=0,to.x=1,from.y=0,to.y=1,n.x=101,
n.y=101,...) {
x.seq <- seq(from=from.x,to=to.x,length.out=n.x)
y.seq <- seq(from=from.y,to=to.y,length.out=n.y)
unevaluated.expression <- substitute(expr)
z <- function(x,y) {
return(eval(unevaluated.expression,envir=list(x=x,y=y)))
}
z.values <- outer(X=x.seq,Y=y.seq,FUN=z)
z.matrix <- matrix(z.values,nrow=n.x)
contour(x=x.seq,y=y.seq,z=z.matrix,...)
invisible(list(x=x.seq,y=y.seq,z=z.matrix, func=z))
}
surface.3(x^4-y^4,from.x=-1,from.y=-1)
surface.3(mse.vec(a=x,y0=y),from.x=0.10,to.x=0.15,
from.y=6e3,to.y=7e3,nlevels=20)
sapply (etc.), wrappers, anonymous functions as adapters
Maximum, and location of the maximum: takes \( f \), gives numbers \[ \max_{x}{f(x)} ~, ~ \argmax_{x}{f(x)} \]
Derivative of \( f \) at \( x_0 \): takes a function and a point, gives a number \[ \frac{df}{dx}(x_0) \equiv \lim_{h\rightarrow 0}{\frac{f(x_0+h) - f(x_0)}{h}} \]
Definite integral of \( f \) over \( [a,b] \): takes a function and two points, gives a number \[ \int_{a}^{b}{f(x) dx} \equiv \lim_{n\rightarrow \infty}{\sum_{i=0}^{n-1}{\left(\frac{b-a}{n}\right)f\left(a+i\frac{b-a}{n}\right)}} \]
Functions of functions which return numbers are sometimes called functionals, e.g., expectation values: \[ \mathbb{E}[f(X)] \equiv \int_{\mathrm{all}~x}{f(x) p(x) dx} \]
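R's built-in integrate is a functional in this sense: it takes a function (and two endpoints) and returns a number. For instance, the expectation of \( X^2 \) under a standard Gaussian:
integrate(function(x) { x^2 * dnorm(x) }, lower=-Inf, upper=Inf)   # about 1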
\( \nabla f(x_0) \) takes \( f \) and \( x_0 \), gives vector: not strictly a functional
\( \nabla f \) is another, vector-valued function
\( \nabla \) takes a function and returns a function
\( \nabla \) is an operator, not a functional
Something which takes a function in and gives a function back is an operator
Differentiation: the operator \( d/dx \) takes \( f \) and gives a new function
Gradient: the operator \( \nabla \) takes \( f \) and gives a new function
similarly \( \nabla \cdot \), \( \nabla \times \), …
Indefinite integration: \( \int_{-\infty}^{x}{f(u) du} \) takes \( f \) and gives a new function
Fourier transform: takes \( f \) and gives a new function \[ \tilde{f}(\omega) = \int_{x=-\infty}^{x=\infty}{f(x) e^{2i\pi \omega x} dx} \]
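R even has a (limited) symbolic differentiation operator: D takes an unevaluated expression and a variable name, and returns the expression for the derivative (it works on expressions, not on function objects):
D(quote(x^2 * sin(x)), name="x")   # something like 2 * x * sin(x) + x^2 * cos(x)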
How might we write something like the numDeriv package's grad ourselves?
Use the simplest possible method: change x by some amount, find the difference in f, take the slope
(the method="simple" option in numDeriv::grad)
gradient <- function(f,x,deriv.steps) {
  # not real code:
  # evaluate the function at x and at x+deriv.steps
  # take slopes to get partial derivatives
  # return the vector of partial derivatives
}
A naive implementation would use a for
loop
gradient <- function(f,x,deriv.steps,...) {
p <- length(x)
stopifnot(length(deriv.steps)==p)
f.old <- f(x,...)
gradient <- vector(length=p)
for (coordinate in 1:p) {
x.new <- x
x.new[coordinate] <- x.new[coordinate]+deriv.steps[coordinate]
f.new <- f(x.new,...)
gradient[coordinate] <- (f.new - f.old)/deriv.steps[coordinate]
}
return(gradient)
}
Works, but it's so repetitive!
Better: use matrix manipulation and apply
gradient <- function(f,x,deriv.steps,...) {
p <- length(x)
stopifnot(length(deriv.steps)==p)
x.new <- matrix(rep(x,times=p),nrow=p) + diag(deriv.steps,nrow=p)
f.new <- apply(x.new,2,f,...)
gradient <- (f.new - f(x,...))/deriv.steps
return(gradient)
}
(clearer, and half as long)
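A quick check against a gradient we know in closed form (the step sizes here are arbitrary): for \( f(x) = x_1^2 + x_2^2 \), the gradient at (1,2) is (2,4):
gradient(function(x) { sum(x^2) }, x=c(1,2), deriv.steps=c(1e-6,1e-6))
# approximately c(2, 4)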
Presumes that f takes a vector and returns a single number
Any extra arguments to gradient will get passed to f
Check: Does this work when f is a function of a single number?
Possible issues:
f is only defined on a limited domain and we ask for the gradient somewhere near a boundary
choosing deriv.steps: using the same deriv.steps everywhere is a bad idea, imagine \( f(x) = x^2\sin{x} \)
… and so on through much of a first course in numerical analysis (or at least sec. 5.7 of Numerical Recipes)