36-350
20 October 2014
apply, sapply, etc., to avoid iteration
surface
Reading: Sections 7.5, 7.11 and 7.13 of Matloff
Recommended (optional) reading: Chapter 3 of Chambers
In R, functions are objects, just like everything else
This means that they can be passed to functions as arguments
and returned by functions as outputs as well
We often want to do very similar things to many different functions
The procedure is the same, only the function we're working with changes
\( \therefore \) Write one function to do the job, and pass the function as an argument
Because R treats a function like any other object, we can do this simply: invoke the function by its argument name in the body
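A minimal sketch of the idea (apply.twice is a made-up name, just for illustration):
apply.twice <- function(f, x) {
  # f is whatever function gets passed in; we call it by its argument name
  return(f(f(x)))
}
apply.twice(sqrt, 16)   # sqrt(sqrt(16)) = 2
apply.twice(log, exp(exp(3)))   # 3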
We have already seen examples:
apply(), sapply(), etc.: Take this function and use it on all of these objects
nlm(): Take this function and try to make it small, starting from here
ks.test(): Compare these data to this cumulative distribution function
curve(): Evaluate this function over that range, and plot the results
sample
function (x, size, replace = FALSE, prob = NULL)
{
if (length(x) == 1L && is.numeric(x) && x >= 1) {
if (missing(size))
size <- x
sample.int(x, size, replace, prob)
}
else {
if (missing(size))
size <- length(x)
x[sample.int(length(x), size, replace, prob)]
}
}
<bytecode: 0x102d004a0>
<environment: namespace:base>
class(sin)
[1] "function"
class(sample)
[1] "function"
resample <- function(x) { sample(x, size=length(x), replace=TRUE) }
class(resample)
[1] "function"
function returns a function object
body(foo)
formals(foo): names are argument names, values are expressions for defaults (if any)
environment(foo)
typeof(resample)
[1] "closure"
typeof(sample)
[1] "closure"
typeof(sin)
[1] "builtin"
Why closure for written-in-R functions? Because expressions are “closed” by referring to the parent environment
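A tiny illustration (make.adder is a made-up example): the inner function keeps using n from the environment where it was defined:
make.adder <- function(n) {
  function(x) { x + n }   # n is not an argument; it comes from make.adder's frame
}
add3 <- make.adder(3)
add3(10)   # 13
typeof(add3)   # "closure"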
There's also a second class of built-in functions, called primitives
Since function() returns an object of class function, we can pass anonymous functions as arguments:
sapply((-2):2,function(log.ratio){exp(log.ratio)/(1+exp(log.ratio))})
[1] 0.1192 0.2689 0.5000 0.7311 0.8808
Anonymous functions are especially handy as arguments to apply and sapply
Many problems in stats. come down to optimization
So do lots of problems in econ., physics, CS, bio, …
Lots of optimization problems require the gradient of the objective function
Gradient of \( f \) at \( x \): \[ \nabla f(x) = \left[\left.\frac{\partial f}{\partial x_1}\right|_{x} \ldots \left.\frac{\partial f}{\partial x_p}\right|_{x}\right] \]
find the partial derivative of f with respect to each component of x
return the vector of partial derivatives
It makes no sense to re-write this every time we change \( f \)!
\( \therefore \) write code to calculate the gradient of an arbitrary function
We could write our own, but there are lots of tricky issues
Fortunately, someone has already done this
From the package numDeriv
grad(func, x, ...) # Plus other arguments
func is a function which returns a single floating-point value
x is a vector of arguments to func
If x is a vector and func(x) is also a vector, then it's assumed func is vectorized and we get a vector of derivatives
... get passed along to func
require("numDeriv")
just_a_phase <- runif(n=1,min=-pi,max=pi)
all.equal(grad(func=cos,x=just_a_phase),-sin(just_a_phase))
[1] TRUE
phases <- runif(n=10,min=-pi,max=pi)
all.equal(grad(func=cos,x=phases),-sin(phases))
[1] TRUE
grad(func=function(x){x[1]^2+x[2]^3}, x=c(1,-1))
[1] 2 3
Note: grad is perfectly happy with func being an anonymous function!
Now we can use this as a piece of a larger machine:
gradient.descent <- function(f,x,max.iterations,step.scale,
stopping.deriv,...) {
for (iteration in 1:max.iterations) {
gradient <- grad(f,x,...)
if(all(abs(gradient) < stopping.deriv)) { break }
x <- x - step.scale*gradient
}
fit <- list(argmin=x,final.gradient=gradient,final.value=f(x,...),
iterations=iteration)
return(fit)
}
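A quick sanity check, under made-up settings (requires numDeriv, as above); the minimum of this quadratic is at (1,2):
quadratic <- function(x) { sum((x-c(1,2))^2) }
fit <- gradient.descent(f=quadratic, x=c(0,0), max.iterations=1000,
  step.scale=0.1, stopping.deriv=0.01)
fit$argmin   # should be close to c(1,2)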
f could be the mean squared error of a regression, the \( \psi \) error of a regression, a (negative log) likelihood, the cost of a production plan, …
f takes values for all names which aren't its arguments from the environment where it was defined, not the one where it is called (e.g., not from inside grad or gradient.descent)
If f and g are both complicated, avoid debugging g(f) as a block; divide the work by writing a very simple f.dummy to debug/test g, and debug/test the real f separately
Functions can be return values like anything else
make.noneuclidean <- function(ratio.to.diameter=pi) {
circumference <- function(d) { return(ratio.to.diameter*d) }
return(circumference)
}
try(circumference(10))
kings.i <- make.noneuclidean(3)
try(kings.i(10))
[1] 30
formals(kings.i)
$d
body(kings.i)
{
return(ratio.to.diameter * d)
}
environment(kings.i)
<environment: 0x100b588d8>
try(circumference(10))
Create a linear predictor, based on sample values of two variables
make.linear.predictor <- function(x,y) {
linear.fit <- lm(y~x)
predictor <- function(x) {
return(predict(object=linear.fit,newdata=data.frame(x=x)))
}
return(predictor)
}
The predictor function persists and works, even when the data we used to create it is gone
library(MASS); data(cats)
vet_predictor <- make.linear.predictor(x=cats$Bwt,y=cats$Hwt)
rm(cats) # Data set goes away
vet_predictor(3.5) # My cat's body mass in kilograms
1
13.76
nabla <- function(f,...) {
require("numDeriv")
g <- function(x,...) { grad(func=f,x=x,...) }
return(g)
}
Exercise: Write a test case!
You learned to use curve in the first week (because you did all of the assigned reading, including section 2.3.3 of the textbook)
A call to curve looks like this:
curve(expr, from = a, to = b, ...)
expr is some expression involving a variable called x, which is swept from the value a to the value b
... are other plot-control arguments
curve feeds the expression a vector x and expects a numeric vector back, e.g.
curve(x^2 * sin(x))
is fine
Our own functions work in curve too:
psi <- function(x,c=1) {ifelse(abs(x)>c,2*c*abs(x)-c^2,x^2)}
curve(psi(x,c=10),from=-20,to=20)
Try this! Also try
curve(psi(x=10,c=x),from=-20,to=20)
and explain it to yourself
If the expression doesn't take a vector of x values to a vector of results, curve becomes unhappy:
mse <- function(y0,a,Y=gmp$pcgmp,N=gmp$pop) {
  mean((Y - y0*(N^a))^2)
}
> curve(mse(a=x,y0=6611),from=0.10,to=0.15)
Error in curve(mse(a = x, y0 = 6611), from = 0.1, to = 0.15) :
'expr' did not evaluate to an object of length 'n'
In addition: Warning message:
In N^a : longer object length is not a multiple of shorter object length
How do we solve this?
One solution: sapply
sapply(seq(from=0.10,to=0.15,by=0.01),mse,y0=6611)
[1] 154701953 102322974 68755654 64529166 104079527 207057513
mse(6611,0.10)
[1] 154701953
mse.plottable <- function(a,...){ return(sapply(a,mse,...)) }
mse.plottable(seq(from=0.10,to=0.15,by=0.01),y0=6611)
[1] 154701953 102322974 68755654 64529166 104079527 207057513
curve(mse.plottable(a=x,y0=6611),from=0.10,to=0.20,xlab="a",ylab="MSE")
curve(mse.plottable(a=x,y0=5100),add=TRUE,col="blue")
Vectorize() returns a new, vectorized function:
mse.vec <- Vectorize(mse, vectorize.args=c("y0","a"))
mse.vec(a=seq(from=0.10,to=0.15,by=0.01),y0=6611)
[1] 154701953 102322974 68755654 64529166 104079527 207057513
mse.vec(a=1/8,y0=c(5000,6000,7000))
[1] 134617132 74693733 63732256
curve(mse.vec(a=x,y0=6611),from=0.10,to=0.20,xlab="a",ylab="MSE")
curve(mse.vec(a=x,y0=5100),add=TRUE,col="blue")
curve takes an expression and, as a side-effect, plots a 1-D curve by sweeping over x
Suppose we want something like that but sweeping over two variables
Built-in plotting function contour:
contour(x,y,z, [[other stuff]])
x and y are vectors of coordinates, z is a matrix of the corresponding shape (see help(contour) for graphical options)
Strategy: surface should make x and y sequences, evaluate the expression at each combination to get z, and then call contour
surface.1 <- function(f,from.x=0,to.x=1,from.y=0,to.y=1,n.x=101,
n.y=101,...) {
x.seq <- seq(from=from.x,to=to.x,length.out=n.x)
y.seq <- seq(from=from.y,to=to.y,length.out=n.y)
plot.grid <- expand.grid(x=x.seq,y=y.seq)
z.values <- apply(plot.grid,1,f)
z.matrix <- matrix(z.values,nrow=n.x)
contour(x=x.seq,y=y.seq,z=z.matrix,...)
invisible(list(x=x.seq,y=y.seq,z=z.matrix))
}
surface.1(function(p){return(sum(p^3))},from.x=-1,from.y=-1)
curve doesn't require us to write a function every time; what's its trick?
Expressions are just another class of R object, so they can be created and manipulated
One manipulation is evaluation
eval(expr,envir) evaluates the expression expr in the environment envir, which can be a data frame or even just a list
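For example (the expression and values here are arbitrary):
eval(expr=quote(x^2 + y), envir=list(x=2, y=3))   # 7
eval(quote(x^2 + y), envir=data.frame(x=c(1,2), y=c(10,20)))   # c(11, 24)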
When we type something like x^2+y^2 as an argument to surface.1, R tries to evaluate it prematurely
substitute returns the unevaluated expression
curve first uses substitute(expr) and then eval(expr,envir), having made the right envir
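A stripped-down sketch of the same trick (show.values is a made-up helper):
show.values <- function(expr, x) {
  unevaluated <- substitute(expr)   # capture the expression without evaluating it
  eval(unevaluated, envir=list(x=x))
}
show.values(x^2 * sin(x), x=1:3)   # evaluates the expression at x = 1, 2, 3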
surface.2 <- function(expr,from.x=0,to.x=1,from.y=0,to.y=1,n.x=101,
n.y=101,...) {
x.seq <- seq(from=from.x,to=to.x,length.out=n.x)
y.seq <- seq(from=from.y,to=to.y,length.out=n.y)
plot.grid <- expand.grid(x=x.seq,y=y.seq)
unevaluated.expression <- substitute(expr)
z.values <- eval(unevaluated.expression,envir=plot.grid)
z.matrix <- matrix(z.values,nrow=n.x)
contour(x=x.seq,y=y.seq,z=z.matrix,...)
invisible(list(x=x.seq,y=y.seq,z=z.matrix))
}
surface.2(abs(x^3)+abs(y^3),from.x=-1,from.y=-1)
Evaluating a function at every combination of two arguments is a really common task
There is a function to do it for us: outer
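For instance, with a throwaway function:
outer(X=1:3, Y=1:2, FUN=function(x,y) { 10*x + y })
# a 3x2 matrix whose [i,j] entry is FUN(X[i], Y[j])
Note that outer calls FUN once on the expanded coordinate vectors, so FUN needs to be vectorized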
surface.3 <- function(expr,from.x=0,to.x=1,from.y=0,to.y=1,n.x=101,
n.y=101,...) {
x.seq <- seq(from=from.x,to=to.x,length.out=n.x)
y.seq <- seq(from=from.y,to=to.y,length.out=n.y)
unevaluated.expression <- substitute(expr)
z <- function(x,y) {
return(eval(unevaluated.expression,envir=list(x=x,y=y)))
}
z.values <- outer(X=x.seq,Y=y.seq,FUN=z)
z.matrix <- matrix(z.values,nrow=n.x)
contour(x=x.seq,y=y.seq,z=z.matrix,...)
invisible(list(x=x.seq,y=y.seq,z=z.matrix, func=z))
}
surface.3(x^4-y^4,from.x=-1,from.y=-1)
surface.3(mse.vec(a=x,y0=y),from.x=0.10,to.x=0.15,
from.y=6e3,to.y=7e3,nlevels=20)
sapply (etc.), wrappers, anonymous functions as adapters
Maximum, and location of the maximum: takes \( f \), gives numbers \[ \max_{x}{f(x)} ~, ~ \argmax_{x}{f(x)} \]
Derivative of \( f \) at \( x_0 \): takes a function and a point, gives a number \[ \frac{df}{dx}(x_0) \equiv \lim_{h\rightarrow 0}{\frac{f(x_0+h) - f(x_0)}{h}} \]
Definite integral of \( f \) over \( [a,b] \): takes a function and two points, gives a number \[ \int_{a}^{b}{f(x) dx} \equiv \lim_{n\rightarrow \infty}{\sum_{i=0}^{n-1}{\left(\frac{b-a}{n}\right)f\left(a+i\frac{b-a}{n}\right)}} \]
Functions of functions which return numbers are sometimes called functionals, e.g., expectation values: \[ \mathbb{E}[f(X)] \equiv \int_{\mathrm{all}~x}{f(x) p(x) dx} \]
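R's built-in integrate is a functional in this sense: it takes a function (and two endpoints) and returns a number. For instance, the expectation of \( X^2 \) under a standard Gaussian:
integrate(function(x) { x^2 * dnorm(x) }, lower=-Inf, upper=Inf)   # about 1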
\( \nabla f(x_0) \) takes \( f \) and \( x_0 \), gives vector: not strictly a functional
\( \nabla f \) is another, vector-valued function
\( \nabla \) takes a function and returns a function
\( \nabla \) is an operator, not a functional
Something which takes a function in and gives a function back is an operator
Differentiation: the operator \( d/dx \) takes \( f \) and gives a new function
Gradient: the operator \( \nabla \) takes \( f \) and gives a new function
similarly \( \nabla \cdot \), \( \nabla \times \), …
Indefinite integration: \( \int_{-\infty}^{x}{f(u) du} \) takes \( f \) and gives a new function
Fourier transform: takes \( f \) and gives a new function \[ \tilde{f}(\omega) = \int_{x=-\infty}^{x=\infty}{f(x) e^{2i\pi \omega x} dx} \]
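R even has a (limited) symbolic differentiation operator: D takes an unevaluated expression and a variable name, and returns the expression for the derivative (it works on expressions, not on function objects):
D(quote(x^2 * sin(x)), name="x")   # something like 2 * x * sin(x) + x^2 * cos(x)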
How might we write something like the numDeriv package's grad ourselves?
Use the simplest possible method: change x by some amount, find the difference in f, take the slope
(the method="simple" option in numDeriv::grad)
gradient <- function(f,x,deriv.steps) {
  # not real code:
  # evaluate the function at x and at x+deriv.steps
  # take slopes to get partial derivatives
  # return the vector of partial derivatives
}
A naive implementation would use a for
loop
gradient <- function(f,x,deriv.steps,...) {
p <- length(x)
stopifnot(length(deriv.steps)==p)
f.old <- f(x,...)
gradient <- vector(length=p)
for (coordinate in 1:p) {
x.new <- x
x.new[coordinate] <- x.new[coordinate]+deriv.steps[coordinate]
f.new <- f(x.new,...)
gradient[coordinate] <- (f.new - f.old)/deriv.steps[coordinate]
}
return(gradient)
}
Works, but it's so repetitive!
Better: use matrix manipulation and apply
gradient <- function(f,x,deriv.steps,...) {
p <- length(x)
stopifnot(length(deriv.steps)==p)
x.new <- matrix(rep(x,times=p),nrow=p) + diag(deriv.steps,nrow=p)
f.new <- apply(x.new,2,f,...)
gradient <- (f.new - f(x,...))/deriv.steps
return(gradient)
}
(clearer, and half as long)
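A quick check against a gradient we know in closed form (the step sizes here are arbitrary): for \( f(x) = x_1^2 + x_2^2 \), the gradient at (1,2) is (2,4):
gradient(function(x) { sum(x^2) }, x=c(1,2), deriv.steps=c(1e-6,1e-6))
# approximately c(2, 4)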
Presumes that f takes a vector and returns a single number
Any extra arguments to gradient will get passed to f
Check: Does this work when f is a function of a single number?
Possible issues:
f is only defined on a limited domain and we ask for the gradient somewhere near a boundary
choosing deriv.steps: using the same deriv.steps everywhere is a bad idea, imagine \( f(x) = x^2\sin{x} \)
… and so on through much of a first course in numerical analysis (or at least sec. 5.7 of Numerical Recipes)