npscoef {np} | R Documentation |
npscoef computes a kernel regression estimate of a one (1) dimensional dependent variable on p-variate explanatory data, using the model Y_i = t(W_i) * gamma(Z_i) + u_i, where t(W_i) = (1, t(X_i)), given a set of evaluation points, training points (consisting of explanatory data and dependent data), and a bandwidth specification. A bandwidth specification can be a scbandwidth object, or a bandwidth vector, bandwidth type and kernel type.
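To make the model concrete, here is a minimal base R sketch of a local-constant smooth coefficient estimate of gamma at a single point z0. It assumes a scalar Z, a Gaussian kernel, and a fixed bandwidth h chosen by hand. The helper sc_gamma is hypothetical and is not part of np; it illustrates the textbook kernel-weighted least squares form and should not be read as the package's implementation.

```r
# Hypothetical helper (not part of np): local-constant estimate of gamma(z0)
# in the model Y_i = t(W_i) * gamma(Z_i) + u_i with t(W_i) = (1, t(X_i)).
sc_gamma <- function(z0, x, y, z, h) {
  W <- cbind(1, x)              # rows are t(W_i) = (1, X_i)
  k <- dnorm((z - z0) / h)      # Gaussian kernel weight for each observation
  A <- crossprod(W, W * k)      # sum_i k_i * W_i %*% t(W_i)
  b <- crossprod(W, y * k)      # sum_i k_i * W_i * Y_i
  drop(solve(A, b))             # (intercept, slope) evaluated at z0
}

set.seed(42)
n <- 250
x <- runif(n)
z <- runif(n, min = -2, max = 2)
y <- x * exp(z) * (1.0 + rnorm(n, sd = 0.2))

# Assumed bandwidth h = 0.3; in practice npscoefbw selects the bandwidth.
g <- sc_gamma(z0 = 0, x = x, y = y, z = z, h = 0.3)

# Fitted conditional mean at x = 0.5, z = 0 is t(W) %*% gamma(z0)
fit0 <- sum(c(1, 0.5) * g)
```

The estimate is a weighted least squares fit of y on (1, x) in which observations with Z_i close to z0 receive the largest weights, so the coefficients are allowed to vary smoothly with z.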
npscoef(bws, ...)

## S3 method for class 'formula':
npscoef(bws, data = NULL, newdata = NULL, ...)

## S3 method for class 'call':
npscoef(bws, ...)

## Default S3 method:
npscoef(bws, txdat, tydat, tzdat, ...)

## S3 method for class 'scbandwidth':
npscoef(bws,
        txdat = stop("training data 'txdat' missing"),
        tydat = stop("training data 'tydat' missing"),
        tzdat = NULL,
        exdat,
        eydat,
        ezdat,
        residuals = FALSE,
        errors = TRUE,
        iterate = TRUE,
        maxiter = 100,
        tol = .Machine$double.eps,
        leave.one.out = FALSE,
        betas = FALSE,
        ...)
bws |
a bandwidth specification. This can be set as a scbandwidth object returned from an invocation of npscoefbw, or as a vector of bandwidths, with each element i corresponding to the bandwidth for column i in tzdat. If specified as a vector, additional arguments will need to be supplied as necessary to specify the bandwidth type, kernel types, training data, and so on.
|
... |
additional arguments supplied to specify the regression type, bandwidth type, kernel types, selection methods, and so on. To do this, you may specify any of bwscaling, bwtype, ckertype, ckerorder, as described in npscoefbw.
|
data |
an optional data frame, list or environment (or object coercible to a data frame by as.data.frame) containing the variables in the model. If not found in data, the variables are taken from environment(bws), typically the environment from which npscoefbw was called.
|
newdata |
An optional data frame in which to look for evaluation data. If omitted, the training data are used. |
txdat |
a p-variate data frame of explanatory data (training data), which, by default, populates columns 2 through p+1 of W in the model equation and, in the absence of tzdat, also corresponds to Z in the model equation. Defaults to the training data used to compute the bandwidth object.
|
tydat |
a one (1) dimensional numeric or integer vector of dependent data, each element i corresponding to each observation (row) i of txdat. Defaults to the training data used to compute the bandwidth object.
|
tzdat |
an optionally specified q-variate data frame of explanatory data (training data), which corresponds to Z in the model equation. Defaults to the training data used to compute the bandwidth object. |
exdat |
a p-variate data frame of points on which the regression will be estimated (evaluation data). By default, evaluation takes place on the data provided by txdat.
|
eydat |
a one (1) dimensional numeric or integer vector of the true values of the dependent variable. Optional, and used only to calculate the true errors. |
ezdat |
an optionally specified q-variate data frame of points on which the regression will be estimated (evaluation data), which corresponds to Z in the model equation. By default, this is the same as txdat.
|
errors |
a logical value indicating whether or not asymptotic standard errors should be computed and returned in the resulting smoothcoefficient object. Defaults to TRUE.
|
residuals |
a logical value indicating whether or not residuals should be computed and returned in the resulting smoothcoefficient object. Defaults to FALSE.
|
iterate |
a logical value indicating whether or not backfitted estimates should be iterated for self-consistency. Defaults to TRUE.
|
maxiter |
integer specifying the maximum number of times to iterate the backfitted estimates while attempting to make them converge to the desired tolerance. Defaults to 100.
|
tol |
desired tolerance on the relative convergence of backfit estimates. Defaults to .Machine$double.eps.
|
leave.one.out |
a logical value to specify whether or not to compute the leave-one-out estimates. Will not work if e[xyz]dat is specified. Defaults to FALSE.
|
betas |
a logical value indicating whether or not estimates of the components of gamma should be returned in the smoothcoefficient object along with the regression estimates. Defaults to FALSE.
|
npscoef returns a smoothcoefficient object. The generic functions fitted, residuals, coef, se, and predict extract (or generate) estimated values, residuals, coefficients, bootstrapped standard errors on estimates, and predictions, respectively, from the returned object. Furthermore, the functions summary and plot support objects of this type. The returned object has the following components:
eval |
evaluation points |
mean |
estimation of the regression function (conditional mean) at the evaluation points |
merr |
if errors = TRUE, standard errors of the regression estimates |
beta |
if betas = TRUE, estimates of the coefficients gamma at the evaluation points |
resid |
if residuals = TRUE, in-sample or out-of-sample residuals where appropriate (or possible) |
R2 |
coefficient of determination |
MSE |
mean squared error |
MAE |
mean absolute error |
MAPE |
mean absolute percentage error |
CORR |
absolute value of Pearson's correlation coefficient |
SIGN |
fraction of observations where fitted and observed values agree in sign |
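As a rough illustration of the summary measures listed above, the following base R sketch computes MSE, MAE, MAPE, CORR and SIGN from a vector of observed values and a vector of fitted values, using their conventional definitions. The helper fit_summary is hypothetical and is not part of np; the exact formulas used inside the package (for R2 in particular, which is omitted here) are not stated on this page and may differ.

```r
# Hypothetical helper (not part of np): conventional fit summaries
# computed from observed values y and fitted values yhat.
fit_summary <- function(y, yhat) {
  e <- y - yhat
  c(MSE  = mean(e^2),                    # mean squared error
    MAE  = mean(abs(e)),                 # mean absolute error
    MAPE = mean(abs(e / y)),             # mean absolute percentage error
    CORR = abs(cor(y, yhat)),            # absolute Pearson correlation
    SIGN = mean(sign(y) == sign(yhat)))  # share of sign agreements
}

y    <- c(1, -2, 4)
yhat <- c(2, -1, -4)
s <- fit_summary(y, yhat)
print(s)
```

MAPE is undefined when any observed value is exactly zero, so that measure should be read with care on such data.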
If you are using data of mixed types, then it is advisable to use the data.frame function to construct your input data and not cbind, since cbind will typically not work as intended on mixed data types and will coerce the data to the same type.
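The coercion can be seen directly in base R. The short sketch below uses made-up variables (x, g and s are illustrative names only) to compare what cbind and data.frame do with a numeric vector combined with a factor or a character vector.

```r
x <- c(1.5, 2.5, 3.5)            # continuous variable
g <- factor(c("a", "b", "a"))    # unordered categorical variable
s <- c("u", "v", "w")            # character variable

m1 <- cbind(x, g)   # numeric matrix: the factor is reduced to its integer codes
m2 <- cbind(x, s)   # character matrix: the numbers are converted to strings

d <- data.frame(x = x, g = g)    # each column keeps its own type
str(d)
```

With cbind the factor levels are lost (m1) or the numeric column becomes text (m2), whereas the data frame preserves the numeric and factor columns as they were supplied.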
Support for backfitted bandwidths is experimental and is limited in functionality. The code does not support asymptotic standard errors or out of sample estimates with backfitting.
Tristen Hayfield hayfield@phys.ethz.ch, Jeffrey S. Racine racinej@mcmaster.ca
Aitchison, J. and C.G.G. Aitken (1976), “Multivariate binary discrimination by the kernel method,” Biometrika, 63, 413-420.
Cai, Z. (2007), “Trending time-varying coefficient time series models with serially correlated errors,” Journal of Econometrics, 136, 163-188.
Hastie, T. and R. Tibshirani (1993), “Varying-coefficient models,” Journal of the Royal Statistical Society, B 55, 757-796.
Li, Q. and J.S. Racine (2007), Nonparametric Econometrics: Theory and Practice, Princeton University Press.
Li, Q. and J.S. Racine (2008), “Smooth varying-coefficient estimation and inference for qualitative and quantitative data,” manuscript.
Pagan, A. and A. Ullah (1999), Nonparametric Econometrics, Cambridge University Press.
Racine, J.S., D. Ouyang and Q. Li (2007), “Nonparametric multilevel models: a smoothing approach,” manuscript.
Wang, M.C. and J. van Ryzin (1981), “A class of smooth estimators for discrete distributions,” Biometrika, 68, 301-309.
bw.nrd, bw.SJ, hist, npudens, npudist, npudensbw, npscoefbw
# EXAMPLE 1 (INTERFACE=FORMULA):

n <- 250
x <- runif(n)
z <- runif(n, min=-2, max=2)
y <- x*exp(z)*(1.0+rnorm(n,sd = 0.2))
bw <- npscoefbw(y~x|z)
model <- npscoef(bw)
plot(model)

## Not run:

# EXAMPLE 1 (INTERFACE=DATA FRAME):

n <- 250
x <- runif(n)
z <- runif(n, min=-2, max=2)
y <- x*exp(z)*(1.0+rnorm(n,sd = 0.2))
bw <- npscoefbw(xdat=x, ydat=y, zdat=z)
model <- npscoef(bw)
plot(model)

## End(Not run)