R: Kernel Consistent Quantile Regression Model Specification Test with Mixed Data Types

npqcmstest {np}

R Documentation

Kernel Consistent Quantile Regression Model Specification Test with Mixed Data Types

Description

npqcmstest implements a consistent test for correct specification of parametric quantile regression models (linear or nonlinear), as described in Racine (2006), which extends the work of Zheng (1998).

Usage

npqcmstest(formula,
           data = NULL,
           subset,
           xdat,
           ydat,
           model = stop(paste(sQuote("model")," has not been provided")),
           tau = 0.5,
           distribution = c("bootstrap", "asymptotic"),
           bwydat = c("y","varepsilon"),
           boot.method = c("iid","wild","wild-rademacher"),
           boot.num = 399,
           pivot = TRUE,
           density.weighted = TRUE,
           random.seed = 42,
           ...)

Arguments

`formula`	a symbolic description of variables on which the test is to be performed. The details of constructing a formula are described below.
`data`	an optional data frame, list or environment (or object coercible to a data frame by `as.data.frame`) containing the variables in the model. If not found in data, the variables are taken from `environment(formula)`, typically the environment from which the function is called.
`subset`	an optional vector specifying a subset of observations to be used.
`model`	a model object obtained from a call to `rq`. Important: the call to `rq` must have the argument `model=TRUE` or `npqcmstest` will not work.
`xdat`	a p-variate data frame of explanatory data (training data) used to calculate the quantile regression estimators.
`ydat`	a one (1) dimensional numeric or integer vector of dependent data, each element i corresponding to each observation (row) i of `xdat`.
`tau`	a numeric value specifying the quantile (the tau-th) at which the test is to be performed. Defaults to `0.5`.
`distribution`	a character string used to specify the method of estimating the distribution of the statistic to be calculated. `bootstrap` will conduct bootstrapping. `asymptotic` will use the normal distribution. Defaults to `bootstrap`.
`bwydat`	a character string used to specify the `ydat` used in bandwidth selection. `varepsilon` uses a two-valued variable taking the values 1-tau and -tau in place of `ydat`, while `y` uses the dependent variable y itself. Defaults to `y`.
`boot.method`	a character string used to specify the bootstrap method. `iid` will generate independent identically distributed draws. `wild` will use a wild bootstrap. `wild-rademacher` will use a wild bootstrap with Rademacher variables. Defaults to `iid`.
`boot.num`	an integer value specifying the number of bootstrap replications to use. Defaults to `399`.
`pivot`	a logical value specifying whether the statistic should be normalised such that it approaches N(0,1) in distribution. Defaults to `TRUE`.
`density.weighted`	a logical value specifying whether the statistic should be weighted by the density of `xdat`. Defaults to `TRUE`.
`random.seed`	an integer used to seed R's random number generator. This is to ensure replicability. Defaults to 42.
`...`	additional arguments supplied to control bandwidth selection on the residuals. One can specify the bandwidth type, kernel types, and so on. To do this, you may specify any of `bwscaling`, `bwtype`, `ckertype`, `ckerorder`, `ukertype`, `okertype`, as described in `npregbw`. This is necessary if you specify `bws` as a p-vector and not a `bandwidth` object, and you do not desire the default behaviours.
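
To make the distinction between the `wild` and `wild-rademacher` options more concrete, the following base R sketch draws two kinds of multipliers commonly used in wild bootstraps. It assumes that `wild` corresponds to the familiar two-point (Mammen) distribution; that is an assumption for illustration and is not taken from the np source. The helper names `rademacher_draws` and `mammen_draws` are hypothetical.

```r
set.seed(42)
n <- 10000

# Rademacher multipliers: -1 or +1, each with probability 1/2
rademacher_draws <- function(n) {
  sample(c(-1, 1), size = n, replace = TRUE)
}

# Two-point (Mammen) multipliers, constructed to have mean 0 and variance 1
# (assumed here to be what a plain "wild" bootstrap uses)
mammen_draws <- function(n) {
  a   <- (1 - sqrt(5)) / 2
  b   <- (1 + sqrt(5)) / 2
  p_a <- (sqrt(5) + 1) / (2 * sqrt(5))
  sample(c(a, b), size = n, replace = TRUE, prob = c(p_a, 1 - p_a))
}

v_rad <- rademacher_draws(n)
v_mam <- mammen_draws(n)

# Both sets of multipliers should have sample mean near 0 and variance near 1
c(mean = mean(v_rad), var = var(v_rad))
c(mean = mean(v_mam), var = var(v_mam))
```

In a wild bootstrap such multipliers rescale each residual individually, which preserves heteroskedasticity in the resampled errors, whereas `iid` resampling does not.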

Value

npqcmstest returns an object of type `cmstest` with the following components. Components will contain information related to `Jn` or `In` depending on the value of `pivot`:

`Jn`	the statistic `Jn`
`In`	the statistic `In`
`Omega.hat`	as described in Racine (2006)
`q.*`	the various quantiles of the statistic `Jn` (or `In` if `pivot=FALSE`) are in components `q.90`, `q.95`, `q.99` (one-sided 10%, 5%, 1% critical values)
`P`	the P-value of the statistic
`Jn.bootstrap`	if `pivot=TRUE` contains the bootstrap replications of `Jn`
`In.bootstrap`	if `pivot=FALSE` contains the bootstrap replications of `In`

`summary` supports objects of type `cmstest`.
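
The relationship between the bootstrap replications, the `q.*` components and `P` can be sketched in base R. The replications and the observed statistic below are simulated stand-ins, not output of `npqcmstest`, and the P-value rule used here (the proportion of replications exceeding the observed statistic) is an assumption about a typical one-sided bootstrap test rather than a statement about the np internals.

```r
set.seed(42)

boot_num <- 399               # same as the default for boot.num
jn_boot  <- rnorm(boot_num)   # hypothetical stand-in for Jn.bootstrap
jn_obs   <- 2.1               # hypothetical observed value of Jn

# One-sided 10%, 5% and 1% critical values (analogous to q.90, q.95, q.99)
crit <- quantile(jn_boot, probs = c(0.90, 0.95, 0.99))

# Bootstrap P-value: share of replications larger than the observed statistic
p_value <- mean(jn_boot > jn_obs)

crit
p_value
```

A small P-value (equivalently, an observed statistic above the chosen critical value) would lead to rejecting the null hypothesis that the parametric quantile model is correctly specified.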

Usage Issues

If you are using data of mixed types, then it is advisable to use the data.frame function to construct your input data and not cbind, since cbind will typically not work as intended on mixed data types and will coerce the data to the same type.
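
A minimal base R sketch of the issue, using made-up variables `x_num` and `x_fac` (hypothetical names chosen for illustration):

```r
x_num <- c(1.5, 2.0, 3.5)            # continuous variable
x_fac <- factor(c("a", "b", "a"))    # unordered categorical variable

# cbind() returns a matrix, so every column must share one type;
# the factor is silently replaced by its integer codes
m <- cbind(x_num, x_fac)
is.matrix(m)
is.numeric(m)

# data.frame() keeps each column's own type
df <- data.frame(x_num, x_fac)
sapply(df, class)
```

With the matrix `m` the categorical nature of `x_fac` is lost, while the data frame `df` retains the factor alongside the numeric column.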

Author(s)

Tristen Hayfield hayfield@phys.ethz.ch, Jeffrey S. Racine racinej@mcmaster.ca

References

Aitchison, J. and C.G.G. Aitken (1976), “Multivariate binary discrimination by the kernel method,” Biometrika, 63, 413-420.

Koenker, R.W. and G.W. Bassett (1978), “Regression quantiles,” Econometrica, 46, 33-50.

Li, Q. and J.S. Racine (2007), Nonparametric Econometrics: Theory and Practice, Princeton University Press.

Murphy, K. M. and F. Welch (1990), “Empirical age-earnings profiles,” Journal of Labor Economics, 8, 202-229.

Pagan, A. and A. Ullah (1999), Nonparametric Econometrics, Cambridge University Press.

Racine, J.S. (2006), “Consistent specification testing of heteroskedastic parametric regression quantile models with mixed data,” manuscript.

Wang, M.C. and J. van Ryzin (1981), “A class of smooth estimators for discrete distributions,” Biometrika, 68, 301-309.

Zheng, J. (1998), “A consistent nonparametric test of parametric regression models under conditional quantile restrictions,” Econometric Theory, 14, 123-138.

Examples

# EXAMPLE 1: For this example, we conduct a consistent quantile regression
# model specification test for a parametric wage quantile regression
# model that is quadratic in age. The work of Murphy and Welch (1990)
# would suggest that this parametric quantile regression model is
# misspecified.

# np provides npqcmstest() and the cps71 data; quantreg provides rq()

library("np")
library("quantreg")

data("cps71")
attach(cps71)

model <- rq(logwage~age+I(age^2), tau=0.5, model=TRUE)

plot(age, logwage)
lines(age, fitted(model))

X <- data.frame(age)

# Note - this may take a few minutes depending on the speed of your
# computer...

npqcmstest(model = model, xdat = X, ydat = logwage, tau=0.5)

## Not run: 

# Sleep for 5 seconds so that we can examine the output...

Sys.sleep(5)

# Next try Murphy & Welch's (1990) suggested quintic specification.

model <- rq(logwage~age+I(age^2)+I(age^3)+I(age^4)+I(age^5), tau=0.5, model=TRUE)

plot(age, logwage)
lines(age, fitted(model))

X <- data.frame(age)

# Note - this may take a few minutes depending on the speed of your
# computer...

npqcmstest(model = model, xdat = X, ydat = logwage, tau=0.5)

detach(cps71)
## End(Not run)

[Package np version 0.30-3 Index]