Analysis of variance models (ANOVA models) are used to study the relationship between predictor variables that are not necessarily quantitative and response variables that are quantitative. They are structured like linear regression models, and have the same basic assumptions, but they do not assume a linear relationship between the predictor and response variables.
Example.
Suppose we want to study the relationship between X = the price of a product, and Y = the sales volume in one month.
As a linear regression problem, this is simple. We want to fit the model
\[
Y = \beta_0 + \beta_1 X + \varepsilon .
\]
But this explicitly assumes that the relationship between price and sales is linear, and this need not be the case, as shown in the figure (for about 70 months' worth of data).
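An alternative to fitting the complex nonlinear models we talked about at the end of class last time is to choose a few price points, such as $20.00, $50.00, and $90.00, and define dummy variables
\[
X_j = \begin{cases} 1 & \text{if the price is at the $j$th price point,} \\ 0 & \text{otherwise,} \end{cases} \qquad j = 1, 2, 3,
\]
and then fit the regression model
\[
Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \varepsilon .
\]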
This is an example of a ``one-way ANOVA'' model.
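The dummy-variable fit can be sketched numerically. This is a minimal illustration rather than the course's own code: the six data points are made up, the names `price`, `sales`, and `levels` are hypothetical, and the design matrix is assumed to hold an intercept column plus one dummy per price point.

```python
import numpy as np

# Made-up data for illustration: price charged in each month and sales that month.
price = np.array([20.0, 20.0, 50.0, 50.0, 90.0, 90.0])
sales = np.array([120.0, 130.0, 100.0, 90.0, 95.0, 105.0])

# The three chosen price points.
levels = np.array([20.0, 50.0, 90.0])

# One dummy column per price point: 1 if that month used that price, else 0.
dummies = (price[:, None] == levels[None, :]).astype(float)

# Assumed design matrix: an intercept column followed by the three dummies.
X = np.column_stack([np.ones(len(price)), dummies])

# lstsq returns a least-squares solution even when the columns of X are
# linearly dependent (it picks the minimum-norm coefficient vector).
beta, _, rank, _ = np.linalg.lstsq(X, sales, rcond=None)
fitted = X @ beta

# Average sales at each price point, for comparison with the fitted values.
group_means = np.array([sales[price == p].mean() for p in levels])

print("fitted values:", fitted)
print("group means:  ", group_means)
```

The fitted value for each month is the average sales at that month's price point, so the fit follows whatever pattern the group averages have, without forcing them onto a straight line in price.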
There are three things to note about this approach:
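One of these is that every observation sits at exactly one price point, so the dummy variables sum to 1 for each observation. If the model also contains an intercept, the dummy columns of the design matrix add up to the intercept column, showing that the model is overparameterized (linearly dependent columns).
All ANOVA models are overparameterized in their
``natural'' form, and we often must make a choice about what a
sensible ``reduced'' parameterization is.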
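To make the overparameterization concrete, the sketch below checks the rank of the full design matrix and then fits two common reduced parameterizations: dropping the intercept (often called the cell-means form) and dropping one dummy (reference-level coding). The data are made up and all variable names are hypothetical. Which reduction these notes go on to prefer is not stated here, so both are shown only as illustrations.

```python
import numpy as np

# Made-up data for illustration: price charged in each month and sales that month.
price = np.array([20.0, 20.0, 50.0, 50.0, 90.0, 90.0])
sales = np.array([120.0, 130.0, 100.0, 90.0, 95.0, 105.0])
levels = np.array([20.0, 50.0, 90.0])

ones = np.ones(len(price))
dummies = (price[:, None] == levels[None, :]).astype(float)

# "Natural" form: intercept plus all three dummies (4 columns).
X_full = np.column_stack([ones, dummies])
rank_full = np.linalg.matrix_rank(X_full)

# Reduction 1 (cell means): drop the intercept, keep all three dummies.
X_means = dummies
beta_means = np.linalg.lstsq(X_means, sales, rcond=None)[0]

# Reduction 2 (reference level): keep the intercept, drop the first dummy,
# so the first price point becomes the baseline.
X_ref = np.column_stack([ones, dummies[:, 1:]])
beta_ref = np.linalg.lstsq(X_ref, sales, rcond=None)[0]

print("columns in full design:", X_full.shape[1], "rank:", rank_full)
print("cell-means coefficients:     ", beta_means)
print("reference-level coefficients:", beta_ref)
```

In the cell-means form each coefficient is the average sales at one price point. In the reference-level form the intercept is the average at the baseline price and the remaining coefficients are differences from that baseline. Both reductions give the same fitted values; they differ only in how the coefficients are interpreted.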