Ch 7: Variable selection

7.1 Evaluating subsets of predictors
  7.1.1 R^2_adj
    (interlude: ML theory for regression - seems I might have done or alluded to this before...)
    (what might be more useful is a review of the LR test & the partial F test...)
    - nested vs non-nested
  7.1.2 AIC
    * related to the LR test
    * p+2 vs p+1 (the extra parameter is sigma^2, I assume)
  7.1.3 CAIC, corrected AIC ("magic")
  7.1.4 BIC (just "magic")
  7.1.5 comparison... "eh..."
7.2 "Deciding the collection"... really: organizing the search
  - nested vs non-nested...
  7.2.1 All subsets
  7.2.2 Stepwise
    - backward vs forward
    - relative merits
  7.2.3 Inference after selection
    - double-dipping
7.3 Cross-validation methods
  - two-sample methods protect against double-dipping (small-sample effects may occur...)
  - k-fold produces a better estimate of prediction accuracy
7.4 Lasso...

---------
Ch 6 of ISLR
Gelman-Hill recs, p. 68
---------
https://www.stata.com/support/faqs/statistics/stepwise-regression-problems/
https://www4.stat.ncsu.edu/~post/josh/LASSO_Ridge_Elastic_Net_-_Examples.html (a bit lengthy)
https://www.analyticsvidhya.com/blog/2017/06/a-comprehensive-guide-for-linear-ridge-and-lasso-regression/ (long, but has the right details about regularization)
https://onlinecourses.science.psu.edu/stat501/node/330/ (good description of R^2_adj and Mallows' Cp)
---------

1. MSE-related measures
   - MSE and in-sample error
   - R^2_adj
   - Mallows' Cp
   (both basically track MSE)
   problems with in-sample approaches:
   * capitalization on chance
   * double-dipping for inference
2. Other in-sample methods
   - F & t tests - we've seen those
   - likelihood ratio tests - boil down to minimizing RSS
   - penalized likelihood methods: -2 LL + penalty
     in regression: n*log(RSS/n) + penalty
     AIC, CAIC, BIC
3. Typical search methods
   - all subsets
   - stepwise (forward or backward)
   - Harrell's warning
4. Penalized estimation methods
   recall LL ~ RSS
   minimize RSS + (smoothing penalty)
   - ridge
   - lasso
   - elastic net
5. Inference after fitting
   - two-sample cross-validation
   - k-fold cross-validation (on the training sample!)
   - in-sample corrections (AIC, BIC; current research on the lasso & related methods, to get good SEs of parameters, etc.)
6. What do I really do?
   One can use automatic methods to suggest variables that might or might not be important, but subject-matter expertise rules.
   Gelman & Hill recs.
   The people you work with will expect that:
   (a) you are a high priest of variable selection who knows the "right" way to do it - they will tend to accept anything you say
   (b) you can provide them with statistical cover for whatever model they want
   Neither is correct!
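The penalized-likelihood criteria in item 2 (-2 LL + penalty, which for Gaussian regression reduces to n*log(RSS/n) + penalty) can be sketched in a few lines. This is a minimal pure-Python illustration, not from the notes: the toy data, the helper names (`ols_simple`, `aic`, `bic`), and the parameter count k = p + 2 (slope terms, intercept, and sigma^2 - the "p+2" above) are my own choices for the example.

```python
import math

def ols_simple(x, y):
    """Closed-form OLS for y = b0 + b1*x; returns (b0, b1, RSS)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    rss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return b0, b1, rss

def aic(rss, n, k):
    # Gaussian -2 log-likelihood up to a constant is n*log(RSS/n);
    # k counts all estimated parameters, including sigma^2.
    return n * math.log(rss / n) + 2 * k

def bic(rss, n, k):
    # same fit term, but the penalty grows with log(n)
    return n * math.log(rss / n) + math.log(n) * k

# toy data with a genuine linear signal (illustrative only)
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1]
n = len(x)

# null model: intercept only (k = 2: mean + sigma^2)
ybar = sum(y) / n
rss0 = sum((yi - ybar) ** 2 for yi in y)

# simple regression: intercept + slope (k = 3)
_, _, rss1 = ols_simple(x, y)

print(aic(rss0, n, 2), aic(rss1, n, 3))  # slope model has the lower AIC here
print(bic(rss0, n, 2), bic(rss1, n, 3))
```

Because both criteria share the n*log(RSS/n) fit term, the comparison boils down to whether the drop in RSS buys off the extra penalty - the same trade-off the LR test formalizes for nested models.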
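The penalized estimation idea in item 4 - minimize RSS + (smoothing penalty) - can be made concrete with the lasso. A sketch of cyclic coordinate descent, assuming standardized columns (centered, mean-square one) so each update is a simple soft-threshold; the function names, the tiny orthogonal design, and the penalty value are all assumptions for illustration, not the notes' own code.

```python
def soft_threshold(z, lam):
    """Soft-thresholding operator: the source of the lasso's exact zeros."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_cd(X, y, lam, iters=50):
    """Lasso by cyclic coordinate descent.
    Minimizes (1/2n)*RSS + lam * sum(|beta_j|), assuming each column of X
    is centered with mean square 1 (so the coordinate update is exact)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        for j in range(p):
            # partial residual: leave coordinate j out of the fit
            r = [y[i] - sum(X[i][k] * beta[k] for k in range(p) if k != j)
                 for i in range(n)]
            rho = sum(X[i][j] * r[i] for i in range(n)) / n
            beta[j] = soft_threshold(rho, lam)
    return beta

# toy orthogonal design; y depends only on the first column
X = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
y = [2, 2, -2, -2]
beta = lasso_cd(X, y, lam=0.5)
print(beta)  # first coefficient shrunk below its OLS value of 2; second exactly 0
```

The example shows the two behaviors the outline contrasts with ridge: the active coefficient is shrunk (here from the OLS value 2 toward 0 by exactly lam) and the irrelevant one is set to exactly zero, which is why the lasso does selection and ridge does not.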
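The k-fold cross-validation in item 5 can be sketched as follows: hold out each fold in turn, fit on the rest, and average the held-out squared errors. A minimal pure-Python version comparing an intercept-only fit to a simple linear fit; the data, fold scheme (contiguous folds, no shuffling), and helper names are my own illustrative choices.

```python
def fit_mean(xtr, ytr):
    """Intercept-only model: predict the training mean everywhere."""
    m = sum(ytr) / len(ytr)
    return lambda x: m

def fit_line(xtr, ytr):
    """Simple linear regression fit, returned as a prediction function."""
    n = len(xtr)
    xbar, ybar = sum(xtr) / n, sum(ytr) / n
    sxx = sum((xi - xbar) ** 2 for xi in xtr)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(xtr, ytr))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return lambda x: b0 + b1 * x

def kfold_mse(x, y, fit, k=4):
    """Average held-out squared error over k contiguous folds."""
    n = len(x)
    errs = []
    for f in range(k):
        test = set(range(f * n // k, (f + 1) * n // k))
        train = [i for i in range(n) if i not in test]
        pred = fit([x[i] for i in train], [y[i] for i in train])
        errs.extend((y[i] - pred(x[i])) ** 2 for i in test)
    return sum(errs) / n

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1]
print(kfold_mse(x, y, fit_line), kfold_mse(x, y, fit_mean))
```

Because every prediction is made on data the model never saw, this estimate does not capitalize on chance the way in-sample RSS does - which is exactly the double-dipping protection the outline credits to cross-validation. As noted above, when CV is used to pick a model, it should be run on the training sample only, with a final test sample kept aside.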