Model selection via the lasso in conditional logistic regression
Abstract
We propose a model selection procedure for matched case-control studies, specifically for conditional logistic regression. The method is based on a penalized conditional likelihood with an L1-type penalty on the regression coefficients, the so-called lasso. This penalty, which shrinks coefficients to improve prediction accuracy, is particularly well suited to settings where the number of covariates is large relative to the number of events, or where the covariates are collinear. An attractive property of the lasso is that it performs parameter estimation and variable selection simultaneously. The implementation is based on a simple modification of the R package "penalized", developed by J. Goeman, which fits generalized linear and Cox proportional hazards models with L1 or L2 penalties. The efficiency of the algorithms makes it practical to use resampling methods to choose the regularization parameter and to evaluate the stability of the selected model. Accordingly, K-fold likelihood cross-validation and the bootstrap are applied, both taking into account the dependent nature of the data. The methodology is illustrated with several examples.
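To make the penalized criterion concrete, the following minimal Python sketch minimizes the negative conditional log-likelihood plus lambda times the L1 norm of the coefficients, for matched sets containing exactly one case each. It is an illustration only and not the algorithm of the "penalized" package: the solver is plain proximal gradient descent (ISTA) with a conservative fixed step, and the names `neg_cond_loglik`, `fit_clogit_lasso`, `strata` and `lam` are hypothetical choices made for this example.

```python
import numpy as np


def neg_cond_loglik(beta, X, y, strata):
    """Negative conditional log-likelihood and its gradient.

    Assumes exactly one case (y == 1) per matched set; `strata` holds
    the matched-set label of each row of X.
    """
    eta = X @ beta
    nll = 0.0
    grad = np.zeros_like(beta, dtype=float)
    for s in np.unique(strata):
        idx = strata == s
        Xs, es, ys = X[idx], eta[idx], y[idx]
        m = es.max()                      # shift for numerical stability
        e = np.exp(es - m)
        log_sum = m + np.log(e.sum())     # log of sum_j exp(x_j' beta)
        w = e / e.sum()                   # within-set "softmax" weights
        nll -= es[ys == 1][0] - log_sum
        grad -= Xs[ys == 1][0] - w @ Xs   # case covariates minus weighted mean
    return nll, grad


def fit_clogit_lasso(X, y, strata, lam, n_iter=500):
    """Lasso-penalized conditional logistic fit by proximal gradient (ISTA)."""
    beta = np.zeros(X.shape[1])
    # The Hessian is bounded above by X'X, so 1 / ||X||_2^2 is a safe step.
    step = 1.0 / np.linalg.norm(X, 2) ** 2
    for _ in range(n_iter):
        _, grad = neg_cond_loglik(beta, X, y, strata)
        z = beta - step * grad
        # Soft-thresholding is the proximal operator of the L1 penalty.
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return beta
```

Because the penalty is applied through soft-thresholding, some coefficients are set exactly to zero, which is how estimation and variable selection happen in one step. In this sketch, any `lam` at least as large as the largest absolute gradient component at `beta = 0` yields the empty model, and smaller values let covariates enter progressively. Choosing `lam` by cross-validation would require splitting the data by matched set rather than by individual, in line with the dependence noted in the abstract.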