Conference paper. Year: 2004

Penalized additive logistic regression for cardiovascular risk prediction

Abstract

Predicting individual risk is needed to target preventive interventions toward the people with the highest probability of benefit over a given time period. Cardiovascular risk estimates currently rely on statistical models inferred from cohort data with methods such as logistic regression. Although attractively simple, the logistic model fails in some situations:

1) If the number of prognostic factors is large relative to the number of observations, or if the factors are highly correlated, the variance of the coefficient estimates may be high, which degrades prediction accuracy. Subset selection is widely used to address this difficulty. An alternative is to impose a penalty on large fluctuations of the estimated parameters. The lasso estimates a vector of linear regression coefficients by minimizing the residual sum of squares subject to a constraint on the l_1-norm of the coefficient vector. An interesting feature of the l_1-norm constraint is that it shrinks some coefficients and sets others to exactly zero. Moreover, the smooth form of the constraint leads to a less variable model than subset selection does.

2) In real life, effects are generally not linear. When the study exposure is continuous, linear models may not accurately characterize the exposure-response curve. A generalization of the standard logistic model is the additive logistic model.

The aim of this study is to model parsimoniously the relationship between a binary response and several continuous covariates when the covariate effects may be nonlinear. We present a new method for variable selection and function estimation in nonparametric additive logistic models fitted by cubic smoothing splines: penalized additive logistic regression. The method is based on a generalization of the lasso.
Because of their nature, these constraints shrink linear and nonlinear coefficients, some of them going exactly to zero. Hence, they give parsimonious models, select significant variables, and reveal nonlinearities in the effects of predictors. Penalized additive logistic regression is applied to predict the risk of cardiovascular disease on a real database from the INDANA project (Individual Data Analysis of Antihypertensive Intervention Trials).
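The shrink-and-select behaviour of an l_1-norm penalty can be illustrated with a minimal sketch. This is not the authors' method: it fits a plain linear logistic model rather than an additive model with cubic smoothing splines, and it uses proximal gradient descent (ISTA) as a stand-in solver. The function names (`soft_threshold`, `fit_l1_logistic`), the synthetic data, and the penalty value are all assumptions made for illustration.

```python
import math
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero; return exactly 0.0 when |value| <= threshold."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_l1_logistic(X, y, penalty, step=0.5, n_iter=300):
    """Minimize mean logistic loss + penalty * ||beta||_1 by proximal gradient.

    The intercept is left unpenalized (an assumption of this sketch).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    intercept = 0.0
    for _ in range(n_iter):
        grad = [0.0] * p
        grad0 = 0.0
        for xi, yi in zip(X, y):
            # Residual of the current fit: predicted probability minus outcome.
            r = sigmoid(intercept + sum(b * v for b, v in zip(beta, xi))) - yi
            grad0 += r / n
            for j in range(p):
                grad[j] += r * xi[j] / n
        intercept -= step * grad0
        # Gradient step on the smooth loss, then soft-thresholding for the l_1 term.
        beta = [soft_threshold(b - step * g, step * penalty)
                for b, g in zip(beta, grad)]
    return intercept, beta


# Hypothetical synthetic data: 5 covariates, only the first two affect the outcome.
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(5)] for _ in range(400)]
y = [1 if random.random() < sigmoid(3.0 * x[0] - 3.0 * x[1]) else 0 for x in X]

intercept, beta = fit_l1_logistic(X, y, penalty=0.12)
print("intercept:", intercept)
print("coefficients:", beta)
```

In this sketch the soft-thresholding step is what produces coefficients that are exactly zero rather than merely small, while the retained coefficients are shrunk toward zero. A smaller `penalty` keeps more covariates in the model; a larger one removes more of them.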

Dates and versions

inserm-00149854, version 1 (11-05-2010)

Identifiers

  • HAL Id: inserm-00149854, version 1

Cite

Marta Avalos, Yves Grandvalet, Christophe Ambroise. Penalized additive logistic regression for cardiovascular risk prediction. 2004, pp.301. ⟨inserm-00149854⟩