Conference papers

Penalized additive logistic regression for cardiovascular risk prediction

Abstract : Predicting individual risk is needed to target preventive interventions toward people with the highest probability of benefit over a given time period. Estimates of cardiovascular risk are currently based on statistical models inferred from cohort data with methods such as logistic regression. Although attractively simple, the logistic model fails in some situations:
1) If the number of prognostic factors is large (relative to the number of observations) or if they are highly correlated, then the variance of the coefficient estimates may be high, leading to inaccurate predictions. Subset selection is extensively used to address this difficulty. Another way to overcome it consists in imposing a penalty on large fluctuations of the estimated parameters. The lasso estimates a vector of linear regression coefficients by minimizing the residual sum of squares subject to a constraint on the l_1-norm of the coefficient vector. An interesting feature of the l_1-norm constraint is that it shrinks some coefficients and sets others to exactly zero. In addition, the smooth form of the constraint leads to a less variable model than that provided by subset selection.
2) In real life, effects are generally not linear. When the exposure under study is continuous, linear models may not accurately characterize the exposure-response curve. A generalization of the standard logistic model is the additive logistic model.
The aim of this study is to model parsimoniously the relationship between a binary response and several continuous covariates when the effects of the covariates may be nonlinear. We present a new method for variable selection and function estimation in nonparametric additive logistic models fitted by cubic smoothing splines: penalized additive logistic regression. The method is based on a generalization of the lasso. Because of their nature, its constraints shrink linear and nonlinear coefficients, setting some of them to exactly zero. Hence, they yield parsimonious models, select significant variables, and reveal nonlinearities in the effects of predictors. Penalized additive logistic regression is applied to predict the risk of cardiovascular disease in a real database from the INDANA project (Individual Data Analysis of Antihypertensive Intervention Trials).
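The sparsity property of the l_1 penalty described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it fits a plain linear logistic model (no smoothing splines, no additive components) in the penalized (Lagrangian) form of the l_1 constraint, using proximal gradient descent, a technique chosen here only for brevity. All function names, the simulated data, and the values of `penalty`, `step`, and `n_iter` are assumptions made for illustration.

```python
import math
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero; return exactly 0.0 when |value| <= threshold."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_l1_logistic(X, y, penalty, step=0.1, n_iter=500):
    """Proximal gradient descent on mean logistic loss + penalty * ||beta||_1.

    Hypothetical helper for illustration; the intercept is not penalized.
    """
    n, p = len(X), len(X[0])
    intercept, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        # Difference between fitted probability and observed outcome.
        residuals = [
            sigmoid(intercept + sum(b * x for b, x in zip(beta, row))) - target
            for row, target in zip(X, y)
        ]
        grad_intercept = sum(residuals) / n
        grad_beta = [
            sum(r * row[j] for r, row in zip(residuals, X)) / n for j in range(p)
        ]
        # Gradient step on the smooth loss, then the l_1 proximal step.
        intercept -= step * grad_intercept
        beta = [
            soft_threshold(b - step * g, step * penalty)
            for b, g in zip(beta, grad_beta)
        ]
    return intercept, beta


def simulate(n=200, seed=0):
    """Simulated data (an assumption): only the first two of five covariates matter."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        row = [rng.gauss(0, 1) for _ in range(5)]
        prob = sigmoid(2.0 * row[0] - 1.5 * row[1])
        X.append(row)
        y.append(1 if rng.random() < prob else 0)
    return X, y


if __name__ == "__main__":
    X, y = simulate()
    for penalty in (0.0, 0.05, 10.0):
        _, beta = fit_l1_logistic(X, y, penalty)
        print(penalty, [round(b, 3) for b in beta])
```

The soft-thresholding step is what produces exact zeros: any coefficient whose update falls within the threshold is set to 0.0 rather than merely made small, which is the behavior that distinguishes the l_1 penalty from a purely shrinking one.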
Document type : Conference papers

Contributor : Marta Avalos
Submitted on : Tuesday, May 11, 2010 - 2:11:32 PM
Last modification on : Wednesday, April 14, 2021 - 12:26:01 PM
Long-term archiving on : Thursday, September 16, 2010 - 12:12:25 PM


  • HAL Id : inserm-00149854, version 1



Marta Avalos, Yves Grandvalet, Christophe Ambroise. Penalized additive logistic regression for cardiovascular risk prediction. 2004, pp.301. ⟨inserm-00149854⟩


