O. Trelles, P. Prins, M. Snir, and R. Jansen, Big data, but are we ready?, Nature Reviews Genetics, vol.12, issue.3, p.224, 2011.
DOI : 10.1038/nrg2857-c1

J. Fontana, E. Alexander, and M. Salvatore, Translational research in infectious disease: current paradigms and challenges ahead, Translational Research, vol.159, issue.6, pp.430-453, 2012.
DOI : 10.1016/j.trsl.2011.12.009

URL : http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3361696

N. Shah and J. Tenenbaum, The coming age of data-driven medicine: translational bioinformatics' next frontier, Journal of the American Medical Informatics Association, vol.19, issue.e1, pp.2-4, 2012.
DOI : 10.1136/amiajnl-2012-000969

P. Bougnères and A. Valleron, Causes of early-onset type 1 diabetes: toward data-driven environmental approaches, The Journal of Experimental Medicine, vol.205, issue.13, pp.2953-2957, 2008.
DOI : 10.1084/jem.20082622

H. Choi and N. Pavelka, When One and One Gives More than Two: Challenges and Opportunities of Integrative Omics, Frontiers in Genetics, vol.2, p.105, 2011.
DOI : 10.3389/fgene.2011.00105

T. Murdoch and A. Detsky, The Inevitable Application of Big Data to Health Care, JAMA, vol.309, issue.13, pp.1351-1352, 2013.
DOI : 10.1001/jama.2013.393

H. Liao and H. Lynn, A survey of variable selection methods in two Chinese epidemiology journals, BMC Medical Research Methodology, vol.10, issue.1, p.87, 2010.
DOI : 10.1186/1471-2288-10-87

S. Walter and H. Tiemeier, Variable selection: current practice in epidemiological studies, European Journal of Epidemiology, vol.24, issue.12, pp.733-736, 2009.
DOI : 10.1007/s10654-009-9411-2

URL : http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2791468

P. Peduzzi, J. Concato, E. Kemper, T. Holford, and A. Feinstein, A simulation study of the number of events per variable in logistic regression analysis, Journal of Clinical Epidemiology, vol.49, issue.12, pp.1373-1379, 1996.
DOI : 10.1016/S0895-4356(96)00236-3

P. Smyth, Data mining: data analysis on a grand scale?, Statistical Methods in Medical Research, vol.9, issue.4, pp.309-327, 2000.
DOI : 10.1177/096228020000900402

P. Austin, A comparison of regression trees, logistic regression, generalized additive models, and multivariate adaptive regression splines for predicting AMI mortality, Statistics in Medicine, vol.26, issue.15, pp.2937-2957, 2007.
DOI : 10.1002/sim.2770

J. Maroco, D. Silva, A. Rodrigues, M. Guerreiro, I. Santana et al., Data mining methods in the prediction of Dementia: A real-data comparison of the accuracy, sensitivity and specificity of linear discriminant analysis, logistic regression, neural networks, support vector machines, classification trees and random forests, BMC Research Notes, vol.4, issue.1, p.299, 2011.
DOI : 10.1186/1756-0500-4-299

M. Green, J. Björk, J. Forberg, U. Ekelund, L. Edenbrandt et al., Comparison between neural networks and multiple logistic regression to predict acute coronary syndrome in the emergency room, Artificial Intelligence in Medicine, vol.38, issue.3, pp.305-318, 2006.
DOI : 10.1016/j.artmed.2006.07.006

O. Regnier-Coudert, J. McCall, R. Lothian, T. Lam, S. McClinton et al., Machine learning for improved pathological staging of prostate cancer: A performance comparison on a range of classifiers, Artificial Intelligence in Medicine, vol.55, issue.1, pp.25-35, 2012.
DOI : 10.1016/j.artmed.2011.11.003

P. Austin, D. Lee, E. Steyerberg, and J. Tu, Regression trees for predicting mortality in patients with cardiovascular disease: What improvement is achieved by using ensemble-based methods?, Biometrical Journal, vol.54, issue.5, pp.657-673, 2012.
DOI : 10.1002/bimj.201100251

P. Austin, J. Tu, J. Ho, D. Levy, and D. Lee, Using methods from the data-mining and machine-learning literature for disease classification and prediction: a case study examining classification of heart failure subtypes, Journal of Clinical Epidemiology, vol.66, issue.4, pp.398-407, 2013.
DOI : 10.1016/j.jclinepi.2012.11.008

R. Tibshirani, Regression shrinkage and selection via the Lasso, J R Stat Soc Ser B, vol.58, pp.267-288, 1996.
DOI : 10.1111/j.2517-6161.1996.tb02080.x

C. Xu, A. van der Schaaf, C. Schilstra, J. Langendijk, and A. van 't Veld, Impact of Statistical Learning Methods on the Predictive Power of Multivariate Normal Tissue Complication Probability Models, International Journal of Radiation Oncology*Biology*Physics, vol.82, issue.4, pp.677-684, 2012.
DOI : 10.1016/j.ijrobp.2011.09.036

M. Avalos, N. Adroher, E. Lagarde, F. Thiessard, Y. Grandvalet et al., Prescription-Drug-Related Risk in Driving, Epidemiology, vol.23, issue.5, pp.706-712, 2012.
DOI : 10.1097/EDE.0b013e31825fa528

URL : https://hal.archives-ouvertes.fr/hal-00742317

N. Lapidus, X. De-lamballerie, N. Salez, M. Setbon, P. Ferrari et al., Integrative study of pandemic A/H1N1 influenza infections: design and methods of the CoPanFlu-France cohort, BMC Public Health, vol.12, issue.1, p.417, 2012.
DOI : 10.1186/1471-2458-12-417

URL : https://hal.archives-ouvertes.fr/inserm-00730688

M. Reijans, G. Dingemans, C. Klaassen, J. Meis, J. Keijdener et al., RespiFinder: a New Multiparameter Test To Differentially Identify Fifteen Respiratory Viruses, Journal of Clinical Microbiology, vol.46, issue.4, pp.1232-1240, 2008.
DOI : 10.1128/JCM.02294-07

URL : http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2292964

European Medicines Agency, Committee for Proprietary Medicinal Products, Note for guidance on harmonization of requirements for influenza vaccines, 2009.

N. Lapidus, X. De-lamballerie, N. Salez, M. Setbon, R. Delabre et al., Factors Associated with Post-Seasonal Serological Titer and Risk Factors for Infection with the Pandemic A/H1N1 Virus in the French General Population, PLoS ONE, vol.8, issue.4, p.e60127, 2013.
DOI : 10.1371/journal.pone.0060127

URL : https://hal.archives-ouvertes.fr/hal-01122215

J. Friedman, Greedy function approximation: a gradient boosting machine, The Annals of Statistics, vol.29, issue.5, pp.1189-1232, 2001.

T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2009.

J. Friedman, Stochastic gradient boosting, Computational Statistics & Data Analysis, vol.38, issue.4, pp.367-378, 2002.
DOI : 10.1016/S0167-9473(01)00065-2

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.31.1666

P. McCullagh and J. Nelder, Generalized Linear Models, 1989.

J. Friedman, T. Hastie, and R. Tibshirani, Regularization Paths for Generalized Linear Models via Coordinate Descent, Journal of Statistical Software, vol.33, issue.1, pp.1-22, 2010.
DOI : 10.18637/jss.v033.i01

URL : http://doi.org/10.18637/jss.v033.i01

T. Hesterberg, D. Moore, S. Monaghan, A. Clipson, R. Epstein et al., Bootstrap Methods and Permutation Tests, Introd to Pract Stat, 2005.

A. Altmann, L. Toloşi, O. Sander, and T. Lengauer, Permutation importance: a corrected feature importance measure, Bioinformatics, vol.26, issue.10, pp.1340-1347, 2010.
DOI : 10.1093/bioinformatics/btq134

R. Steuer, J. Kurths, C. Daub, J. Weise, and J. Selbig, The mutual information: Detecting and evaluating dependencies between variables, Bioinformatics, vol.18, issue.Suppl 2, pp.231-240, 2002.
DOI : 10.1093/bioinformatics/18.suppl_2.S231

A. Liaw and M. Wiener, Classification and regression by randomForest, R News, vol.2, issue.3, pp.18-22, 2002.

G. Ridgeway, Generalized boosted models: a guide to the gbm package, pp.1-12, 2007.

W. Touw, J. Bayjanov, L. Overmars, L. Backus, J. Boekhorst et al., Data mining in the Life Sciences with Random Forest: a walk in the park or lost in the jungle?, Briefings in Bioinformatics, vol.14, issue.3, pp.315-326, 2013.
DOI : 10.1093/bib/bbs034

L. Tolosi and T. Lengauer, Classification with correlated features: unreliability of feature ranking and solutions, Bioinformatics, vol.27, issue.14, pp.1986-1994, 2011.
DOI : 10.1093/bioinformatics/btr300

R. Bender and S. Lange, Adjusting for multiple testing - when and how?, Journal of Clinical Epidemiology, vol.54, issue.4, pp.343-349, 2001.
DOI : 10.1016/S0895-4356(00)00314-0

R. Bender and S. Lange, Multiple test procedures other than Bonferroni's deserve wider use, BMJ, vol.318, issue.7183, pp.600-601, 1999.
DOI : 10.1136/bmj.318.7183.600a

H. Zou and T. Hastie, Regularization and variable selection via the elastic net, Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol.67, issue.2, pp.301-320, 2005.
DOI : 10.1111/j.1467-9868.2005.00503.x

S. Ng, V. Fang, D. Ip, K. Chan, G. Leung et al., Estimation of the Association Between Antibody Titers and Protection Against Confirmed Influenza Virus Infection in Children, Journal of Infectious Diseases, vol.208, issue.8, pp.1320-1324, 2013.
DOI : 10.1093/infdis/jit372

S. Riley, K. Kwok, K. Wu, D. Ning, B. Cowling et al., Epidemiological Characteristics of 2009 (H1N1) Pandemic Influenza Based on Paired Sera from a Longitudinal Community Cohort Study, PLoS Medicine, vol.8, issue.6, p.e1000442, 2011.
DOI : 10.1371/journal.pmed.1000442

J. Simmerman, P. Suntarattiwong, J. Levy, R. Jarman, S. Kaewchana et al., Findings from a household randomized controlled trial of hand washing and face masks to reduce influenza transmission in Bangkok, Thailand, Influenza and Other Respiratory Viruses, vol.5, issue.4, pp.256-267, 2011.
DOI : 10.1111/j.1750-2659.2011.00205.x

K. Kloepfer, J. Olenec, W. Lee, G. Liu, R. Vrtis et al., Increased H1N1 Infection Rate in Children with Asthma, American Journal of Respiratory and Critical Care Medicine, vol.185, issue.12, pp.1275-1279, 2012.
DOI : 10.1164/rccm.201109-1635OC