Prediction of Early Breast Cancer Metastasis from DNA Microarray Data Using High-Dimensional Cox Regression Models
Journal article, Cancer Informatics, Year: 2015

Abstract

BACKGROUND: DNA microarray studies have identified gene expression signatures predictive of metastatic relapse in early breast cancer. Standard feature selection procedures applied to reduce the set of predictive genes did not take into account the correlation between genes. In this paper, we studied the performance of three high-dimensional regression methods, CoxBoost, LASSO (Least Absolute Shrinkage and Selection Operator), and Elastic net, for identifying prognostic signatures in patients with early breast cancer.

METHODS: We analyzed three public retrospective datasets, including a total of 384 patients with axillary lymph node-negative breast cancer. The Amsterdam training set of van't Veer (78 patients) was used to determine the optimal gene sets and classifiers, using sensitivity thresholds that resulted in misclassification of no more than 10% of the poor-prognosis group. To ensure comparability between the methods, an automatic selection procedure was used to determine the number of genes included in each model. The van de Vijver and Desmedt datasets were used as validation sets to evaluate separately the prognostic performance of our classifiers. The results were compared with the original Amsterdam 70-gene classifier.

RESULTS: The automatic selection procedure reduced the number of predictive genes to as few as six. In the two validation sets, the three models (Elastic net, LASSO, and CoxBoost) yielded genomic classifiers that predicted the 5-year metastatic status with similar performance: 59%, 56%, and 54% accuracy, 83%, 75%, and 83% sensitivity, and 53%, 52%, and 48% specificity, respectively, in the Desmedt dataset. In comparison, the Amsterdam 70-gene signature showed 45% accuracy, 97% sensitivity, and 34% specificity. The gene overlap and the classification concordance between the three classifiers were high. All the classifiers added significant prognostic information to that provided by the traditional prognostic factors. They also showed a very high overlap in the gene ontologies (GOs) associated with genes overexpressed in the predicted poor-prognosis vs. good-prognosis classes, which were centred on cell proliferation. Interestingly, all classifiers showed high sensitivity for predicting the 4-year status of metastatic disease.

CONCLUSIONS: High-dimensional regression methods are attractive in prognostic studies because finding a small subset of genes may facilitate the transfer to the clinic, and also because they strengthen the robustness of the model by limiting the selection of false-positive predictive genes. With only six genes, the CoxBoost classifier predicted the 4-year status of metastatic disease with 93% sensitivity. Selecting a few genes related to ontologies other than cell proliferation might further improve the overall sensitivity performance.
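
The threshold rule mentioned in METHODS (misclassifying no more than 10% of the poor-prognosis group) and the accuracy, sensitivity, and specificity figures in RESULTS can be illustrated with a minimal sketch. It assumes each patient already has a numeric risk score from some fitted model, with higher scores meaning higher risk. The function names and the toy scores are hypothetical and are not taken from the paper, whose actual procedure may differ.

```python
import math


def threshold_for_sensitivity(scores, is_poor, max_miss_rate=0.10):
    """Pick the highest cutoff that misses at most `max_miss_rate`
    of the poor-prognosis patients (score >= cutoff means 'poor')."""
    poor_scores = sorted(s for s, p in zip(scores, is_poor) if p)
    if not poor_scores:
        raise ValueError("no poor-prognosis patients in the training data")
    # Number of poor-prognosis patients we may leave below the cutoff.
    allowed_misses = math.floor(max_miss_rate * len(poor_scores))
    return poor_scores[allowed_misses]


def performance(scores, is_poor, cutoff):
    """Return (accuracy, sensitivity, specificity) for a given cutoff."""
    tp = fp = tn = fn = 0
    for score, poor in zip(scores, is_poor):
        predicted_poor = score >= cutoff
        if predicted_poor and poor:
            tp += 1
        elif predicted_poor and not poor:
            fp += 1
        elif not predicted_poor and poor:
            fn += 1
        else:
            tn += 1
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return accuracy, sensitivity, specificity


# Toy data (illustrative only): 10 poor-prognosis and 5 good-prognosis patients.
poor = [0.2, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3]
good = [0.1, 0.3, 0.4, 0.55, 0.65]
scores = poor + good
labels = [True] * len(poor) + [False] * len(good)

cutoff = threshold_for_sensitivity(scores, labels)
accuracy, sensitivity, specificity = performance(scores, labels, cutoff)
```

Fixing the cutoff by sensitivity on the training set, as in this sketch, trades specificity for a low rate of missed poor-prognosis patients, which matches the pattern of high sensitivity and modest specificity reported above.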
Main file
cin-suppl.2-2015-129.pdf (727.03 KB)
Origin: Publisher files allowed on an open archive

Dates and versions

inserm-01996812, version 1 (28-01-2019)

Cite

Christophe Zemmour, Francois Bertucci, Pascal Finetti, Bernard Chetrit, Daniel Birnbaum, et al. Prediction of Early Breast Cancer Metastasis from DNA Microarray Data Using High-Dimensional Cox Regression Models. Cancer Informatics, 2015, 14(Suppl 2), pp.129-138. ⟨10.4137/cin.s17284⟩. ⟨inserm-01996812⟩