Journal article in Diagnostics, 2022

A Comparison of Techniques for Class Imbalance in Deep Learning Classification of Breast Cancer

Abstract

Tools based on deep learning models have been created in recent years to aid radiologists in the diagnosis of breast cancer from mammograms. However, the datasets used to train these models may suffer from class imbalance, i.e., there are often fewer malignant samples than benign or healthy cases, which can bias the model towards the healthy class. In this study, we systematically evaluate several popular techniques for dealing with this class imbalance, namely class weighting, oversampling, and undersampling, as well as a synthetic lesion generation approach that increases the number of malignant samples. These techniques are applied when training on three diverse Full-Field Digital Mammography datasets, and the resulting models are tested on in-distribution and out-of-distribution samples. The experiments show that a greater imbalance is associated with a greater bias towards the majority class, which can be counteracted by any of the standard class imbalance techniques. On the other hand, these methods provide no benefit to model performance as measured by the Area Under the Receiver Operating Characteristic Curve (AUC-ROC), and undersampling in fact reduces AUC by 0.066 in the case of a 19:1 benign-to-malignant imbalance. Our synthetic lesion methodology leads to better performance in most cases, with increases of up to 0.07 in AUC on out-of-distribution test sets over the next best experiment.
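To make the three standard techniques named in the abstract concrete, the following is a minimal sketch on a toy label list with the 19:1 benign-to-malignant ratio mentioned above. It is not the authors' implementation: the function names (`class_weights`, `oversample`, `undersample`) are hypothetical, the inverse-frequency weighting formula is one common heuristic assumed here, and synthetic lesion generation is not covered.

```python
import random
from collections import Counter


def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    Assumed heuristic; the paper's exact weighting scheme is not given here.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * m) for c, m in counts.items()}


def _indices_by_class(labels):
    """Group sample indices by their class label."""
    groups = {}
    for i, y in enumerate(labels):
        groups.setdefault(y, []).append(i)
    return groups


def oversample(labels, rng):
    """Indices in which each class is resampled (with replacement)
    up to the size of the largest class."""
    groups = _indices_by_class(labels)
    target = max(len(v) for v in groups.values())
    idx = []
    for v in groups.values():
        idx.extend(v)
        idx.extend(rng.choices(v, k=target - len(v)))
    return idx


def undersample(labels, rng):
    """Indices in which each class is randomly reduced (without replacement)
    to the size of the smallest class."""
    groups = _indices_by_class(labels)
    target = min(len(v) for v in groups.values())
    idx = []
    for v in groups.values():
        idx.extend(rng.sample(v, target))
    return idx


# Toy data: 190 benign (0) and 10 malignant (1) samples, i.e. 19:1.
labels = [0] * 190 + [1] * 10
rng = random.Random(0)

weights = class_weights(labels)
over_idx = oversample(labels, rng)
under_idx = undersample(labels, rng)

print(weights)
print(Counter(labels[i] for i in over_idx))
print(Counter(labels[i] for i in under_idx))
```

In this sketch the weights would typically be passed to a weighted loss, while the index lists would select the samples used in each training epoch. Undersampling discards most of the majority class, which is one plausible reason it can cost AUC at high imbalance ratios, as the abstract reports.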
Main file
published-paper.pdf (10.79 MB)
Origin: Publication funded by an institution

Dates and versions

inserm-03927923, version 1 (06-01-2023)

Cite

Ricky Walsh, Mickael Tardy. A Comparison of Techniques for Class Imbalance in Deep Learning Classification of Breast Cancer. Diagnostics, 2022, 13 (1), pp.67. ⟨10.3390/diagnostics13010067⟩. ⟨inserm-03927923⟩