Journal article. Diagnostics, Year: 2022

A Comparison of Techniques for Class Imbalance in Deep Learning Classification of Breast Cancer

Ricky Walsh, Mickael Tardy

Abstract

In recent years, tools based on deep learning models have been created to aid radiologists in diagnosing breast cancer from mammograms. However, the datasets used to train these models may suffer from class imbalance: there are often fewer malignant samples than benign or healthy cases, which can bias the model towards the healthy class. In this study, we systematically evaluate several popular techniques for dealing with this class imbalance, namely class weighting, over-sampling, and under-sampling, as well as a synthetic lesion generation approach that increases the number of malignant samples. These techniques are applied when training on three diverse Full-Field Digital Mammography datasets, and the resulting models are tested on in-distribution and out-of-distribution samples. The experiments show that a greater imbalance is associated with a greater bias towards the majority class, and that this bias can be counteracted by any of the standard class imbalance techniques. On the other hand, these methods provide no benefit to model performance as measured by the Area Under the Receiver Operating Characteristic Curve (AUC-ROC); indeed, under-sampling leads to a reduction of 0.066 in AUC in the case of a 19:1 benign-to-malignant imbalance. Our synthetic lesion methodology leads to better performance in most cases, with increases of up to 0.07 in AUC on out-of-distribution test sets over the next best experiment.
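The three standard techniques named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the inverse-frequency weighting formula, and the toy label list are assumptions chosen for illustration, and the synthetic lesion generation approach is not shown. The toy data uses the 19:1 benign-to-malignant ratio mentioned above.

```python
import random
from collections import Counter


def group_indices(labels):
    """Map each class label to the list of sample indices carrying it."""
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    return by_class


def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    One common weighting scheme (an assumption here); rarer classes get
    proportionally larger weights in the loss.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * m) for c, m in counts.items()}


def oversample(labels, rng):
    """Indices in which every class is padded, by sampling with
    replacement, up to the size of the largest class."""
    by_class = group_indices(labels)
    target = max(len(v) for v in by_class.values())
    idx = []
    for v in by_class.values():
        idx.extend(v)
        idx.extend(rng.choices(v, k=target - len(v)))
    return idx


def undersample(labels, rng):
    """Indices in which every class is cut down, by sampling without
    replacement, to the size of the smallest class."""
    by_class = group_indices(labels)
    target = min(len(v) for v in by_class.values())
    idx = []
    for v in by_class.values():
        idx.extend(rng.sample(v, target))
    return idx


# Hypothetical toy labels: 190 benign (0) and 10 malignant (1), i.e. 19:1.
labels = [0] * 190 + [1] * 10
rng = random.Random(0)

weights = class_weights(labels)
over_idx = oversample(labels, rng)
under_idx = undersample(labels, rng)

print(weights)
print(Counter(labels[i] for i in over_idx))
print(Counter(labels[i] for i in under_idx))
```

With these toy labels, the malignant class receives 19 times the weight of the benign class, over-sampling yields 190 samples per class, and under-sampling keeps only 10 per class. Discarding most of the majority class in this way reduces the amount of training data, which is one plausible reading of the AUC reduction the abstract reports for under-sampling.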
Main file
published-paper.pdf (10.79 MB)
Origin: Publication funded by an institution

Dates and versions

inserm-03927923, version 1 (06-01-2023)

Identifiers

HAL Id: inserm-03927923
DOI: 10.3390/diagnostics13010067

Cite

Ricky Walsh, Mickael Tardy. A Comparison of Techniques for Class Imbalance in Deep Learning Classification of Breast Cancer. Diagnostics, 2022, 13 (1), pp.67. ⟨10.3390/diagnostics13010067⟩. ⟨inserm-03927923⟩