Building test data from real outbreaks for evaluating detection algorithms

Gaëtan Texier; Michael Jackson; Leonel Siwe; Jean-Baptiste Meynard; Xavier Deparis; Herve Chaudet

doi:10.1371/journal.pone.0183992

Journal article, PLoS ONE, Year: 2017


Gaëtan Texier

Role: Corresponding author
PersonId : 975670


Sciences Economiques et Sociales de la Santé & Traitement de l'Information Médicale

Centre Pasteur du Cameroun

Michael Jackson

Role: Author

Group Health Research Institute [Seattle]

Leonel Siwe

Role: Author

Sub-Regional Institute of Statistics and Applied Economics [Yaoundé, Cameroun]

Jean-Baptiste Meynard

Role: Author
PersonId : 21185
IdHAL : jean-baptiste-meynard
IdRef : 110974859

Centre d'épidémiologie et de santé publique des armées [Marseille]

Xavier Deparis

Role: Author

Centre d'épidémiologie et de santé publique des armées [Marseille]

Herve Chaudet

Role: Author

Sciences Economiques et Sociales de la Santé & Traitement de l'Information Médicale

Abstract

Benchmarking surveillance systems requires realistic simulations of disease outbreaks. However, obtaining these data in sufficient quantity, with a realistic shape and covering a sufficient range of agents, sizes and durations, is known to be very difficult. The dataset of generated outbreak signals should reflect the likely distribution of authentic situations faced by the surveillance system, including very unlikely outbreak signals. We propose and evaluate a new approach that uses historical outbreak data to simulate tailored outbreak signals. The method relies on a homothetic transformation of the historical distribution followed by a resampling process (Binomial, Inverse Transform Sampling Method (ITSM), Metropolis-Hastings Random Walk, Metropolis-Hastings Independent, Gibbs Sampler, Hybrid Gibbs Sampler). We carried out an analysis to identify the most important input parameters for simulation quality and to evaluate the performance of each resampling algorithm. Our analysis confirms that the type of algorithm used and the simulation parameters (i.e. days, number of cases, outbreak shape, overall scale factor) influence the results. We show that, regardless of the outbreaks, algorithms and metrics chosen for the evaluation, simulation quality decreased as the number of days simulated increased and improved as the number of cases simulated increased. Simulating outbreaks with fewer cases than days of duration (i.e. overall scale factor less than 1) resulted in a substantial loss of information during the simulation. We found that Gibbs sampling with a shrinkage procedure provides a good balance between accuracy and data dependency. If dependency is of little importance, the binomial and ITSM methods are accurate. Given the constraint of keeping the simulation within the range of plausible epidemiological curves faced by the surveillance system, our study confirms that our approach can be used to generate a large spectrum of outbreak signals.
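As a rough illustration of the two-step idea described in the abstract (rescale a historical epidemic curve, then resample cases from it), the following minimal Python sketch stretches a daily case curve to a new duration and draws a requested number of cases by inverse transform sampling. It is not the authors' implementation: the function names `rescale_curve` and `sample_outbreak`, the use of linear interpolation for the rescaling step, and the example counts are all assumptions made for illustration only.

```python
import bisect
import random
from itertools import accumulate


def rescale_curve(counts, target_days):
    """Stretch or shrink a daily case curve to target_days.

    Assumption: the time axis is rescaled by linear interpolation, and the
    result is normalised into a probability distribution over days.
    """
    n = len(counts)
    if n < 2 or target_days < 2:
        raise ValueError("both curves need at least two days")
    resampled = []
    for j in range(target_days):
        pos = j * (n - 1) / (target_days - 1)  # position on the original axis
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        resampled.append((1 - frac) * counts[lo] + frac * counts[hi])
    total = sum(resampled)
    if total <= 0:
        raise ValueError("curve must contain at least one case")
    return [value / total for value in resampled]


def sample_outbreak(probabilities, n_cases, rng):
    """Assign n_cases to days by inverse transform sampling."""
    cdf = list(accumulate(probabilities))
    simulated = [0] * len(probabilities)
    for _ in range(n_cases):
        u = rng.random()
        # First day whose cumulative probability exceeds u; the clamp guards
        # against floating-point rounding in the last CDF entry.
        day = min(bisect.bisect_right(cdf, u), len(cdf) - 1)
        simulated[day] += 1
    return simulated


# Hypothetical historical outbreak: 24 cases over 6 days.
historical = [1, 4, 9, 6, 3, 1]
probabilities = rescale_curve(historical, target_days=11)
simulated = sample_outbreak(probabilities, n_cases=50, rng=random.Random(0))
```

Here 50 cases are spread over 11 days, so the ratio of cases to days is well above 1. Calling `sample_outbreak` with fewer cases than days would leave many days empty, which gives an intuition for the information loss the abstract reports for scale factors below 1.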

Domains

Public health and epidemiology

Main file

pone.0183992.pdf (4.39 MB)

Origin: Publisher files allowed on an open archive

Contributor: Claire Lissalde

https://inserm.hal.science/inserm-01986527

Submitted on: Friday, 18 January 2019, 18:34:26

Last modified on: Wednesday, 15 November 2023, 10:52:17

Dates and versions

inserm-01986527 , version 1 (18-01-2019)

Identifiers

HAL Id : inserm-01986527 , version 1
DOI : 10.1371/journal.pone.0183992
PUBMED : 28863159
PUBMEDCENTRAL : PMC5593515

Cite

Gaëtan Texier, Michael Jackson, Leonel Siwe, Jean-Baptiste Meynard, Xavier Deparis, et al. Building test data from real outbreaks for evaluating detection algorithms. PLoS ONE, 2017, 12 (9), pp.e0183992. ⟨10.1371/journal.pone.0183992⟩. ⟨inserm-01986527⟩


Collections

INSERM IRD SSA RIIP UNIV-AMU RIIP_CAMEROUN U1252

146 views

109 downloads
