Controversies in modern evolutionary biology: the imperative for error detection and quality control. - Inserm - Institut national de la santé et de la recherche médicale
Journal article, BMC Genomics. Year: 2012

Controversies in modern evolutionary biology: the imperative for error detection and quality control.

Abstract

BACKGROUND: The data from high-throughput genomics technologies provide unique opportunities for studies of complex biological systems, but also pose many new challenges. The shift to the genome scale in evolutionary biology, for example, has led to many interesting, but often controversial, studies. It has been suggested that part of the conflict may be due to errors in the initial sequences. Most gene sequences are predicted by bioinformatics programs, and a number of quality issues have been raised concerning DNA sequencing errors or badly predicted coding regions, particularly in eukaryotes.

RESULTS: We investigated the impact of these errors on evolutionary studies and specifically on the identification of important genetic events. We focused on the detection of asymmetric evolution after duplication, which has recently been the subject of controversy. Using the human genome as a reference, we established a reliable set of 688 duplicated genes in 13 complete vertebrate genomes, where significantly different evolutionary rates are observed. We estimated the rates at which protein sequence errors occur and accumulate in the higher-level analyses. We showed that the majority of the detected events (57%) are in fact artifacts due to putatively erroneous sequences, and that these artifacts are sufficient to mask the true functional significance of the events.

CONCLUSIONS: Initial errors accumulate throughout the evolutionary analysis, generating artificially high rates of event predictions and leading to substantial uncertainty in the conclusions. This study emphasizes the urgent need for error detection and quality control strategies in order to efficiently extract knowledge from the new genome data.
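To give a rough feel for why initial errors can accumulate in higher-level analyses, the sketch below computes the chance that a comparison drawing on several sequences includes at least one erroneous sequence. It is only an illustration under simplifying assumptions: errors are treated as independent across sequences, the function name `prob_any_error` is hypothetical, and the per-sequence error rate used in the demonstration is a made-up value, not a figure from the study. It is not the method used by the authors.

```python
def prob_any_error(per_sequence_error_rate: float, n_sequences: int) -> float:
    """Probability that at least one of n sequences contains an error.

    Assumes errors occur independently with the same probability in
    every sequence (a simplification chosen for illustration only).
    """
    if not 0.0 <= per_sequence_error_rate <= 1.0:
        raise ValueError("per_sequence_error_rate must lie in [0, 1]")
    if n_sequences < 0:
        raise ValueError("n_sequences must be non-negative")
    # P(no error in any sequence) = (1 - p) ** n, so take the complement.
    return 1.0 - (1.0 - per_sequence_error_rate) ** n_sequences


if __name__ == "__main__":
    hypothetical_rate = 0.10  # assumed value, NOT taken from the paper
    for n in (1, 2, 5, 13):
        p = prob_any_error(hypothetical_rate, n)
        print(f"{n:2d} sequences -> P(at least one error) = {p:.3f}")
```

Under these assumptions the probability rises quickly with the number of sequences involved, which is one way to picture how modest per-sequence error rates could affect a multi-genome analysis.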
Main file
1471-2164-13-5.pdf (896.87 KB)
1471-2164-13-5-S1.PDF (1.56 MB)
1471-2164-13-5-S2.PDF (291.58 KB)
1471-2164-13-5.xml (139.79 KB)
Origin: Publisher files allowed on an open archive

Dates and versions

inserm-00682055, version 1 (23-03-2012)

Cite

Francisco Prosdocimi, Benjamin Linard, Pierre Pontarotti, Olivier Poch, Julie Thompson. Controversies in modern evolutionary biology: the imperative for error detection and quality control. BMC Genomics, 2012, 13 (1), pp.5. ⟨10.1186/1471-2164-13-5⟩. ⟨inserm-00682055⟩
344 views
284 downloads
