Inbreeding Coefficient Estimation with Dense SNP Data: Comparison of Strategies and Application to HapMap III

Abstract:

Background/Aims: If the parents of an individual are related, it is possible for the individual to have received at a locus two identical by descent (IBD) alleles that are copies of a single allele carried by the parents' common ancestor. The inbreeding coefficient measures the probability of this event and increases with increasing relatedness between the parents. It is traditionally computed from the observed inbreeding loops in the genealogies, and its accuracy thus depends on the depth and reliability of the genealogies. With the availability of genome-wide genetic data, it has become possible to compute a genome-based inbreeding coefficient f, and different methods have been developed to estimate f and identify inbred individuals in a sample from the observed patterns of homozygosity at markers.

Methods: In this paper, we performed simulations with known genealogies using different SNP panels with different levels of linkage disequilibrium (LD) to compare several estimators of f, including single-point estimates, methods based on the length of runs of homozygosity (ROHs) and different methods that use hidden Markov models (HMMs). We also compared the performance of some of these estimators in identifying inbred individuals in a sample, using either HMM likelihood ratio tests or an adapted version of the ERSA software.

Results: Single-point methods were found to have higher standard deviations than other methods. ROHs gave the best estimates provided the correct length threshold is known. HMMs on sparse data gave equivalent or better results than HMMs modeling LD. Provided LD is correctly accounted for, inbreeding estimates were very similar using the different SNP panels. HMM likelihood ratio tests were found to perform better at detecting inbred individuals in a sample than the adapted ERSA. All methods accurately detected inbreeding up to 2nd cousin offspring. We applied the best method to release 3 of the HapMap phase III project, found up to 4% of inbred individuals, and created HAP1067, an unrelated and outbred dataset from this release.

Conclusions: We recommend using HMMs on multiple sparse maps to estimate and detect inbreeding in large samples. If the sample of individuals is too small to estimate allele frequencies, we advise estimating them on reference panels or using 1,500 kb ROHs. Finally, we advise investigators using HapMap to be careful with inbred individuals, especially in the GIH population.
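As an illustration of the ROH-based approach mentioned in the abstract, the sketch below computes a commonly used estimator: the fraction of the genome covered by ROHs at least as long as a chosen threshold. This is an assumption about the general form of such an estimator, not the authors' exact implementation. The function name `f_roh`, the segment coordinates, and the 100 Mb genome length are hypothetical; only the 1,500 kb threshold comes from the abstract. Segments are assumed to be non-overlapping.

```python
def f_roh(roh_segments_bp, genome_length_bp, min_length_bp=1_500_000):
    """Fraction of the genome covered by ROHs of at least min_length_bp.

    roh_segments_bp: iterable of (start, end) positions in base pairs,
    assumed non-overlapping (hypothetical input format).
    """
    if genome_length_bp <= 0:
        raise ValueError("genome_length_bp must be positive")
    covered = sum(
        end - start
        for start, end in roh_segments_bp
        if end - start >= min_length_bp  # drop ROHs shorter than the threshold
    )
    return covered / genome_length_bp


# Toy ROH calls for one individual on a hypothetical 100 Mb genome.
segments = [
    (1_000_000, 4_000_000),     # 3.0 Mb, kept
    (10_000_000, 10_800_000),   # 0.8 Mb, below the 1,500 kb threshold
    (50_000_000, 52_000_000),   # 2.0 Mb, kept
]
estimate = f_roh(segments, genome_length_bp=100_000_000)
print(f"{estimate:.3f}")  # prints 0.050
```

The threshold matters because short ROHs can arise from LD rather than recent inbreeding, which is consistent with the abstract's remark that ROH estimates are best when the correct length threshold is known.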
Document type:
Journal article
Human Heredity, Karger, 2014, 77, pp. 49-62. 〈10.1159/000358224〉

Cited literature: 50 references

http://www.hal.inserm.fr/inserm-01084018
Contributor: Steven Gazal
Submitted on: Tuesday, 18 November 2014 - 13:06:24
Last modified on: Tuesday, 30 January 2018 - 10:42:01
Document(s) archived on: Thursday, 19 February 2015 - 11:30:33

File

Restricted access
File visible on: never

Citation

Steven Gazal, Mourad Sahbatou, Hervé Perdry, Sébastien Letort, Emmanuelle Génin, et al. Inbreeding Coefficient Estimation with Dense SNP Data: Comparison of Strategies and Application to HapMap III. Human Heredity, Karger, 2014, 77, pp. 49-62. 〈10.1159/000358224〉. 〈inserm-01084018〉
