Effective normalization for copy number variation in Hi-C data

Abstract : Background: Normalization is essential to ensure accurate analysis and proper interpretation of sequencing data, and chromosome conformation capture data such as Hi-C pose particular challenges. Although several methods have been proposed, the most widely used type of Hi-C normalization casts the estimation of unwanted effects as a matrix balancing problem, relying on the assumption that all genomic regions interact equally with each other. Results: To explore the effect of copy-number variations on Hi-C data normalization, we first propose a simulation model that predicts the effects of large copy-number changes on a diploid Hi-C contact map. We then show that the standard approaches relying on equal visibility fail to correct for unwanted effects in the presence of copy-number variations. We thus propose a simple extension to matrix balancing methods that models these effects. Our approach can either retain the copy-number variation effects (LOIC) or remove them (CAIC). We show that this leads to better downstream analysis of the three-dimensional organization of rearranged genomes. Conclusions: Taken together, our results highlight the importance of using dedicated methods for the analysis of Hi-C cancer data. Both the CAIC and LOIC methods perform well on simulated and real Hi-C data sets, each fulfilling different needs.
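For readers unfamiliar with the matrix balancing formulation mentioned in the abstract, the following is a minimal sketch of generic symmetric balancing under the equal-visibility assumption. It is not the authors' LOIC or CAIC method and not any particular library's implementation. The function name `balance_matrix`, the square-root damping of the update, the toy contact matrix, and the bias values are all assumptions made for illustration, and the sketch assumes every bin has a non-zero row sum.

```python
import numpy as np


def balance_matrix(counts, max_iter=500, tol=1e-10):
    """Generic symmetric matrix balancing (illustrative sketch).

    Finds a per-bin bias vector such that
        counts == normalized * outer(bias, bias)
    and every row of `normalized` has the same sum, i.e. all bins are
    treated as equally visible. Assumes no row of `counts` sums to zero.
    """
    counts = np.asarray(counts, dtype=float)
    bias = np.ones(counts.shape[0])
    normalized = counts.copy()
    for _ in range(max_iter):
        row_sums = normalized.sum(axis=1)
        # Relative visibility of each bin; all ones once the rows are balanced.
        rel = row_sums / row_sums.mean()
        if np.max(np.abs(rel - 1.0)) < tol:
            break
        # Damped update (square root): an illustrative choice for stability.
        bias *= np.sqrt(rel)
        normalized = counts / np.outer(bias, bias)
    return normalized, bias


# Toy example (hypothetical data): distance-decaying contacts with a
# multiplicative per-bin bias applied to rows and columns.
idx = np.arange(6)
true_contacts = 1.0 / (1.0 + np.abs(idx[:, None] - idx[None, :]))
true_bias = np.array([0.5, 1.0, 2.0, 1.5, 0.8, 1.2])
raw = true_contacts * np.outer(true_bias, true_bias)

normalized, bias = balance_matrix(raw)
print(normalized.sum(axis=1))  # row sums are equal after balancing
```

Because the procedure forces every bin to the same total, a genuine difference in coverage between regions, such as one caused by a copy-number change, is absorbed into the bias vector together with the technical effects. This is the equal-visibility assumption that the abstract reports as failing for rearranged genomes.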

https://www.hal.inserm.fr/inserm-02059124
Contributor : Myriam Bodescot
Submitted on : Wednesday, March 6, 2019 - 1:37:55 PM
Last modification on : Wednesday, August 7, 2019 - 12:19:23 PM
Long-term archiving on : Friday, June 7, 2019 - 10:59:38 PM

File

12859_2018_Article_2256.pdf
Publisher files allowed on an open archive

Citation

Nicolas Servant, Nelle Varoquaux, Edith Heard, Emmanuel Barillot, Jean-Philippe Vert. Effective normalization for copy number variation in Hi-C data. BMC Bioinformatics, BioMed Central, 2018, 19 (1), pp.313. ⟨10.1186/s12859-018-2256-5⟩. ⟨inserm-02059124⟩
