PEGR: a management platform for ChIP-based next generation sequencing pipelines
Pages 285 - 292
Abstract
Genome sequencing has advanced rapidly, with developments including high-throughput next-generation sequencing (NGS) technologies, automation of biological experiments, new bioinformatics tools, and the use of high-performance and cloud computing. ChIP-based NGS technologies, e.g. ChIP-seq and ChIP-exo, are widely used to detect the binding sites of DNA-interacting proteins in the genome and to deepen our mechanistic understanding of genomic regulation. As ChIP-based NGS pipelines generate sequencing data at an unprecedented pace, there is an urgent need for a metadata management system. To meet this need, we developed the Platform for Eukaryotic Genomic Regulation (PEGR), a web service platform that logs metadata for samples and sequencing experiments, manages the data processing workflows, and provides reporting and visualization. PEGR links together people, samples, protocols, DNA sequencers, and bioinformatics computation. With PEGR, scientists can gain a more integrated view of the sequencing data and a better understanding of the mechanisms of genomic regulation. In this paper, we present the architecture and major functionalities of PEGR, share our experience in developing the application, and discuss future directions.
Published In
July 2020
556 pages
ISBN: 9781450366892
DOI: 10.1145/3311790
Copyright © 2020 ACM.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
Published: 26 July 2020
Qualifiers
- Research-article
- Research
- Refereed limited
Conference
PEARC '20: Practice and Experience in Advanced Research Computing
July 26 - 30, 2020
Portland, OR, USA
Acceptance Rates
Overall Acceptance Rate 133 of 202 submissions, 66%