DOI: 10.1145/3311790.3396621
Research article

PEGR: a management platform for ChIP-based next generation sequencing pipelines

Published: 26 July 2020

Abstract

Genome sequencing has advanced rapidly, driven by high-throughput next generation sequencing (NGS) technologies, automation of biological experiments, new bioinformatics tools, and the use of high-performance and cloud computing. ChIP-based NGS technologies, e.g. ChIP-seq and ChIP-exo, are widely used to detect the binding sites of DNA-interacting proteins in the genome and provide a deeper mechanistic understanding of genomic regulation. As ChIP-based NGS pipelines generate sequencing data at an unprecedented pace, there is an urgent need for a metadata management system. To meet this need, we developed the Platform for Eukaryotic Genomic Regulation (PEGR), a web service platform that logs metadata for samples and sequencing experiments, manages the data processing workflows, and provides reporting and visualization. PEGR links together people, samples, protocols, DNA sequencers, and bioinformatics computation. With PEGR, scientists can form a more integrated view of their sequencing data and better understand the mechanisms of genomic regulation. In this paper, we present the architecture and major functionalities of PEGR, share our experience in developing the application, and discuss future directions.
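As a rough illustration of what "linking together people, samples, protocols, DNA sequencers and bioinformatics computation" could look like as a data structure, here is a minimal Python sketch. It is not PEGR's actual schema or code: the class names, field names, and the helper `samples_by_protocol` are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Sample:
    """One biological sample and the metadata recorded for it (hypothetical fields)."""
    sample_id: str
    submitted_by: str   # the person who prepared or submitted the sample
    protocol: str       # assay protocol, e.g. "ChIP-seq" or "ChIP-exo"
    target: str         # the DNA-interacting protein being assayed


@dataclass
class SequencingRun:
    """One run on a sequencer, tied to its samples and downstream analyses."""
    run_id: str
    sequencer: str                                       # instrument name
    samples: List[Sample] = field(default_factory=list)
    analyses: List[str] = field(default_factory=list)    # workflow identifiers


def samples_by_protocol(run: SequencingRun, protocol: str) -> List[str]:
    """Return the IDs of the samples in a run that used the given protocol."""
    return [s.sample_id for s in run.samples if s.protocol == protocol]


# Usage: one run that links two samples, an instrument, and a workflow.
run = SequencingRun(
    run_id="R1",
    sequencer="sequencer-A",
    samples=[
        Sample("S1", "alice", "ChIP-exo", "protein-X"),
        Sample("S2", "bob", "ChIP-seq", "protein-Y"),
    ],
    analyses=["workflow-1"],
)
exo_samples = samples_by_protocol(run, "ChIP-exo")
```

Keeping the sample records inside the run record means a single lookup answers questions such as "which samples on this run used ChIP-exo, and who submitted them?", which is the kind of cross-referencing a metadata management system is meant to support.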

Published In

PEARC '20: Practice and Experience in Advanced Research Computing 2020: Catch the Wave
July 2020
556 pages
ISBN: 9781450366892
DOI: 10.1145/3311790
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. next generation sequencing pipeline
  2. science gateway
  3. web application

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

PEARC '20

Acceptance Rates

Overall Acceptance Rate 133 of 202 submissions, 66%
