An alignment algorithm for bisulfite sequencing using the Applied Biosystems SOLiD System
- PMID: 20562417
- PMCID: PMC2905549
- DOI: 10.1093/bioinformatics/btq291
Abstract
Summary: Bisulfite sequencing allows cytosine methylation, an important epigenetic marker, to be detected via nucleotide substitutions. Since the Applied Biosystems SOLiD System uses a unique di-base encoding that increases confidence in the detection of nucleotide substitutions, it is a potentially advantageous platform for this application. However, the di-base encoding also makes reads with many nucleotide substitutions difficult to align to a reference sequence with existing tools, preventing the platform's potential utility for bisulfite sequencing from being realized. Here, we present SOCS-B, a reference-based, un-gapped alignment algorithm for the SOLiD System that is tolerant of both bisulfite-induced nucleotide substitutions and a parametric number of sequencing errors, facilitating bisulfite sequencing on this platform. An implementation of the algorithm has been integrated with the previously reported SOCS alignment tool, and was used to align CpG methylation-enriched Arabidopsis thaliana bisulfite sequence data, exhibiting a 2-fold increase in sensitivity compared to existing methods for aligning SOLiD bisulfite data.
Availability: Executables, source code, and sample data are available at http://solidsoftwaretools.com/gf/project/socs/
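The summary's point about di-base encoding can be illustrated with a small sketch. It is not the SOCS-B algorithm. It assumes the commonly described SOLiD color code, in which each color is the XOR of the 2-bit codes of two adjacent bases (A=0, C=1, G=2, T=3). The function names `to_colors`, `bisulfite_convert`, and `color_mismatches`, and the example sequence, are hypothetical and chosen only for illustration.

```python
# Assumed 2-bit base codes; color = XOR of adjacent base codes (di-base encoding).
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def to_colors(seq):
    """Translate a nucleotide string into its list of di-base colors (0-3)."""
    return [CODE[a] ^ CODE[b] for a, b in zip(seq, seq[1:])]


def bisulfite_convert(seq, methylated=frozenset()):
    """Simulate bisulfite treatment: every C becomes T unless its index is methylated."""
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def color_mismatches(a, b):
    """Count positions where the color encodings of two equal-length sequences differ."""
    return sum(x != y for x, y in zip(to_colors(a), to_colors(b)))


reference = "ACGTCA"                      # hypothetical reference fragment
converted = bisulfite_convert(reference)  # both cytosines unmethylated

print(to_colors(reference))                    # [1, 3, 1, 2, 1]
print(converted)                               # ATGTTA
print(to_colors(converted))                    # [3, 1, 1, 0, 3]
print(color_mismatches(reference, converted))  # 4
```

In this toy case, two C-to-T substitutions alter four of the five colors, because each isolated substitution changes both colors that overlap the affected base. This is why a color-space aligner that only tolerates a small number of mismatches struggles with bisulfite-converted reads.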