Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2009 Jul 27:10:232.
doi: 10.1186/1471-2105-10-232.

BSMAP: whole genome bisulfite sequence MAPping program

Affiliations

BSMAP: whole genome bisulfite sequence MAPping program

Yuanxin Xi et al. BMC Bioinformatics. .

Abstract

Background: Bisulfite sequencing is a powerful technique to study DNA cytosine methylation. Bisulfite treatment followed by PCR amplification specifically converts unmethylated cytosines to thymine. Coupled with next generation sequencing technology, it is able to detect the methylation status of every cytosine in the genome. However, mapping high-throughput bisulfite reads to the reference genome remains a great challenge due to the increased searching space, reduced complexity of bisulfite sequence, asymmetric cytosine to thymine alignments, and multiple CpG heterogeneous methylation.

Results: We developed an efficient bisulfite reads mapping algorithm BSMAP to address the above issues. BSMAP combines genome hashing and bitwise masking to achieve fast and accurate bisulfite mapping. Compared with existing bisulfite mapping approaches, BSMAP is faster, more sensitive and more flexible.

Conclusion: BSMAP is the first general-purpose bisulfite mapping software. It is able to map high-throughput bisulfite reads at whole genome level with feasible memory and CPU usage. It is freely available under GPL v3 license at http://code.google.com/p/bsmap/.

PubMed Disclaimer

Figures

Figure 1
Figure 1
Pipeline of bisulfite sequencing. 1) Denaturation: separating Watson and Crick strands; 2) Bisulfite treatment: converting un-methylated cytosines (blue) to uracils; methylated cytosines (red) remain unchanged; 3) PCR amplification of bisulfite-treated sequences resulting in four distinct strands: Bisulfite Watson (BSW), bisulfite Crick (BSC), reverse complement of BSW (BSWR), and reverse complement of BSC (BSCR).
Figure 2
Figure 2
Mapping of bisulfite reads. 1) Increased search space due to the cytosine-thymine conversion in the bisulfite treatment. 2) Mapping asymmetry: thymines in bisulfite reads can be aligned with cytosines in the reference (illustrated in blue) but not the reverse.
Figure 3
Figure 3
BSMAP algorithm. A) Bisulfite seed table, using the original seed and bisulfite variants as keys and corresponding coordinates in the reference genome as values. Each read was looked up in the seed table for potential mapping positions. B) A positional specific mask of the corresponding reference sequence was generated by setting 01 to C(light blue) and 11 to A, G, T(black). The original read was masked by a bitwise AND operation with the positional specific mask. C) The reference sequence and the masked read were compared with a bitwise XOR operation. Non-zero XOR results were counted as mismatches (red). Bisulfite alignment is marked in green.
Figure 4
Figure 4
Mapping bisulfite reads to 4 possible bisulfite strands (BSW/BSWR/BSC/BSCR) is equivalent to mapping the bisulfite read and its reverse complementary read to both Watson/Crick strands of the original reference sequence.

Similar articles

Cited by

References

    1. Frommer M, McDonald LE, Millar DS, Collis CM, Watt F, Grigg GW, Molloy PL, Paul CL. A genomic sequencing protocol that yields a positive display of 5-methylcytosine residues in individual DNA strands. Proc Natl Acad Sci USA. 1992;89:1827. doi: 10.1073/pnas.89.5.1827. - DOI - PMC - PubMed
    1. Meissner A, Mikkelsen TS, Gu H, Wernig M, Hanna J, Sivachenko A, Zhang X, Bernstein BE, Nusbaum C, Jaffe DB, et al. Genome-scale DNA methylation maps of pluripotent and differentiated cells. Nature. 2008;454:766. - PMC - PubMed
    1. Lister R, O'Malley RC, Tonti-Filippini J, Gregory BD, Berry CC, Millar AH, Ecker JR. Highly Integrated Single-Base Resolution Maps of the Epigenome in Arabidopsis. Cell. 2008;133:523–36. doi: 10.1016/j.cell.2008.03.029. - DOI - PMC - PubMed
    1. Cokus SJ, Feng S, Zhang X, Chen Z, Merriman B, Haudenschild CD, Pradhan S, Nelson SF, Pellegrini M, Jacobsen SE. Shotgun bisulphite sequencing of the Arabidopsis genome reveals DNA methylation patterning. Nature. 2008;452:215. doi: 10.1038/nature06745. - DOI - PMC - PubMed
    1. Li R, Li Y, Kristiansen K, Wang J. SOAP: short oligonucleotide alignment program. Bioinformatics. 2008;24:713. doi: 10.1093/bioinformatics/btn025. - DOI - PubMed

Publication types

LinkOut - more resources