BMC Bioinformatics. 2010 May 6;11:230.
doi: 10.1186/1471-2105-11-230.

BISMA--fast and accurate bisulfite sequencing data analysis of individual clones from unique and repetitive sequences

Christian Rohde et al. BMC Bioinformatics. 2010.

Abstract

Background: Bisulfite sequencing is a popular method to analyze DNA methylation patterns at high resolution. A region of interest is amplified by PCR, and about 20-50 subcloned DNA molecules are usually analyzed to determine the methylation status at single-CpG-site and single-molecule resolution.

Results: The BISMA (Bisulfite Sequencing DNA Methylation Analysis) software for analysis of primary bisulfite sequencing data implements sequencing data extraction and enhanced data processing, quality controls, analysis and presentation of the methylation state. It uses an improved strategy for detection of clonal molecules and accurate CpG site detection and, for the first time, supports the analysis of repetitive sequences.

Conclusions: BISMA is highly automated but still gives the user full control over all steps of the analysis. The BISMA software is freely available as an online tool for academic purposes for the analysis of bisulfite sequencing data from both unique and repetitive sequences at http://biochem.jacobs-university.de/BDPC/BISMA/.

Figures

Figure 1
Workflow of the BISMA software and summary of its result files. A) Uploading the reference sequence and the bisulfite sequencing data. B) Analysis of the sequencing data by BISMA using the user-defined thresholds. Sequences that do not pass the user-defined thresholds are removed. C) Visualization of the alignment of all included sequences. Sequences that pass the filtering for clonal sequences are pre-selected for inclusion in the later analysis. D) Analysis of the methylation pattern in the user-selected dataset. E) All result files can be downloaded in one ZIP file containing: 1) The sequence alignment in which the methylation pattern is highlighted. 2) A graphical representation of the methylation pattern in the context of the CpG distribution in the reference sequence. Each DNA sequence is represented by a line and each CpG site by a box. 3) A condensed graphical representation of the methylation pattern. Each row corresponds to one DNA sequence and each column to one CpG site. 4) A graphical representation of the average methylation at each CpG site. 5) The methylation statistics, including the methylation level observed over all sequences and the number of CpG sites that were found to be informative. 6) The methylation levels of the individual sequences.
Figure 2
Improved algorithm for CpG methylation state determination as implemented in the BISMA software. BISMA detects CpG sites in the reference sequence, which is aligned to the experimental sequences. After identification of a guanine at the position aligned to the reference CpG's guanine, the next base in the 5' direction is used to determine the methylation state. Methylation state "unknown" refers to sites lacking clear methylation information due to mutations or sequencing problems.
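
The decision rule described in this caption can be sketched in Python. This is an illustration only, not BISMA's code: the function name `cpg_states`, the pairwise gapped-alignment input (`-` for gaps), the way gaps are skipped when stepping in the 5' direction, and the mapping C = methylated, T = unmethylated (the usual reading of bisulfite-converted top-strand data) are all assumptions made for this example.

```python
from typing import List


def cpg_states(ref_aln: str, seq_aln: str) -> List[str]:
    """Classify every reference CpG as methylated, unmethylated or unknown.

    Both arguments are rows of one pairwise alignment (same length, '-' = gap).
    Hypothetical helper for illustration; not part of BISMA.
    """
    if len(ref_aln) != len(seq_aln):
        raise ValueError("aligned sequences must have equal length")
    ref, seq = ref_aln.upper(), seq_aln.upper()

    # Alignment columns that carry a reference base (reference gaps skipped).
    ref_cols = [i for i, base in enumerate(ref) if base != "-"]

    states = []
    for k in range(len(ref_cols) - 1):
        c_col, g_col = ref_cols[k], ref_cols[k + 1]
        if ref[c_col] != "C" or ref[g_col] != "G":
            continue  # not a CpG in the reference

        # Step 1: the read must show a guanine where the reference CpG has its G.
        if seq[g_col] != "G":
            states.append("unknown")
            continue

        # Step 2: take the next read base in the 5' direction (assumption:
        # gap columns in the read are skipped).
        j = g_col - 1
        while j >= 0 and seq[j] == "-":
            j -= 1
        base = seq[j] if j >= 0 else ""

        # C survived bisulfite treatment -> methylated; T -> unmethylated.
        states.append({"C": "methylated", "T": "unmethylated"}.get(base, "unknown"))
    return states


example = cpg_states("ACGTTCGACCGA", "ACGTTTGATTAA")
```

In this toy alignment the first CpG keeps its C, the second reads as TG, and the third has lost its guanine, so the three sites are called methylated, unmethylated and unknown, respectively.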
Figure 3
Improved algorithm for removal of clonal sequences of the BISMA software. For illustration, a data set obtained in the DNA methylation analysis of the mouse Xist promoter of a female animal was used, where 50% fully methylated and 50% unmethylated clones are expected. A) Simplified example DNA sequence alignment of bisulfite sequencing data for demonstration of the filtering algorithm. Cytosines in the reference sequence on top of the alignment are indicated in bold green. For the aligned experimental sequences, methylated CpG sites are highlighted in bold orange, while unmethylated CpG sites are shown in bold purple. Converted cytosines at non-CpG positions are shown in bold black, while conversion artifacts are indicated in bold green. B) The relevant cytosine pattern derived from the multiple sequence alignment includes information about the methylation status of CpG sites and the conversion status of non-CpG cytosines. 1) With the strict option, BISMA keeps only one sequence out of several with identical patterns. 2) The BISMA-suggested filtering algorithm removes clones with identical patterns only if they share a conversion artifact at the same position. C) Final methylation pattern obtained after using 1) the strict filtering algorithm or 2) the BISMA-suggested filtering algorithm: each square indicates a CpG. Columns represent CpG sites, while rows represent single molecules that were subcloned and sequenced. The underlying full DNA sequence alignment of all bisulfite sequencing data is available in Additional file 1: Suppl. Text S9.
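
The two filtering options in panel B can be contrasted with a short Python sketch. It is an interpretation of the caption, not BISMA's implementation: the `Clone` record, the `M`/`U` pattern string, the artifact position set and both function names are hypothetical, and the suggested filter is read here as "drop a repeat of an identical pattern only when that pattern contains at least one conversion artifact".

```python
from typing import FrozenSet, List, NamedTuple


class Clone(NamedTuple):
    """Hypothetical record for one sequenced clone (illustration only)."""
    name: str
    cpg: str                   # one character per CpG: 'M' methylated, 'U' unmethylated
    artifacts: FrozenSet[int]  # positions of unconverted non-CpG cytosines


def filter_strict(clones: List[Clone]) -> List[Clone]:
    """Keep only the first clone of every identical cytosine pattern."""
    seen, kept = set(), []
    for clone in clones:
        key = (clone.cpg, clone.artifacts)
        if key not in seen:
            seen.add(key)
            kept.append(clone)
    return kept


def filter_suggested(clones: List[Clone]) -> List[Clone]:
    """Drop a repeated pattern only if it carries a shared conversion artifact."""
    seen, kept = set(), []
    for clone in clones:
        key = (clone.cpg, clone.artifacts)
        if clone.artifacts and key in seen:
            continue  # same pattern, same artifact positions: treated as clonal
        seen.add(key)
        kept.append(clone)
    return kept


clones = [
    Clone("a", "MMMM", frozenset()),
    Clone("b", "MMMM", frozenset()),
    Clone("c", "UUUU", frozenset()),
    Clone("d", "UUUU", frozenset({12})),
    Clone("e", "UUUU", frozenset({12})),
    Clone("f", "UUUU", frozenset()),
]
strict_names = [c.name for c in filter_strict(clones)]
suggested_names = [c.name for c in filter_suggested(clones)]
```

With these six toy clones the strict filter keeps a, c and d, whereas the suggested filter discards only e. In a locus such as Xist, where many independent molecules are expected to share a fully methylated or fully unmethylated pattern, the strict option would therefore collapse most of the data, while the artifact-based rule retains it.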
Figure 4
Analysis of global Alu methylation using the BISMA software for repetitive sequences. All methylated cytosines in a CpG context are presented and used for calculation. A) Graphical representation of the methylated CpG sites in all sequences of the Alu PCR products at the observed position in the alignment. Each line corresponds to one sequence. Methylated CpGs found at the consensus positions are represented by black boxes, while those at other positions are indicated by red boxes. The sequences are sorted according to the total number of methylated CpGs. B) Frequency of methylated CpG sites in the sequences from individual clones in different samples. C) Frequency of methylated CpGs per 100 bp in all clones in different samples.
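
The quantities plotted in panels B and C amount to simple counts, sketched below in Python. The sketch assumes complete bisulfite conversion and top-strand reads, so that any CG dinucleotide remaining in a read marks a methylated CpG. It also assumes that the per-100-bp value is pooled over all clones of a sample. The function names are hypothetical and the reads are invented toy data, not results from the paper.

```python
from typing import List


def methylated_cpg_count(read: str) -> int:
    """Count CG dinucleotides left in a bisulfite-converted read."""
    return read.upper().count("CG")


def methylated_cpg_per_100bp(reads: List[str]) -> float:
    """Pooled number of methylated CpGs per 100 sequenced bases."""
    total_bp = sum(len(read) for read in reads)
    if total_bp == 0:
        raise ValueError("no sequence data")
    total_cpg = sum(methylated_cpg_count(read) for read in reads)
    return 100.0 * total_cpg / total_bp


toy_reads = ["TTCGATTTCG", "TTTGATTTTG"]                     # invented example reads
per_clone = [methylated_cpg_count(r) for r in toy_reads]      # panel B style counts
pooled = methylated_cpg_per_100bp(toy_reads)                  # panel C style value
```

Here the first toy read retains two CG dinucleotides and the second none, giving 2 methylated CpGs in 20 bp, or 10 per 100 bp.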
Figure 5
Comparison of important features of three different programs for analysis of bisulfite sequencing data in a CpG context.
Figure 6
Average analysis time necessary to process and analyze the example datasets with the BiQ Analyzer, QUMA and BISMA programs. The bars indicate the lowest and the highest analysis time which was measured.
Figure 7
Integrated bisulfite DNA methylation analysis platform consisting of the BISMA primary sequencing data analysis software and the BDPC compilation and clustering programs. The result HTML files from BISMA for unique sequences can be further analyzed and displayed with the BDPC compilation software, which will provide an overview table. This table can be used for further result presentations with the BDPC clustering software.
