Anal Chem. 2017 May 16;89(10):5467-5475.
doi: 10.1021/acs.analchem.7b00380. Epub 2017 May 2.

Top-Down Proteomics of Large Proteins up to 223 kDa Enabled by Serial Size Exclusion Chromatography Strategy

Wenxuan Cai et al. Anal Chem. 2017.

Abstract

Mass spectrometry (MS)-based top-down proteomics is a powerful method for the comprehensive analysis of proteoforms that arise from genetic variations and post-translational modifications (PTMs). However, top-down MS analysis of high molecular weight (MW) proteins remains challenging mainly due to the exponential decay of signal-to-noise ratio with increasing MW. Size exclusion chromatography (SEC) is a favored method for size-based separation of biomacromolecules but typically suffers from low resolution. Herein, we developed a serial size exclusion chromatography (sSEC) strategy to enable high-resolution size-based fractionation of intact proteins (10-223 kDa) from complex protein mixtures. The sSEC fractions could be further separated by reverse phase chromatography (RPC) coupled online with high-resolution MS. We have shown that two-dimensional (2D) sSEC-RPC allowed for the identification of 4044 more unique proteoforms and a 15-fold increase in the detection of proteins above 60 kDa, compared to one-dimensional (1D) RPC. Notably, effective sSEC-RPC separation of proteins significantly enhanced the detection of high MW proteins up to 223 kDa and also revealed low abundance proteoforms that are post-translationally modified. This sSEC method is MS-friendly, robust, and reproducible and, thus, can be applied to both high-efficiency protein purification and large-scale proteomics analysis of cell or tissue lysate for enhanced proteome coverage, particularly for low abundance and high MW proteoforms.

Figures

Figure 1. High-resolution sSEC separation of complex protein mixture
Comparison of A) SEC (500 Å), B) 2sSEC (1000 Å – 500 Å), and C) 3sSEC (1000 Å – 500 Å – 500 Å) for the fractionation of sarcomeric protein extract. Top panel: representative UV chromatogram of each experiment with the collected fractions (1–10 for SEC and 2sSEC, 2*–12 for 3sSEC) annotated. Bottom panel: SDS-PAGE analysis of the corresponding SEC, 2sSEC, or 3sSEC fractions collected and pooled from two technical replicates. Red and blue marks to the right of each gel represent molecular weight markers (250, 150, 100, 75, 50, 37, and 25 kDa from top to bottom). LM, loading mixture prior to SEC/sSEC fractionation. The lane corresponding to Fraction 2* in C) shows equal volumes of Fractions 1 and 2, which were combined for RPC-MS analysis because of their similar protein content.
Figure 2. Comparison of 1D RPC-MS and 2D 3sSEC-RPC-MS for the top-down analysis of a complex protein mixture
A) Total ion chromatogram (TIC) of 1D RPC-MS analysis for 10 μg sarcomeric protein extract. The most abundant sarcomeric proteins are highlighted (pink, cTnT; green, cTnI; orange, MLC1; brown, MLC2; blue, actin; purple, cTnC) and their raw mass spectra are shown. The MW of each species is indicated in red font. B) High-resolution deconvoluted mass spectra of the sarcomeric proteins highlighted in the 1D RPC-MS TIC. The major proteoforms of each protein species are annotated. Red italic p and pp represent mono-phosphorylation and bis-phosphorylation, respectively. C) TICs of 2D sSEC-RPC-MS analysis coupling sSEC fractionation with RPC-MS. Highlighted peaks correspond to the peaks of the abundant sarcomeric proteins highlighted in the 1D TIC. (Note: The highlights indicate the fractions in which each protein is most abundant; they do not indicate that the species is present exclusively in those fractions.) D) Venn diagrams showing the numbers of total proteoforms and of proteoforms with MW > 60 kDa detected in 1D versus 2D analysis.
Figure 3. sSEC fractionation enabled detection of high MW proteins
A) TICs of RPC-MS for sSEC Fractions 1–6. Highlighted regions of the TICs represent retention windows for the corresponding high MW proteins. B) Top-down mass spectra for a 223.1 kDa protein and a 140.8 kDa protein, with zoom-in views of the charge states and the corresponding deconvoluted spectra. The deconvoluted spectrum of the 140.8 kDa protein shows multiple proteoforms. C) Top-down mass spectra and the deconvoluted spectra of proteins with MW 116.4 kDa, 80.9 kDa, 65.2 kDa, 72.3 kDa, 69.6 kDa, 62.7 kDa, and 53.5 kDa.
Figure 4. Coeluted high MW and low MW proteins were separated by sSEC, permitting top-down MS analysis
A) TIC of 1D RPC-MS analysis (black trace) for the whole sarcomeric protein mixture aligned with TICs of RPC-MS analysis for sSEC Fractions 5 and 6 (red and purple traces, respectively) of the same protein extract. B) The corresponding top-down mass spectra of the proteins eluted between 28.0 and 28.5 min. Protein species with MW 116.4 kDa and 47.6 kDa were revealed in sSEC Fractions 5 and 6, respectively. In 1D RPC-MS analysis these species coeluted with smaller proteins (13.4 kDa and 7.8 kDa) and remained undetected. Shown to the right of the top-down mass spectra are the corresponding deconvoluted spectra of the major proteins detected in 1D RPC-MS analysis (13.4 kDa and 7.8 kDa proteins) and in RPC-MS analysis of sSEC Fractions 5 and 6 (116.4 kDa and 47.6 kDa proteins, respectively). The deconvoluted mass spectrum of the 116.4 kDa protein (resolving power 10000) revealed complex proteoforms, each separated by an 80 Da mass shift.
Figure 5. 3sSEC fractionation enabled top-down MS analysis of low abundance protein PTMs
A) TIC of 1D RPC-MS analysis (black trace) for the whole sarcomeric protein mixture aligned with the TIC of RPC-MS analysis for sSEC Fraction 5 of the same protein extract. The corresponding top-down mass spectra of the proteins eluted between 28.5 and 29 min are shown in the right panel. B) Zoom-in views of the top-down mass spectra and the corresponding deconvoluted spectra of the high MW proteins detected in sSEC Fraction 5. The charge states of the 65.2 kDa protein were detected in the 1D analysis, but its low abundance PTMs were not revealed. sSEC allowed for the detection of the low abundance post-translationally modified forms of the 65.2 kDa protein, as well as another protein of 72.3 kDa. Deconvoluted mass spectra of the 65.2 kDa and 72.3 kDa proteins (resolving power 10000) revealed low abundance PTMs, each with an 80 Da shift. * denotes mono-phosphorylated proteoform; ** denotes bis-phosphorylated proteoform.
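The 80 Da shifts noted in the captions for Figures 4 and 5 are consistent with phosphorylation, which adds one HPO3 group (average mass about 79.98 Da). The sketch below illustrates how such a shift would appear in a raw electrospray spectrum, where an ion carrying z protons is observed at m/z = (M + z × 1.007276)/z. The base mass of 65,200 Da is the nominal value from the caption, and the charge states are arbitrary illustrations; neither the exact mass nor the observed charge states are given in the source.

```python
PROTON_MASS = 1.007276   # Da, mass of one proton
PHOSPHO_SHIFT = 79.98    # Da, average mass added by one HPO3 group

def mz(neutral_mass, charge):
    """Return m/z of the [M + zH]^z+ ion for a neutral mass in Da."""
    return (neutral_mass + charge * PROTON_MASS) / charge

def phospho_spacing(charge):
    """Return the m/z spacing between proteoforms differing by one phosphate."""
    return PHOSPHO_SHIFT / charge

if __name__ == "__main__":
    base_mass = 65200.0  # nominal mass from the caption (illustrative only)
    for z in (40, 50, 60):  # hypothetical charge states
        unmodified = mz(base_mass, z)
        mono = mz(base_mass + PHOSPHO_SHIFT, z)
        bis = mz(base_mass + 2 * PHOSPHO_SHIFT, z)
        print(f"z={z}: {unmodified:.3f}, {mono:.3f}, {bis:.3f} "
              f"(spacing {phospho_spacing(z):.3f})")
```

Because the spacing shrinks as 1/z, the phosphorylated proteoforms of a large protein sit only about 1 to 2 m/z units apart at these charge states, which may help explain why low abundance forms are easier to see once coeluting species are removed.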
Figure 6. Top-down targeted MS/MS for protein identification in the sSEC fraction
A) TIC of the first MS experiment (Run 1, blue trace) aligned with TICs of the second experiment (Run 2, purple and orange traces) for targeted MS/MS analysis (purple trace) of the proteins of interest in defined time segments. For the remaining time segments, in which no proteins of interest were found, MS data were collected (orange trace). Detailed information on the molecular weights and retention windows of the individual proteins analyzed by MS/MS is given in Table S3. B) High-resolution mass spectrum and the deconvoluted spectrum (80,000 resolving power) for the 42.9 kDa protein eluted between 33.5 and 34.5 min in the first MS experiment. Precursor ions of the 42.9 kDa protein were fragmented by CAD in Run 2, yielding a complex high-resolution tandem mass spectrum. Insets show zoom-in views of the isotopically resolved fragment ions. C) The 42.9 kDa protein was identified as creatine kinase M-type (CKM) with high confidence (p-value: 2.4E-24). An average of 20 tandem mass spectra yielded 23 b ions and 36 y ions from CKM. D) Representative b and y fragment ions of CKM with high mass accuracy.
