PLoS Genet. 2011 Jun;7(6):e1001393.
doi: 10.1371/journal.pgen.1001393. Epub 2011 Jun 9.

Comparative analysis of proteome and transcriptome variation in mouse

Anatole Ghazalpour et al. PLoS Genet. 2011 Jun.

Abstract

The relationships between the levels of transcripts and the levels of the proteins they encode have not been examined comprehensively in mammals, although previous work in plants and yeast suggests a surprisingly modest correlation. We have examined this issue using a genetic approach in which natural variations were used to perturb both transcript levels and protein levels among inbred strains of mice. We quantified over 5,000 peptides and over 22,000 transcripts in livers of 97 inbred and recombinant inbred strains and focused on the 7,185 most heritable transcripts and 486 most reliable proteins. The transcript levels were quantified by microarray analysis in three replicates, and the proteins were quantified by Liquid Chromatography-Mass Spectrometry using an O(18)-reference-based isotope labeling approach. We show that the levels of transcripts and proteins correlate significantly for only about half of the genes tested, with an average correlation of 0.27, and that the correlations of transcripts and proteins varied depending on the cellular location and biological function of the gene. We examined technical and biological factors that could contribute to the modest correlation. For example, differential splicing clearly affects the analyses for certain genes; but, based on deep sequencing, this does not substantially contribute to the overall estimate of the correlation. We also employed genome-wide association analyses to map loci controlling both transcript and protein levels. Surprisingly, little overlap was observed between the protein- and transcript-mapped loci. We have typed numerous clinically relevant traits among the strains, including adiposity, lipoprotein levels, and tissue parameters. Using correlation analysis, we found that few clinical trait relationships are preserved between the protein and mRNA gene products and that the majority of such relationships are specific to either the protein levels or transcript levels. Surprisingly, transcript levels were more strongly correlated with clinical traits than protein levels. In light of the widespread use of high-throughput technologies in both clinical and basic research, the results presented have practical as well as basic implications.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. A schematic representation of the experimental design.
97 inbred and recombinant inbred strains in the HMDP panel were utilized to study the relationships between transcripts, proteins, and clinical traits. The relationships between proteins and transcripts were assessed at the biological level by the overall correlation across datasets, and at the genetic level by comparing the genome-wide association profiles of the two datasets. The biological relationship between the transcripts and proteins was also assessed in the context of the physiological phenotypes by relating these two datasets to the 42 clinical traits measured in the HMDP panel.
Figure 2. Proteome and transcriptome data quality.
A) Reliability of peptide measurement in LC-MS. The distribution of variance among the technical replicates in the LC-MS data (grey plot) and in the HMDP population (blue plot). B) The frequency of peptides at varying “signal to noise” ratios. C) Distribution of heritability (fraction of total variance attributed to genetics) in the transcript dataset. The dashed line depicts the significant heritability estimates (p-value<0.05). D) Comparison of Affymetrix data with the Next Generation Sequencing data. E) Number of peptides per gene in the filtered peptide dataset.
Figure 3. Relationships between protein levels and transcript levels.
A) Histogram of correlation coefficients computed between peptides and probesets representing the same gene. The median correlation coefficient is 0.27. B) Classification of correlations between probeset-peptide pairs based on signal to noise ratio in the peptide data (a larger signal to noise ratio indicates less technical variation in the peptide measurement).
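As a rough illustration of the per-gene comparison summarized in Figure 3A, the sketch below correlates a transcript measurement with a peptide measurement for the same gene across strains and then takes the median over genes. The Pearson coefficient, the function names, and the toy values are assumptions made for illustration; the text here does not state which correlation coefficient or pipeline the authors used.

```python
from math import sqrt
from statistics import median


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def per_gene_correlations(transcripts, peptides):
    """Correlate transcript and peptide levels for genes present in both maps.

    Each argument maps a gene name to a list of values, one per strain,
    with strains in the same order in both maps (an assumption of this sketch).
    """
    shared = transcripts.keys() & peptides.keys()
    return {gene: pearson(transcripts[gene], peptides[gene]) for gene in shared}


# Hypothetical toy data: four strains, three genes (not values from the study).
toy_transcripts = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [1.0, 2.0, 3.0, 4.0],
    "gene_c": [1.0, 2.0, 3.0, 4.0],
}
toy_peptides = {
    "gene_a": [2.0, 4.0, 6.0, 8.0],   # rises with the transcript
    "gene_b": [4.0, 3.0, 2.0, 1.0],   # falls as the transcript rises
    "gene_c": [1.0, 3.0, 2.0, 4.0],   # partial agreement
}

correlations = per_gene_correlations(toy_transcripts, toy_peptides)
median_r = median(correlations.values())
```

In a real analysis each gene may have several peptides and probesets, so the dictionaries would be keyed by peptide-probeset pair rather than by gene alone.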
Figure 4. Isoform-specific analysis of peptide data.
A) An example of differential regulation of isoforms detected in the LC-MS data. Top panel, comparison of similarity in expression variation of 20 peptides measured for Acox1. Grey plots illustrate the expression variation among inbred mice for 19 peptides which represent all four Acox1 isoforms. The red plot illustrates the expression profile of the peptide representing the isoforms skipping exon 4. Bottom panel, Ensembl genome browser's schematic representation of four Acox1 isoforms. The arrow points to the Acox1-002 isoform, which skips exon 4. B) Concordance between Acox1 peptides. The left boxplot depicts correlations among peptides that include the Acox1-002 isoform. The right boxplot depicts correlations between the peptide mapping to exon 4 and all other peptides. The scatter points overlaid on each boxplot represent the pair-wise correlation values. C) Exon level analysis of peptide measurements by LC-MS and transcript measurements as measured by NGS in the livers of the B6 and DBA inbred strains. The black dots depict the relationships examined by comparing peptide data to microarray data and the red dots represent the highly significant relations found by peptide comparison with the microarray data. The lines depict the best fit as predicted by linear regression (black line = regression of all peptides, red line = regression of highly significant peptides).
Figure 5. Relationships between the peptide data and transcript data with clinical traits and biological pathways.
A) Correlations of transcriptome and proteome with clinical traits. A scatter plot of correlation coefficients between 607 probesets and 1343 peptides with 42 clinical traits (peptide-trait correlations are plotted on the x-axis and probeset-trait correlations are plotted on the y-axis). Red points are those correlations which were significant for transcripts only, green points are those correlations which were significant for protein data only and black points are those which were not significant in either of the two datasets. B) Concordance of transcripts and proteins in 115 KEGG biological pathways.
Figure 6. Global analyses of proteome and transcriptome genetic regulation.
A) Global eQTL profile for the 14463 eQTLs and 1368 pQTLs superimposed on each other. In this plot, larger dots represent protein association and smaller dots represent transcript association. The diagonal line with strong association depicts the local eQTLs and pQTLs and each off-diagonal dot depicts the location of a distant eQTL or pQTL. B) eQTL landscape for protein and transcript data. For each dataset, the genome was divided into 2 Mb bins and the numbers of eQTLs (grey) and pQTLs (red) were counted separately in each bin as the windows were slid every 50 kb. The frequency of eQTLs and pQTLs in each window is plotted as the fraction of total significant associations (14463 for transcripts and 1368 for proteins).
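The sliding-window count described for panel B can be sketched as follows: a 2 Mb window advances in 50 kb steps, and the number of mapped loci inside each window is divided by the total number of significant associations. The function name, the single-chromosome simplification, the half-open window convention, and the toy positions are all assumptions of this sketch, not details taken from the paper.

```python
from bisect import bisect_left


def sliding_window_fractions(positions, chrom_length, total=None,
                             window=2_000_000, step=50_000):
    """Fraction of associations falling in each sliding window.

    positions    -- base-pair positions of significant loci on one chromosome
    chrom_length -- chromosome length in base pairs
    total        -- denominator; pass the genome-wide number of significant
                    associations to mirror the caption, or leave as None to
                    use the number of positions supplied
    Returns a list of (window_start, fraction) tuples. Each window covers
    [window_start, window_start + window).
    """
    ordered = sorted(positions)
    denominator = len(ordered) if total is None else total
    results = []
    start = 0
    while start < chrom_length:
        end = start + window
        # Count positions p with start <= p < end using binary search.
        count = bisect_left(ordered, end) - bisect_left(ordered, start)
        fraction = count / denominator if denominator else 0.0
        results.append((start, fraction))
        start += step
    return results


# Hypothetical loci on a 3 Mb toy chromosome (not data from the study).
toy_positions = [100_000, 1_000_000, 2_500_000]
profile = sliding_window_fractions(toy_positions, chrom_length=3_000_000)
```

Running the function once per dataset with its own denominator puts the transcript and protein profiles on the same fractional scale, which is what makes the two curves comparable despite very different totals.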
