Transcriptome analysis of mouse stem cells and early embryos

Alexei A Sharov et al. PLoS Biol. 2003 Dec.

Abstract

Understanding and harnessing cellular potency are fundamental in biology and are also critical to the future therapeutic use of stem cells. Transcriptome analysis of these pluripotent cells is a first step towards such goals. Starting with sources that include oocytes, blastocysts, and embryonic and adult stem cells, we obtained 249,200 high-quality EST sequences and clustered them with public sequences to produce an index of approximately 30,000 total mouse genes that includes 977 previously unidentified genes. Analysis of gene expression levels by EST frequency identifies genes that characterize preimplantation embryos, embryonic stem cells, and adult stem cells, thus providing potential markers as well as clues to the functional features of these cells. Principal component analysis identified a set of 88 genes whose average expression levels decrease from oocytes to blastocysts, stem cells, postimplantation embryos, and finally to newborn tissues. This can be a first step towards a possible definition of a molecular scale of cellular potency. The sequences and cDNA clones recovered in this work provide a comprehensive resource for genes functioning in early mouse embryos and stem cells. The nonrestricted community access to the resource can accelerate a wide range of research, particularly in reproductive and regenerative medicine.

Conflict of interest statement

The authors have declared that no conflicts of interest exist.

Figures

Figure 1. Flow Chart of Sequence Data Analysis
Using TIGR gene indices clustering tools (Pertea et al. 2003), 249,200 ESTs were clustered, generating 58,713 consensuses and singletons. NIA consensuses and singletons were further clustered with Ensembl transcripts (Hubbard et al. 2002), RIKEN transcripts (Okazaki et al. 2002), and RefSeq transcripts and transcript predictions (Pruitt and Maglott 2001). Alignments of these sequences to the mouse genome (UCSC February 2002 freeze data, available from ftp://genome.cse.ucsc.edu/goldenPath/mmFeb2002) (Waterston et al. 2002) using BLAT (Kent 2002) helped to avoid false clustering of similar sequences at nonmatching genome locations. Erroneous clusters were reassembled based on the analysis of genome alignment. A total of 94,039 putative transcripts were thus generated and then grouped into 39,678 putative genes based on their overlap in the genome on the same chromosome strand and on clone-linking information. Using criteria of an ORF greater than 100 amino acids or of multiple exons (excluding sequences that are potentially located on the wrong strand), 29,810 mouse genes were identified. Finally, 977 genes unique to the NIA database were identified.
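The grouping and filtering steps in this caption can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it groups transcripts only by coordinate overlap on the same chromosome strand, ignores the clone-linking information and the wrong-strand exclusion mentioned above, and all names (`Transcript`, `group_by_overlap`, `passes_gene_criteria`) are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Transcript(NamedTuple):
    """Hypothetical record for one aligned putative transcript."""
    name: str
    chrom: str
    strand: str  # "+" or "-"
    start: int   # alignment start, inclusive
    end: int     # alignment end, exclusive


def group_by_overlap(transcripts):
    """Group transcripts whose spans overlap on the same chromosome strand.

    Returns a list of groups (putative genes), each a list of transcript names.
    """
    by_strand = defaultdict(list)
    for t in transcripts:
        by_strand[(t.chrom, t.strand)].append(t)

    genes = []
    for key in sorted(by_strand):
        current, current_end = [], None
        # Sweep left to right, extending the open group while spans overlap.
        for t in sorted(by_strand[key], key=lambda t: (t.start, t.end)):
            if current and t.start < current_end:
                current.append(t.name)
                current_end = max(current_end, t.end)
            else:
                if current:
                    genes.append(current)
                current, current_end = [t.name], t.end
        if current:
            genes.append(current)
    return genes


def passes_gene_criteria(orf_length_aa, exon_count):
    """Caption criteria: ORF greater than 100 amino acids, or multiple exons."""
    return orf_length_aa > 100 or exon_count > 1
```

For example, two transcripts spanning 100-500 and 400-900 on `chr1` `+` fall into one group, while a transcript at 450-800 on the `-` strand stays separate.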
Figure 2. Examples of NIA-Only cDNA Clones and RT–PCR Results
Expression pattern of 19 novel cDNA clones in 16 different cell lines or tissues: unfertilized egg, E3.5 blastocyst, E7.5 whole embryo (embryo plus placenta), E12.5 male mesonephros (gonad plus mesonephros), newborn brain, newborn ovary, newborn kidney, embryonic germ (EG) cell, embryonic stem (ES) cell (maintained as undifferentiated in the presence of LIF), trophoblast stem (TS) cell, mesenchymal stem (MS) cell, osteoblast, neural stem/progenitor (NS) cell, NS differentiated (differentiated neural stem/progenitor cells), and hematopoietic stem/progenitor (HS) cells. Glyceraldehyde-3-phosphate dehydrogenase (GAP-DH) was used as a control. A U number is assigned to each gene in the gene index (see Dataset S2). The exon number was predicted from alignment with the mouse genome sequence, and the amino acid sequence was predicted with the ORF finder from NCBI.
Figure 3. Signature Genes for Specific Groups of Early Embryos and Stem Cells
Figure 4. PCA Analysis of EST Frequency
The results were obtained by analyzing 2,812 genes that exceeded 0.1% in at least one library. (A) 3D biplot that shows both cell types (red spheres) and genes (yellow boxes). (B) 2D PCA of cell types. EST frequencies were log-transformed before the analysis. Names of some cells and tissues are abbreviated as follows: 6.5 EP, E6.5 whole embryo (embryo plus placenta); 7.5 EP, E7.5 whole embryo (embryo plus placenta); 8.5 EP, E8.5 whole embryo (embryo plus placenta); 9.5 EP, E9.5 whole embryo (embryo plus placenta); 7.5 E, E7.5 embryonic part only; 7.5 P, E7.5 extraembryonic part only; NbOvary, newborn ovary; NbBrain, newborn brain; NbHeart, newborn heart; NbKidney, newborn kidney; 13.5 VMB, E13.5 ventral midbrain dopamine cells; 12.5 Gonad (F), E12.5 female gonad/mesonephros; 12.5 Gonad (M), E12.5 male gonad/mesonephros; HS (Kit-, Sca1-), hematopoietic stem/progenitor cells (Lin-, Kit-, Sca1-); HS (Kit-, Sca1+), hematopoietic stem/progenitor cells (Lin-, Kit-, Sca1+); HS (Kit+, Sca1-), hematopoietic stem/progenitor cells (Lin-, Kit+, Sca1-); HS (Kit+, Sca1+), hematopoietic stem/progenitor cells (Lin-, Kit+, Sca1+); and NS-D, differentiated NS cells.
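The preprocessing described in this caption (keeping genes whose EST frequency exceeds 0.1% in at least one library, then log-transforming) might look like the following Python sketch. The function names and input layout are hypothetical, and the pseudocount added before taking the logarithm is an assumption: the caption does not say how zero frequencies were handled. The resulting matrix could then be passed to any PCA routine.

```python
import math


def est_frequencies(counts):
    """Convert EST counts {library: {gene: count}} to per-library fractions.

    Assumes every library contains at least one EST.
    """
    freqs = {}
    for library, gene_counts in counts.items():
        total = sum(gene_counts.values())
        freqs[library] = {g: c / total for g, c in gene_counts.items()}
    return freqs


def select_genes(freqs, threshold=0.001):
    """Keep genes whose frequency exceeds the threshold (0.1%) in any library."""
    selected = set()
    for gene_freqs in freqs.values():
        selected.update(g for g, f in gene_freqs.items() if f > threshold)
    return sorted(selected)


def log_matrix(freqs, genes, pseudocount=1e-5):
    """Build a libraries x genes matrix of log10 frequencies.

    The pseudocount is an assumption made here to avoid log(0).
    """
    libraries = sorted(freqs)
    matrix = [
        [math.log10(freqs[lib].get(g, 0.0) + pseudocount) for g in genes]
        for lib in libraries
    ]
    return libraries, matrix
```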
Figure 5. Relationship between PC3 and Average Expression Levels of 88 Signature Genes
A list of 88 genes associated with developmental potential: Birc2, Bmp15, Btg4, Cdc25a, Cyp11a, Dtx2, E2f1, Fmn2, Folr4, Gdf9, Krt2–16, Mitc1, Oas1d, Oas1e, Obox3, Prkab1, Rfpl4, Rgs2, Rnf35, Rnpc1, Slc21a11, Spin, Tcl1, Tcl1b1, Tcl1b3, 1810015H18Rik, 2210021E03Rik, 2410003C07Rik, 2610005B21Rik, 2610005H11Rik, 3230401D17Rik, 4833422F24Rik, 4921528E07Rik, 4933428G09Rik, 5730419I09Rik, A030007L17Rik, A930014I12Rik, E130301L11Rik, AA617276, Bcl2l10, MGC32471, MGC38133, MGC38960, D7Ertd784e, and 44 genes with only NIA U numbers (see Dataset S10).
