Abstract
Transcriptome analysis of human brain provides fundamental insight into development and disease, but it largely relies on existing annotation. We sequenced transcriptomes of 72 prefrontal cortex samples across six life stages and identified 50,650 differentially expressed regions (DERs) associated with development and aging, agnostic of annotation. While many DERs annotated to non-exonic sequence (41.1%), most were similarly regulated in cytosolic mRNA extracted from independent samples. The DERs were developmentally conserved across 16 brain regions and in the developing mouse cortex, and were expressed in diverse cell and tissue types. The DERs were further enriched for active chromatin marks and clinical risk for neurodevelopmental disorders such as schizophrenia. Lastly, we demonstrate quantitatively that these DERs associate with a changing neuronal phenotype related to differentiation and maturation. These data show conserved molecular signatures of transcriptional dynamics across brain development, have potential clinical relevance and highlight the incomplete annotation of the human brain transcriptome.
Accession codes
Primary accessions
BioProject
Referenced accessions
Gene Expression Omnibus
Sequence Read Archive
References
Colantuoni, C. et al. Temporal dynamics and genetic control of transcription in the human prefrontal cortex. Nature 478, 519–523 (2011).
Kang, H.J. et al. Spatio-temporal transcriptome of the human brain. Nature 478, 483–489 (2011).
Birnbaum, R., Jaffe, A.E., Hyde, T.M., Kleinman, J.E. & Weinberger, D.R. Prenatal expression patterns of genes associated with neuropsychiatric disorders. Am. J. Psychiatry 171, 758–767 (2014).
Gulsuner, S. et al. Spatial and temporal mapping of de novo mutations in schizophrenia to a fetal prefrontal cortical network. Cell 154, 518–529 (2013).
Trapnell, C. et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat. Biotechnol. 28, 511–515 (2010).
Parikshak, N.N. et al. Integrative functional genomic analyses implicate specific molecular pathways and circuits in autism. Cell 155, 1008–1021 (2013).
Willsey, A.J. et al. Coexpression networks implicate human midfetal deep cortical projection neurons in the pathogenesis of autism. Cell 155, 997–1007 (2013).
Steijger, T. et al. Assessment of transcript reconstruction methods for RNA-seq. Nat. Methods 10, 1177–1184 (2013).
Frazee, A.C., Sabunciyan, S., Hansen, K.D., Irizarry, R.A. & Leek, J.T. Differential expression analysis of RNA-seq data at single-base resolution. Biostatistics (2014).
Jaffe, A.E. et al. Bump hunting to identify differentially methylated regions in epigenetic epidemiology studies. Int. J. Epidemiol. 41, 200–209 (2012).
Flicek, P. et al. Ensembl 2014. Nucleic Acids Res. 42, D749–D755 (2014).
Hinrichs, A.S. et al. The UCSC Genome Browser Database: update 2006. Nucleic Acids Res. 34, D590–D598 (2006).
Derrien, T. et al. The GENCODE v7 catalog of human long noncoding RNAs: analysis of their gene structure, evolution, and expression. Genome Res. 22, 1775–1789 (2012).
BrainSpan. Atlas of the Developing Human Brain. http://www.brainspan.org/ (2011).
Dillman, A.A. et al. mRNA expression, splicing and editing in the embryonic and adult mouse cerebral cortex. Nat. Neurosci. 16, 499–506 (2013).
Xie, W. et al. Epigenomic analysis of multilineage differentiation of human embryonic stem cells. Cell 153, 1134–1148 (2013).
Farrell, C.M. et al. Current status and new features of the Consensus Coding Sequence database. Nucleic Acids Res. 42, D865–D872 (2014).
Wang, Y., Lin, L., Lai, H., Parada, L.F. & Lei, L. Transcription factor Sox11 is essential for both embryonic and adult neurogenesis. Dev. Dyn. 242, 638–653 (2013).
Curtis, M.A. et al. Human neuroblasts migrate to the olfactory bulb via a lateral ventricular extension. Science 315, 1243–1249 (2007).
Hyde, T.M. et al. Expression of GABA signaling molecules KCC2, NKCC1, and GAD1 in cortical development and schizophrenia. J. Neurosci. 31, 11088–11095 (2011).
Frankland, P.W., O'Brien, C., Ohno, M., Kirkwood, A. & Silva, A.J. Alpha-CaMKII-dependent plasticity in the cortex is required for permanent memory. Nature 411, 309–313 (2001).
Krug, A. et al. The effect of neurogranin on neural correlates of episodic memory encoding and retrieval. Schizophr. Bull. 39, 141–150 (2013).
Morris, D.W. et al. Confirming RGS4 as a susceptibility gene for schizophrenia. Am. J. Med. Genet. B. Neuropsychiatr. Genet. 125B, 50–53 (2004).
Wang, K. et al. Common genetic variants on 5p14.1 associate with autism spectrum disorders. Nature 459, 528–533 (2009).
Bernstein, B.E. et al. The NIH Roadmap Epigenomics Mapping Consortium. Nat. Biotechnol. 28, 1045–1048 (2010).
Ji, H. et al. An integrated software system for analyzing ChIP-chip and ChIP-seq data. Nat. Biotechnol. 26, 1293–1300 (2008).
Schizophrenia Working Group of the Psychiatric Genomics Consortium. Biological insights from 108 schizophrenia-associated genetic loci. Nature 511, 421–427 (2014).
Banerjee-Basu, S. & Packer, A. SFARI Gene: an evolving database for the autism research community. Dis. Model. Mech. 3, 133–135 (2010).
Nalls, M.A. et al. Large-scale meta-analysis of genome-wide association data identifies six new risk loci for Parkinson's disease. Nat. Genet. 46, 989–993 (2014).
Lambert, J.C. et al. Meta-analysis of 74,046 individuals identifies 11 new susceptibility loci for Alzheimer's disease. Nat. Genet. 45, 1452–1458 (2013).
Morris, A.P. et al. Large-scale association analysis provides insights into the genetic architecture and pathophysiology of type 2 diabetes. Nat. Genet. 44, 981–990 (2012).
Callicott, J.H. et al. Complexity of prefrontal cortical dysfunction in schizophrenia: more than up or down. Am. J. Psychiatry 160, 2209–2215 (2003).
Raney, B.J. et al. Track data hubs enable visualization of user-defined genome-wide annotations on the UCSC Genome Browser. Bioinformatics 30, 1003–1005 (2014).
Houseman, E.A. et al. DNA methylation arrays as surrogate measures of cell mixture distribution. BMC Bioinformatics 13, 86 (2012).
Kim, M. et al. Dynamic changes in DNA methylation and hydroxymethylation when hES cells undergo differentiation toward a neuronal lineage. Hum. Mol. Genet. 23, 657–667 (2014).
Guintivano, J., Aryee, M.J. & Kaminsky, Z.A. A cell epigenotype specific model for the correction of brain cellular heterogeneity bias and its application to age, brain region and major depression. Epigenetics 8, 290–302 (2013).
Miller, J.A. et al. Transcriptional landscape of the prenatal human brain. Nature 508, 199–206 (2014).
He, Z., Bammann, H., Han, D., Xie, G. & Khaitovich, P. Conserved expression of lincRNA during human and macaque prefrontal cortex development and maturation. RNA 20, 1103–1111 (2014).
Pletikos, M. et al. Temporal specification and bilaterality of human neocortical topographic gene expression. Neuron 81, 321–332 (2014).
Kleinman, J.E. et al. Genetic neuropathology of schizophrenia: new approaches to an old question and new uses for postmortem human brains. Biol. Psychiatry 69, 140–145 (2011).
Ameur, A. et al. Total RNA sequencing reveals nascent transcription and widespread co-transcriptional splicing in the human brain. Nat. Struct. Mol. Biol. 18, 1435–1440 (2011).
Ernst, J. et al. Mapping and analysis of chromatin state dynamics in nine human cell types. Nature 473, 43–49 (2011).
Hupe, M., Li, M.X., Gertow Gillner, K., Adams, R.H. & Stenman, J.M. Evaluation of TRAP-sequencing technology with a versatile conditional mouse model. Nucleic Acids Res. 42, e14 (2014).
Ramos, A.D. et al. Integration of genome-wide approaches identifies lncRNAs of adult neural stem cells and their progeny in vivo. Cell Stem Cell 12, 616–628 (2013).
Lipska, B.K. et al. Critical factors in gene expression in postmortem human brain: Focus on studies in schizophrenia. Biol. Psychiatry 60, 650–658 (2006).
Kim, D. et al. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 14, R36 (2013).
Churchill, G.A. & Doerge, R.W. Empirical threshold values for quantitative trait mapping. Genetics 138, 963–971 (1994).
Carlson, M. TxDb.Hsapiens.UCSC.hg19.lincRNAsTranscripts: annotation package for TxDb object(s). (2014).
Carlson, M. TxDb.Hsapiens.UCSC.hg19.knownGene: annotation package for TranscriptDb object(s). (2014).
Zhang, Z. et al. PseudoPipe: an automated pseudogene identification pipeline. Bioinformatics 22, 1437–1439 (2006).
Hansen, K.D., Brenner, S.E. & Dudoit, S. Biases in Illumina transcriptome sequencing caused by random hexamer priming. Nucleic Acids Res. 38, e131 (2010).
Dohm, J.C., Lottaz, C., Borodina, T. & Himmelbauer, H. Substantial biases in ultra-short read data sets from high-throughput DNA sequencing. Nucleic Acids Res. 36, e105 (2008).
Pickrell, J.K. et al. Understanding mechanisms underlying human gene expression variation with RNA sequencing. Nature 464, 768–772 (2010).
Roberts, A., Trapnell, C., Donaghey, J., Rinn, J.L. & Pachter, L. Improving RNA-Seq expression estimates by correcting for fragment bias. Genome Biol. 12, R22 (2011).
Levin, J.Z. et al. Comprehensive comparative analysis of strand-specific RNA sequencing methods. Nat. Methods 7, 709–715 (2010).
Wheeler, D.L. et al. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 36, D13–D21 (2008).
Lawrence, M., Gentleman, R. & Carey, V. rtracklayer: an R package for interfacing with genome browsers. Bioinformatics 25, 1841–1842 (2009).
Liao, Y., Smyth, G.K. & Shi, W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923–930 (2014).
Brennand, K. et al. Phenotypic differences in hiPSC NPCs derived from patients with schizophrenia. Mol. Psychiatry published online, doi:10.1038/mp.2014.22 (14 April 2014).
Collado-Torres, L. & Jaffe, A.E. enrichedRanges: identify enrichment between two sets of genomic ranges v. 0.0.1. https://github.com/lcolladotor/enrichedRanges/ (2014).
Johnson, A.D. et al. SNAP: a web-based tool for identification and annotation of proxy SNPs using HapMap. Bioinformatics 24, 2938–2939 (2008).
Aryee, M.J. et al. Accurate genome-scale percentage DNA methylation estimates from microarray data. Biostatistics 12, 197–210 (2011).
Aryee, M.J. et al. Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics 30, 1363–1369 (2014).
Jaffe, A.E. & Irizarry, R.A. Accounting for cellular heterogeneity is critical in epigenome-wide association studies. Genome Biol. 15, R31 (2014).
Smyth, G.K. in Bioinformatics and Computational Biology Solutions Using R and Bioconductor (eds. Gentleman, R. et al.) 397–420 (Springer, New York, 2005).
Acknowledgements
We are grateful for the vision and generosity of the Lieber and Maltz families, who made this work possible. Human brain material was acquired from the Offices of the Chief Medical Examiner of the District of Columbia and those of the Commonwealth of Virginia, Northern District, and processed and stored at the NIH Clinical Center in Bethesda, Maryland. We thank the families who donated to this research and we thank R. Straub for criticism of the data analyses. This work was supported by the Lieber Institute for Brain Development. A.E.J. was partially supported by 1R21MH102791, L.C.-T. was supported by CONACyT México (351535) and J.T.L. was supported by 1R01GM105705-01A1.
Author information
Contributions
All authors contributed to the writing of the manuscript, plus the following individual contributions: A.E.J. designed the study, performed data analyses on summarized DERs: BrainSpan, mouse, cell and tissue types, histone tail– and disease-associated enrichments, and cell composition. J.S. performed data analysis involving processing the RNA-seq data. L.C.-T. performed data analysis involving the initial global derfinder approach. J.T.L. performed data analysis involving the initial global derfinder approach. R.T. performed RNA extractions and cytosolic separations. C.L. performed RNA extractions and cytosolic separations. Y.G. created sequencing libraries and oversaw the data generation for the discovery data. Y.J. created sequencing libraries and oversaw the data generation for the validation data. B.J.M. assisted in the biological interpretation of the computational findings. T.M.H. provided brain tissue and demographic data and assisted in biological interpretation of the computational findings. J.E.K. oversaw the project, provided brain tissue and demographic data, and assisted in biological interpretation of the computational findings. D.R.W. designed the project, oversaw the project and assisted in biological interpretation of the computational findings.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Integrated supplementary information
Supplementary Figure 1 Principal component analysis (PCA) of differentially expressed regions (DERs).
Principal components (PCs) A) 1, B) 2, and C) 3 of the normalized coverage levels across the DERs, plotted against age group.
Supplementary Figure 2 Trajectories of differentially expressed regions (DERs) across age and development, colored by the age group/range with the highest expression levels.
Six genes overlapping these DERs are shown with thicker lines and described in the text. Y-axis: mean adjusted (for sequencing depth, per million mapped reads) coverage, on the log2 scale. The majority of DERs have highest expression in fetal life.
Supplementary Figure 3 Venn diagrams depicting the overlap of differentially expressed regions (DERs).
Overlaps were computed against reference annotation from the A) Ensembl p12, B) UCSC hg19 knownGene and C) Gencode v19 databases. A DER had to overlap at least 20 bp of a feature to be counted as overlapping it.
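The 20 bp overlap rule in this caption can be illustrated with a minimal sketch. The function names, the (start, end) tuple layout and the use of closed 1-based coordinates are assumptions made for illustration; the caption does not say how the authors computed overlaps or how features shorter than 20 bp were handled.

```python
def overlap_bp(a_start, a_end, b_start, b_end):
    """Number of bases shared by two closed, 1-based intervals (assumed convention)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


def overlaps_feature(der, feature, min_overlap=20):
    """True if a DER shares at least `min_overlap` bases with an annotated feature.

    `der` and `feature` are hypothetical (start, end) tuples on the same chromosome.
    """
    return overlap_bp(der[0], der[1], feature[0], feature[1]) >= min_overlap


# A DER spanning 100-150 against two nearby exons (made-up coordinates).
print(overlaps_feature((100, 150), (131, 200)))  # shares exactly 20 bases
print(overlaps_feature((100, 150), (132, 200)))  # shares only 19 bases
```

Under this rule a DER that touches an exon by only a few bases is not counted as exonic, which keeps the Venn diagram categories from being inflated by edge effects.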
Supplementary Figure 4 Concordance between differential expression in total RNA compared to cytosolic RNA.
Log2 fold changes comparing fetal to adult in 12 total RNA samples (discovery, six per age group) and 6 independent samples (3 per age group) with the cytosolic fraction separated from the nuclear mRNA for (A) all DERs and (B) only DERs overlapping annotated intronic regions. ρ = Spearman correlation, κ = directionality concordance.
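The two summary statistics named in this caption can be sketched as follows. The Spearman correlation is the standard rank correlation; reading κ as the fraction of DERs whose log2 fold changes share the same sign in both fractions is an assumption, since the caption only calls it "directionality concordance". All function names and the example values are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def sign_concordance(x, y):
    """Fraction of pairs whose values have the same nonzero sign (assumed meaning of kappa)."""
    return sum(1 for a, b in zip(x, y) if a * b > 0) / len(x)


# Made-up log2 fold changes (fetal vs. adult) for four DERs.
total_rna = [2.0, -1.5, 0.5, -3.0]
cytosolic = [1.0, -0.5, -0.2, -2.0]
print(spearman(total_rna, cytosolic), sign_concordance(total_rna, cytosolic))
```

The two measures answer different questions: the rank correlation rewards agreement in the ordering of effect sizes, while the sign concordance only asks whether each DER changes in the same direction.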
Supplementary Figure 5 RNA quality number (RIN) explains second principal component of BrainSpan samples across developmental DERs.
Each point is a sample colored by age (purple: prenatal and green: postnatal), where white corresponds roughly to birth.
Supplementary Figure 6 Age-associated non-exonic differentially expressed region (DER) expression patterns across multiple brain regions.
Principal component analysis (PCA) was performed on normalized coverage estimates across non-exonic DERs using all BrainSpan samples. Each point is a sample colored by age (purple: prenatal and green: postnatal), where white corresponds to birth.
Supplementary Figure 7 Developmental differentially expressed region (DER) expression patterns across multiple brain regions in later principal components (analogous to Figure 2).
Each point is a sample colored by age (purple: prenatal and green: postnatal), where white corresponds roughly to birth.
Supplementary Figure 8 Example of “LIBD Human DLPFC Development” custom UCSC Track Hub.
This Track Hub displays the regions representing the significant DERs, the F-statistic indicating differential expression at a particular base, and the mean age group normalized coverage levels across the genome.
Supplementary Figure 9 Estimated proportions of cell types within postmortem brain samples.
We utilized DNA methylation data to calculate the relative proportion of each cell type based on publicly available cellular populations, consisting of A) NeuN+ and B) NeuN- cells from primary tissue, and C) cell line data from ES-derived NPCs. We can correlate these composition estimates to the expression levels within identified differentially expressed regions (DERs) and the corresponding –log10(p-values) are shown in Panel D.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–9 (PDF 3881 kb)
Supplementary Table 1: Demographic/phenotype information for 36 DLPFC discovery samples.
BrNum/RNum: ID columns, BrainCloud: whether the sample was included in the lifespan series in Colantuoni et al 2011; Age: age at death; Sex: M(ale) or F(emale); Race: AA=African American, CAUC = Caucasian, AS=Asian, HISP=Hispanic; ageGroup: used for differential expression analysis, RIN: RNA integrity number, a measure of RNA quality; totalMapped: total number of reads mapping to autosomes and sex chromosomes (excluding mitochondrial) using TopHat2; Cohort: Discovery or validation data; Accession: SRA accession numbers for BAM files of discovery cohort. (XLSX 14 kb)
Supplementary Table 2: List of significant differentially expressed regions (DERs).
Coordinates are relative to the hg19 genome build. width: width in base pairs; value: average F-statistic in the region; area: sum of F-statistics across the region (used to rank regions); fwer: family-wise error rate, the proportion of null permutations with a larger area; meanCoverage: average coverage/number of reads across all samples across the region; nearestGene: nearest RefSeq gene symbol; annotation: RefSeq ID; description: relation of DER to gene; distToTSS: distance in base pairs to transcriptional start site; subregion: where DER overlaps gene; exonnumber: the closest exon to the DER; annoStrand: strand of the RefSeq gene; geneL: gene length in base pairs; codingL: coding length in base pairs; Fetal_adjMeans: mean adjusted coverage of fetal (-1,0] samples; Infant_adjMeans: mean adjusted coverage of infant (0,1] samples; Child_adjMeans: mean adjusted coverage of child (1,10] samples; Teen_adjMeans: mean adjusted coverage of teen (10,20] samples; Adult_adjMeans: mean adjusted coverage of adult (20,50] samples; 50plus_adjMeans: mean adjusted coverage of 50+ (50,100] samples. (XLSX 24071 kb)
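The value, area and fwer columns described above can be related with a small sketch. It assumes that each permutation contributes a single null area (for example the largest region area found in that permutation); the table legend only says "the proportion of null permutations with a larger area", so that detail, the function names and the numbers are illustrative assumptions.

```python
def region_area(f_stats):
    """Area of a region: sum of the per-base F-statistics."""
    return sum(f_stats)


def region_value(f_stats):
    """Value of a region: mean per-base F-statistic (area divided by width)."""
    return region_area(f_stats) / len(f_stats)


def fwer(observed_area, null_areas):
    """Proportion of null permutations whose area is strictly larger than the observed one.

    `null_areas` holds one area per permutation (assumed here to be the
    maximum region area seen in that permutation).
    """
    larger = sum(1 for a in null_areas if a > observed_area)
    return larger / len(null_areas)


# Made-up per-base F-statistics for a 3 bp region and five null permutations.
f_stats = [12.0, 15.5, 10.5]
null_areas = [10.0, 40.0, 25.0, 38.0, 50.0]
print(region_area(f_stats), region_value(f_stats), fwer(region_area(f_stats), null_areas))
```

Ranking by area rather than by mean F-statistic favors regions that are both long and strongly differential, which is why the legend notes that area is the column used to rank regions.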
Supplementary Table 3: Gene ontology (GO) results.
A) all differentially expressed regions (DERs), B) top 1000 DERs, C) DERs with highest fetal expression, D) DERs with highest infant expression, E) DERs with highest child expression, F) DERs with highest teen expression, G) DERs with highest adult expression, H) DERs with highest 50+ expression. For each sheet, GOBPID: gene ontology ID; Pvalue: unadjusted p-value from hypergeometric test; OddsRatio: corresponding odds ratio of enrichment; Size: number of genes in set; Term: GO category description (XLSX 173 kb)
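The Pvalue and OddsRatio columns can be illustrated with a minimal sketch of a one-sided hypergeometric enrichment test. The argument names, the gene universe and the use of the sample odds ratio from the 2x2 table are assumptions; the legend does not state which software or odds-ratio estimator produced the reported values.

```python
from math import comb


def hypergeom_pvalue(k, term_size, n_selected, universe):
    """P(X >= k): chance of seeing at least k term genes among n_selected genes
    drawn without replacement from a universe containing term_size term genes."""
    upper = min(term_size, n_selected)
    favorable = sum(
        comb(term_size, i) * comb(universe - term_size, n_selected - i)
        for i in range(k, upper + 1)
    )
    return favorable / comb(universe, n_selected)


def odds_ratio(k, term_size, n_selected, universe):
    """Sample odds ratio of the 2x2 table (selected vs. not, in term vs. not)."""
    a = k                                        # selected, in term
    b = n_selected - k                           # selected, not in term
    c = term_size - k                            # not selected, in term
    d = universe - term_size - n_selected + k    # not selected, not in term
    if b * c == 0:
        return float("inf")
    return (a * d) / (b * c)


# Made-up example: 20 genes in the universe, 5 in the GO term,
# 4 genes near DERs, 2 of which fall in the term.
print(hypergeom_pvalue(2, 5, 4, 20), odds_ratio(2, 5, 4, 20))
```

Because the p-values in the table are unadjusted, anyone reusing them across many GO terms would normally apply a multiple-testing correction on top of this calculation.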
Supplementary Table 4: Phenotype data for nuclear and cytosolic separation independent validation.
BrNum/RNum: ID columns; Zone: Nuclear or cytosol fraction; Age: age at death; Sex: M(ale) or F(emale); Race: AA=African American, CAUC = Caucasian; RIN: RNA integrity number, a measure of RNA quality; ugTotal: total yield of RNA; Yield(ug/mg): normalized yield of RNA; Group: fetal or adult age group; totalMapped: total number of reads mapping to autosomes and sex chromosomes (excluding mitochondrial) using TopHat2. (XLSX 8 kb)
Supplementary Table 5: Odds ratios for overlap between DERs and fetal brain histone marks, stratified by DER annotation.
Odds ratios are for a bin size of 1kb. (XLSX 8 kb)
Supplementary Table 6: List of CpGs, and their mean DNA methylation levels, used in the cell type proportion calculations.
This file can be used to estimate these relative cell types in other datasets using the estimateCellCounts function in the minfi package. Row names correspond to identifiers on the Illumina 450k microarray. (XLSX 21 kb)
Supplementary Data 1: RPKM counts from cell and tissue type analysis.
Includes the RPKM matrix (rpkmMatrix_n121_combinedPheno.txt), annotation (geneAnnotation_n121_combinedPheno.txt) and phenotype data (phenotypeDat_n121_combinedPheno.txt) (ZIP 32380 kb)
Supplementary Data 2: R code.
See corresponding 00_README file for descriptions. (ZIP 40 kb)
About this article
Cite this article
Jaffe, A., Shin, J., Collado-Torres, L. et al. Developmental regulation of human cortex transcription and its clinical relevance at single base resolution. Nat Neurosci 18, 154–161 (2015). https://doi.org/10.1038/nn.3898
DOI: https://doi.org/10.1038/nn.3898