Main

A long-standing goal of neuroscience is to understand how molecules and cellular structures at the microscale give rise to communication between brain regions at the macroscale. However, there is a disconnect between molecular biology and neuroimaging that hinders progress toward this goal. A paradigm shift would be to collect postmortem omics data alongside antemortem neuroimaging data from the same set of individuals. In this way, molecular levels can be tied to the functional and structural properties of the brain. Establishing such associations is imperative for translation from humans to experimental model systems that could better recapitulate neurologic health and disease.

Both molecular and neuroimaging studies have independently identified correlates of brain function and cognition, but the search for the molecular basis of functional connectivity is nascent. Recent neuroimaging genetics studies are beginning to address the question of which molecules are most related to functional connectivity1,2,3,4,5. However, despite thousands of samples, these studies have found only a dozen independent genetic loci with modest heritability6,7. Most of those genetic variants reside in noncoding regions, making it difficult to determine the implicated genes and mechanisms of action. Another class of studies tests for spatial correspondence between gene expression and brain connectivity to infer the implicated genes8,9,10,11,12. These studies typically combine the Allen Human Brain Atlas (AHBA)13, which contains gene expression profiles in hundreds of brain regions from six individuals, with functional magnetic resonance imaging (fMRI) data from an entirely separate cohort. Such data combinations cannot examine the covariation between gene expression and functional connectivity in a broader population. Representing such covariation is critical for identifying molecules that explain individual variation in functional connectivity and the associated cognitive phenotypes14.

In this study, we gathered data types that span multiple biophysical scales from a cohort of 98 individuals. Data types included resting-state fMRI, structural MRI, genetics, dendritic spine morphometry, proteomics and gene expression measurements from postmortem tissues of the superior frontal gyrus (SFG) and inferior temporal gyrus (ITG). Based on the stability of functional connectivity patterns within individuals15, we hypothesized that it is possible to combine postmortem molecular and subcellular data with antemortem neuroimaging data from the same individuals to prioritize molecular mechanisms underlying brain connectivity. To test this hypothesis, we built models that integrate protein measurements with dendritic spine morphometry to explain between-individual variation in functional connectivity. As a form of replication, we repeated the analysis using gene expression measurements in place of protein abundance and structural covariation as a surrogate of connectivity.

Results

Multimodal human brain data

To model how molecules at the microscale give rise to brain connectivity at the macroscale, we collected data that span multiple biophysical scales (Fig. 1) as part of the Religious Orders Study and Rush Memory and Aging Project (ROSMAP)16. Data types included genetics, gene expression, protein abundance, dendritic spine morphometry and neuroimaging measurements. Ninety-eight ROSMAP participants had all these data types measured. The average age of these participants at the time of the MRI scan and death was 88 ± 6 years and 91 ± 6 years, respectively, with an average time interval between the MRI scan and the age at death of 3 ± 2 years. The average postmortem interval (PMI) was 8.5 ± 4.6 h. Participants were 77% female and had 15 ± 3 years of education. We performed detailed characterization of each omic, cellular and neuroimaging data type and then integrated them as described next.

Fig. 1: Study overview.
figure 1

a, Dendritic spine morphometric, genome-wide gene expression and protein abundance data were acquired from SFG and ITG in a cohort of 98 individuals. b, Structural and functional MRI data were acquired from the same 98 individuals. c, Molecular measurements were integrated with dendritic spine morphologies to model functional connectivity and structural covariation between SFG and ITG. d, Key molecular processes and proteins associated with SFG–ITG connectivity. Bold formatting indicates associations with both functional connectivity and structural covariation.

Neuroimaging phenotypes

Following current best practices, we organized the neuroimaging data from 1,210 ROSMAP participants into the brain imaging data structure (BIDS)17, which we validated using CuBIDS18 (Fig. 2a). For the fMRI data, we performed standard preprocessing, such as slice-time correction and spatial normalization, and regressed out motion confounds19. We made the fMRI data and their derivatives in BIDS as a free resource at radc.rush.edu to ease data sharing and re-analysis by interested investigators. To estimate functional connectivity, we first used a functional atlas comprising 100 parcels generated from resting-state fMRI data of more than 1,000 participants20 to divide the brain into functionally homogeneous regions. We then averaged the time series of voxels within each parcel (Fig. 2b) and computed Pearson’s correlation between all pairs of parcels for each participant (Fig. 2b). For the structural MRI data, we performed nonuniformity correction, skull-stripping, spatial normalization and tissue segmentation. We then divided the brain into 62 anatomical regions using the Desikan–Killiany–Tourville (DKT)21 atlas (Fig. 2d) and extracted structural attributes22 (Fig. 2e). To estimate structural covariation as a surrogate of brain connectivity, we adopted a canonical correlation analysis (CCA) approach (Supplementary Fig. 1). The motivation was based on the observation that multiple structural attributes (for example, number of vertices, surface area and curvature index) are typically cross-correlated between two regions (Fig. 2f).

Fig. 2: Functional and structural neuroimaging phenotypes.
figure 2

a, Preprocessed functional and structural MRI data organized into a free resource for data sharing. This process involved curation (BIDS), assessment of aberrant acquisition parameters (CuBIDS), preprocessing (fMRIprep) and confound regression (XCP). b, Mean time series of hemodynamic fluctuations within brain regions from which we also measured proteomics, RNA-seq and dendritic spine morphology. c, Functional connectivity estimated by correlating all pairs of parcel time series (participant average shown), with the SFG–ITG connection being of primary interest. d, Structural MRI data were collected during the same scan sessions in which we acquired fMRI data. e, Freesurfer was applied to extract structural attributes from anatomically defined brain regions. f, Cross-correlation between structural attributes of SFG and ITG.

Molecular measurements

We performed multiplex tandem mass tag mass spectrometry (TMT-MS) on tissue samples from SFG and ITG of 130 ROSMAP participants to generate proteomic data and applied standard preprocessing23,24,25. To identify the major molecular systems present in each brain region, we clustered the measured proteins into covarying sets (modules) using a purely data-driven approach (Fig. 3a)26. The modules are enriched for many brain-relevant structures and functions (Fig. 3b,c)27 that are consistent with related datasets28,29. Comparing SFG and ITG, we observed some common molecular systems with high protein overlap between their modules (Fig. 3d). In addition to protein abundance, we also collected RNA sequencing (RNA-seq) data from SFG and ITG of the same set of ROSMAP participants. We performed standard preprocessing, such as TMM normalization and confound regression with voom/limma, and clustered the measured genes into modules26. These expression modules are also enriched for brain-relevant structures and functions (Supplementary Fig. 2), and we again observed common molecular systems between SFG and ITG with high gene overlap.

Fig. 3: Identification of molecular systems in SFG and ITG based on protein abundance.
figure 3

a, Schematic representation of how proteins were clustered into co-abundant modules. Modules were inferred solely from proteomics data by applying a consensus clustering algorithm (SpeakEasy2 (ref. 26)) on the corresponding protein–protein correlation matrix, with no reference to ontology. b, SFG proteins arranged by their assigned module. Protein pairs within the same module (bounded by dashed squares) display a higher correlation than protein pairs belonging to separate modules. A short functional label was assigned to each module (‘transcription’, ‘chromatin’, etc.) based on its more highly enriched GO terms. The P value of the most enriched GO term of each module is indicated by the corresponding mauve bar (if −log10(P) > 40, bars are truncated with −log10(P) values stated). Modules were separately generated for SFG and ITG, with the module of interest in terms of brain connectivity highlighted in dashed red lines and red text. c, ITG counterpart of b. d, Most modules in one region have one or more highly overlapping modules in the other region (number of overlapping proteins shown in white text when −log10(P) > 14), enriched for similar GO terms. ECM, extracellular matrix.

Dendritic spine morphometry

Dendritic spines are actin-rich protrusions along neuronal dendrites that form the majority of excitatory synapses in the brain. Spines can exhibit remarkable variability in size, shape and density, and synapse activity is inseparably linked to spine morphology30. Spines can be divided into morphological subclasses based on their three-dimensional (3D) structure as thin, mushroom, stubby or filopodia31,32. To measure spine density and morphology from SFG and ITG, we impregnated postmortem tissue slices with Golgi stain and imaged at ×60 using a widefield microscope with a high-numerical-aperture condenser (Fig. 4a). We then reconstructed the Z stacks in 3D using Neurolucida 360 (Fig. 4b,c). We sampled between 8 and 12 pyramidal neurons from either cortical layer II or III per individual for analysis and estimated average spine density, backbone length, head diameter and volume across the reconstructed dendrites (Fig. 4d,e and Supplementary Fig. 3). We also estimated these spine attributes for each morphologic subclass—thin, mushroom, stubby and filopodia. Next, we associated each spine attribute against each measured protein (Supplementary Table 1) and applied geneset enrichment analysis (GSEA)33. Spine density and morphological attributes were enriched for relevant neuron structures, such as synapses and the actin cytoskeleton, as well as functions, including synaptic signaling and neurotransmitter release30,34 (Fig. 4f). Notably, we found the level of enrichment varies across dendritic spine subclasses. Comparing the spine density and morphology between SFG and ITG (Fig. 4g) revealed differences in overall spine density (P = 0.0310), filopodia density (P = 0.0038) and mushroom spine head diameter (P = 0.0060).

Fig. 4: Measurement of dendritic spines with protein-based functional characterization.
figure 4

a, Representative ×60 bright-field image of a Golgi-stained dendrite from ITG of an exemplar participant. Scale bar = 10 µm. b, Digital 3D reconstruction of the dendritic segment performed on the bright-field image. c, Digital 3D reconstruction used for estimating spine density and morphometric attributes, including head diameter, length and volume, and assigning subclasses. Blue indicates thin spines, green indicates mushroom spines, red indicates stubby spines and yellow indicates filopodia. d, Representative zoomed-in bright-field image of a single Golgi-impregnated thin spine in the xy plane (red box from c). e, Left to right, 3D digital reconstruction of the dendrite (gray) and spine (green) in the xy plane, clockwise rotation in xyz dimensions and further rotation in xyz. f, Each spine attribute was associated against all proteins measured from the same region with enriched GO terms indicated (SFG in green and ITG in blue). g, T values of contrasts between SFG and ITG for each spine morphologic attribute are shown, with * indicating nominal differences.

Protein modules covary with connectivity when contextualized with dendritic spine morphology

Fundamental to brain connectivity at the macroscale are molecular subprocesses related to synaptic communication. Accordingly, we focused our analysis on protein modules most enriched for neurons and various synaptic communication gene ontology (GO) terms (Supplementary Fig. 4 and see SFG pMod6 and ITG pMod8 in Supplementary Table 2). We represented the synaptic module of each brain region by the average protein abundance of its members. To avoid association introduced by spurious correlation, we account for confounding factors including age at death, age of scan, sex, years of education, scanner, postmortem interval, side of the brain the molecular data were acquired and motion (Supplementary Fig. 5) using the extra sum of squares in testing if the synaptic modules of SFG and ITG (their main effects and interaction in aggregate) are associated with an fMRI estimate of their connectivity (Fig. 1). This initial test did not detect an association between the synaptic modules and SFG–ITG connectivity (P = 0.6839), and this observation holds true for other modules (Supplementary Fig. 6) that were generated at various module resolutions and by different algorithms (whether SpeakEasy (SE) or the widely used weighted gene co-expression network analysis (WGCNA)35). Based on this observation, we posited that proteins reside at a biophysical scale that might be too far removed from region-level functional connectivity. Considering dendritic spines are tightly coordinated with synaptic function and even subtle alterations in spine morphology can induce marked effects on neuronal circuits30,36, we hypothesized that morphologic attributes of dendritic spines could provide the cellular context to bridge the difference in biophysical scales between proteins and region-level connectivity. To test this hypothesis, we estimated the dendritic spine component of the synaptic modules by fitting the average protein abundance of their members with dendritic spine attributes (Fig. 5a–c) and repeated the analysis. Using the dendritic spine component of the synaptic modules resulted in an association with SFG–ITG connectivity (P = 0.0174; Fig. 5d). We next sought to test whether the association was specific to the SFG–ITG connection. Such a test would lend credence and determine the extent to which connectivity across the brain can be explained by dendritic spines and proteins gathered from SFG and ITG. Accordingly, we associated the dendritic spine component of SFG and ITG synaptic modules with the connectivity of all other region pairs. We found only 3% of other connections have higher association strength (Fig. 5e), and these connections generally emanate from either SFG and its nearby areas in the frontal cortex or from ITG. This result likely reflects how spatially proximal areas tend to have similar functional and molecular architectures37,38. Moreover, the interaction between the synaptic modules of SFG and ITG explains additional variance beyond their main effects (P = 0.0140), with only 1.5% of other connections showing stronger interaction effects. This result suggests that not only is the synaptic module of each region individually linked to SFG–ITG connectivity but also the collective molecular state of these regions has a role. As an exploratory analysis, we further tested the other modules and found the dendritic spine component of modules predominantly enriched for RNA processing, mitoribosome and synaptic vesicles are also associated with functional connectivity (Supplementary Fig. 7).

Fig. 5: Associations between protein modules and brain connectivity.
figure 5

a, Protein abundance of SFG synaptic module plotted against its dendritic spine fit. Each dot corresponds to an individual and the red dashed line corresponds to the linear fit between protein abundance and the dendritic spine fit values of the individuals. b, ITG counterpart of a. c, Partial contribution of each attribute toward the dendritic spine fit. d, Functional connectivity (with confounds regressed out) fitted by the dendritic spine component of SFG and ITG synaptic protein modules. Each red dot corresponds to the functional connectivity value of an individual with the red line visualizing how far it is from the fitted surface. The large curvature indicates a strong interaction effect of the modules on functional connectivity. e, SFG–ITG connection and other connections with stronger association strength between functional connectivity and the dendritic spine component of synaptic protein modules displayed. Different color dots correspond to different brain regions. In total, 3% of other connections showed higher association strength. Connections emanating from SFG or ITG are displayed in red. Other connections are displayed in yellow. f, Structural covariation counterpart of d. g, Structural covariation counterpart of e. In total, 0.4% of other connections showed higher association strength between structural covariation and the dendritic spine component of synaptic protein modules.

Replication of module-level results with structural covariation and gene expression

Due to the interplay between functional connectivity and structural morphologies39,40,41,42, we repeated the above-mentioned analysis with an MRI estimate of structural covariation between SFG and ITG as the outcome. Similar to the functional connectivity results, we did not find an association between SFG–ITG structural covariation and the synaptic protein modules (P = 0.8815) unless we honed into the dendritic spine component of the synaptic protein modules (P = 0.0034; Fig. 5f). We also observed regional specificity, with only 0.4% of other brain region pairs showing stronger association (Fig. 5g). Furthermore, the interaction effect is significant (P = 0.0203), with only 3% of other brain region pairs showing stronger effects. The significant associations with the same synaptic protein modules suggest that functional connectivity and structural covariation are likely to share molecular processes. As an exploratory analysis, we tested the dendritic spine component of other modules and found no association with structural covariation (Supplementary Fig. 7).

We further repeated the analysis with synaptic expression modules in place of synaptic protein modules as replication. The synaptic modules built from gene expression data (Supplementary Fig. 8 and see SFG eMod5 and ITG eMod4 in Supplementary Table 3) contain genes that highly overlap with members of the synaptic protein modules (odds ratio (OR) = 2.92, P = 2.71 × 10−32 for SFG and OR = 2.53, P = 5.14 × 10−23 for ITG) and are most enriched for neurons among expression modules (Supplementary Fig. 8). Similar to the protein module results, we did not find an association between SFG–ITG connectivity and synaptic expression modules (P = 0.0532) unless we focus on the dendritic spine component of the synaptic expression modules (P = 0.0396). The observed regional specificity with protein abundance modules is also evident with gene expression modules in line with previous transcriptomic studies13,43,44, with only 2% of other connections displaying stronger association (Supplementary Fig. 9a). However, we did not observe an interaction effect (P = 0.4824). When we associated SFG–ITG structural covariation with synaptic expression modules, we again did not find an association (P = 0.9705) unless we honed into their dendritic spine component (P = 0.0234). Regional specificity is lower compared to synaptic protein modules, with 14% of other brain region pairs showing stronger association (Supplementary Fig. 9b), and we did not observe an interaction effect (P = 0.2538). As an exploratory analysis, we further tested the other modules and found the dendritic spine component of modules predominantly enriched for immune systems to be also associated with functional connectivity and structural covariation (Supplementary Fig. 7). Overall, these results show that the expression data replicate the associations between the dendritic spine component of synaptic modules and both functional connectivity and structural covariation, but with weaker interaction effects compared to protein data. More broadly, the results mentioned above indicate the benefits of cellular contextualization by using dendritic spine morphometry to bridge systems at the molecular level to connectivity at the brain region level.

Individual molecules associated with connectivity

The associations presented thus far are at the level of modules, which comprise hundreds of proteins and genes. To prioritize specific proteins among the 7,788 measured in each region, we modeled SFG–ITG connectivity by the main effect of each protein in a given region, the main effect of the synaptic protein module in the other region and the interaction between the given protein and the synaptic protein module. We used this strategy to avoid the limitation on statistical power in testing all pairwise combinations of 7,788 proteins between the two brain regions. We prioritized proteins in each brain region based on the following three effects: the main effect of each protein, its interaction with the synaptic protein module and these two effects combined. We declared significance at an α of 0.05 with false discovery rate (FDR) correction across all proteins, regions and effects tested. For result interpretation, we applied GSEA33 to the summary statistics of each effect.

Analogous to the module-level results, we did not find an association between SFG–ITG connectivity and the abundance of any proteins without cellular contextualization (Supplementary Table 4). Therefore, we fitted each protein with all spine attributes in aggregate to extract the dendritic spine component. For 99% of the proteins, the dendritic spine fits have R2 > 0.1 (Fig. 6a). Grouping the proteins based on the partial R2 of the spine attributes resulted in clusters that well relate to processes supporting synaptic transmission (Fig. 6b,d). Notably, we observed distinct partial R2 profiles across the clusters (Fig. 6c), suggesting that the dominant attributes in the dendritic spine fits vary across proteins. When we repeated the protein-connectivity association analysis with the dendritic spine component, the association strength became stronger (Fig. 7a; P < 10−4 for all three effects in both SFG and ITG), and we detected 87 SFG proteins and 344 ITG proteins (Supplementary Table 4). Overlapping these proteins with protein members of clusters derived from dendritic spine fits (Fig. 6b–d) shows that both spine density and morphologies contribute to connectivity-associated proteins. Applying GSEA to the protein-connectivity association statistics (based on the dendritic spine component) found SFG proteins to be predominantly enriched for GO terms related to synapse, energy metabolism and RNA processing (Supplementary Table 5). Detected SFG proteins include limbic system-associated membrane protein (LSAMP)45, NAPB46, KCNIP3 (ref. 47) and PANX1 (ref. 48), which are involved in cell surface channel biology at synapses. Similarly, the protein-connectivity association statistics for ITG are enriched for GO terms related to synapse, energy metabolism and RNA processing (Supplementary Table 5). Detected ITG proteins include LYNX1, NRN1, Pleckstrin and Sec7 domain containing 2 (PSD2) and RAB11A. Like LSAMP, LYNX1 and NRN1 are glycosylphosphatidylinositol (GPI)-anchored proteins that mediate neuronal receptor activity at excitatory synapses49,50. Like NAPB, PSD2 and RAB11A are critical for intracellular signal transduction and protein transport at synapses51,52. These findings indicate that while the enriched GO terms of SFG and ITG overlap (OR = 48.01, P = 1.77 × 10−46; Fig. 7b), their detected proteins are different (Supplementary Fig. 10).

Fig. 6: Characterization of dendritic spine fits of proteins.
figure 6

a, R2 of the dendritic spine fits of all measured proteins. b, SFG proteins clustered by applying SpeakEasy2 to the partial R2 of the spine attributes. Each functional label was assigned based on the most enriched GO terms of each cluster. The P values correspond to the overlap between functional connectivity-associated proteins and protein members of each cluster. c, Partial R2 of each dendritic spine attribute averaged over proteins of each cluster. Partial R2 profiles of SFG (green) and ITG (blue) were aligned using Hungarian clustering. d, ITG counterpart of b.

Fig. 7: Protein-connectivity associations.
figure 7

a, Histogram on the number of associations found between proteins or their dendritic spine component and functional connectivity, aggregated over regions and effects tested. Red dotted line corresponds to the FDR threshold for the dendritic spine component. b, GO term enrichment based on applying GSEA to protein-connectivity association statistics. Displayed scatterplot corresponds to interaction effects. Ranking of GO terms is similar between SFG and ITG. c, Structural covariation counterpart of a. d, Structural covariation counterpart of b. e, Sankey diagram displaying enriched GO terms common between functional connectivity (FC) and structural covariation (SC) as outcome, and proteins associated with both FC and SC (in terms of main effect, interaction or joint of main and interaction effect) that overlap with members of the GO terms.

To further probe the relationship between brain function and structure, we repeated the analysis with SFG–ITG structural covariation as the outcome. In contrast to functional connectivity, even without cellular contextualization, we found 72 SFG proteins and 171 ITG proteins associated with SFG–ITG structural covariation (Supplementary Table 4). Nevertheless, focusing on the dendritic spine component increased association strength (Fig. 7c; P < 10−4 for all effects tested except for the main effect of ITG proteins) and detected 625 SFG proteins and 1204 ITG proteins (Supplementary Table 4). These proteins overlap with proteins detected without cellular contextualization (OR = 3.61, P = 3.94 × 10−5 for SFG and OR = 4.60, P = 6.75 × 10−20 for ITG). The greater number of proteins associated with structural covariation compared to functional connectivity could be due to how anatomical structures are more stable over time. Hence, the antemortem T1 scans would better reflect the brain states at time of postmortem omic acquisition. The ranking of GO terms is again similar between SFG and ITG (OR = 215.83, P = 9.25 × 10−31; Fig. 7d). Notably, the enriched GO terms are similar to those found with functional connectivity as the outcome (OR = 125.73, P = 4.68 × 10−26 for SFG and OR = 231.90, P = 2.86 × 10−98 for ITG; Supplementary Table 5), with 18 SFG proteins and 32 ITG proteins in common (Fig. 7e). These findings provide validity for our detected protein-connectivity associations and indicate that certain molecular processes can explain variance in both functional connectivity and structural covariation.

Next, we repeated the analysis with gene expression data, and the overall trend was similar to the results obtained with the protein abundance data. We did not find any genes associated with SFG–ITG connectivity without cellular contextualization (Supplementary Table 6). When we focused on the dendritic spine component of gene expression, the association strength became higher (P < 10−4 for all tested effects except for the main effect of SFG genes). We identified 1 SFG gene and 324 ITG genes. The enriched GO terms are predominantly related to synapses, energy metabolism and RNA processing (Supplementary Table 7), similar to those found with protein abundance (OR = 29.67, P = 1.58 × 10−42 for SFG and OR = 6.61, P = 2.56 × 10−20 for ITG). However, the ranking of individual genes by association strength does not align with that of individual proteins (Supplementary Fig. 11). When we used structural covariation as the outcome, we found an association with 130 SFG genes and 18 ITG genes without cellular contextualization. Nevertheless, we detected more genes (1,023 SFG genes and 248 ITG genes) and observed higher association strength (P < 10−4 for all tested effects except for the main effect of SFG genes with P = 0.0178) when we focused on the dendritic spine component of gene expression. The enriched GO terms for gene expression overlap with those found for protein abundance (OR = 66.57, P = 7.29 × 10−12 for SFG and OR = 164.97, P = 2.66 × 10−26 for ITG), but the ranking of individual genes and individual proteins do not align (Supplementary Fig. 12). Functional connectivity and structural covariation share common GO terms (OR = 68.80, P = 1.46 × 10−45 for SFG and OR = 13.18, P = 6.91 × 10−10 for ITG), but with little overlap at the gene level (Supplementary Fig. 13). Overall, our results show consistency between protein abundance and gene expression at the molecular function level, despite the lack of overlap between individual proteins and genes. This is likely due to the low correlation between mRNA level and protein abundance of the same gene53.

Imaging transcriptomic and genetic support for connectivity molecules

To place our connectivity-associated proteins and genes in context with other imaging transcriptomic studies, we compared with genes previously found by combining the AHBA with fMRI data of other cohorts54. The 125 AHBA connectivity-related genes do not place highly among our proteins ranked by their effects on connectivity (Supplementary Fig. 14), possibly due to post-transcriptional regulation, which limits the correlation between mRNA level and protein abundance of the same gene53. Those genes are only enriched among ITG genes ranked by their interaction effects (P = 0.01, based on GSEA). This result suggests that the generic set of genes found by correlating spatial patterns of gene expression in AHBA with functional connectivity estimated from other cohorts has only mild overlap with genes found by modeling covariation between gene expression and SFG–ITG connectivity within the same set of individuals.

Finally, we compared the connectivity-associated proteins and genes found herein with previous imaging genetics studies. The largest related GWAS6 found four independent loci (EPHA3, DPP4, FBXO11 and ZNF326) that are associated with SFG–ITG connectivity (rfMRI connectivity ICA100 edge 849). These loci were derived by spatially mapping genome-wide significant genetic variants to their closest genes. Among the four loci, EPHA3 shows the highest heritability in the edge 849 GWAS, and the dendritic spine component of EPHA3 expression displays nominal association with SFG–ITG connectivity in our data (P = 0.0192 for SFG main effect and P = 0.0140 for ITG main effect). We could not perform the same analysis for protein abundance because tryptic peptides from EPHA3, DPP4, FBXO11 and ZNF326 were not measured in our dataset. Instead, we assessed our found proteins in relation to the edge 849 GWAS through a functional mapping approach. For each brain region, we built protein prediction models to combine the effects of genetic variants that are proximal to each protein55. We then applied these models to the summary statistics of the edge 849 GWAS to test whether the genetic component of the proteins is associated with edge 849. Overall, the genetic component of both SFG and ITG proteins was not associated with edge 849 after FDR correction (Supplementary Table 8), likely due to limited heritability in edge 849. Accordingly, we focused on the genetic effects of our found proteins. For SFG, 3,339 proteins have adequate heritability for model construction. Among these 3,339 proteins, the dendritic spine component of 33 proteins is associated with SFG–ITG connectivity in our data (FDR-corrected P < 0.05). Among those proteins, the genetic component of two proteins, NGB (P = 0.0219) and POLR2J (P = 0.0423), display nominal associations with edge 849. For the ITG region, 3,445 proteins have adequate heritability for model construction. Among these 3,445 proteins, the dendritic spine component of 147 proteins is associated with SFG–ITG connectivity in our data (FDR-corrected P < 0.05). Among those proteins, the genetic component of 11 proteins, including NRN1 (P = 0.0456), displays nominal associations with edge 849. These results provide further support for some of the protein-connectivity associations found in our data. The repeated detection of NRN1 (ref. 56), or neuritin, at the DNA and protein level, suggests that this secreted neuropeptide has a critical role in synaptic biology in supporting brain connectivity of ITG.

Discussion

A central goal of neuroscience is to develop an understanding of the brain that ultimately describes the mechanistic basis of human cognition and behavior. Independent studies are contributing to an exquisite ‘parts list’ of molecules, cell types and brain structures57,58,59, but how these parts cohere into human cognitive function remains obscure. This research challenge is analogous to assembling a jigsaw puzzle without seeing the image on the box. Research struggles with proto-clusters of insightful pieces that are difficult to orient with respect to each other, especially when these pieces reside on different biophysical scales. To help bridge this gap, we gathered and analyzed brain omics, and cellular and neuroimaging data from the same set of 98 individuals. Focusing on the connection between SFG and ITG from which we acquired omic and cellular data, we showed that molecular measurements likely capture many properties beyond brain connectivity. Hence, only when we restricted to their components related to dendritic spine morphology were we able to establish associations with SFG–ITG connectivity. With this approach, we found molecules and molecular modules enriched for synaptic structures and functions, mitochondria-based energy metabolism and RNA processing, in line with known biology. Notably, beyond listing connectivity-related molecules, our study has broader implications in that it demonstrates the feasibility of detecting synchrony among systems of different scales in humans, which constitutes a step toward a more coherent understanding of brain function. Practically, this integration across scales could help address the long-standing challenge of brain drug discovery, in which promising molecules from cellular assays fail to influence cognitive traits.

As our study is distinctive and there are no other human datasets containing such paired brain omics and imaging data, we used consistency among data types that span different biophysical scales as a way to validate any observed synchronization. We showed that the same synaptic protein modules can explain between-participant variability in both functional connectivity and structural covariation and replicated this finding with synaptic expression modules. We also found hundreds of proteins associated with functional connectivity, a subset of which is associated with structural covariation, and again we showed similar trends with gene expression data. Notably, all found associations are enriched for similar cellular structures and molecular processes. We further showed that the genetic component of a dozen found genes is associated with connectivity, suggesting potential causal roles of these genes. As a common consideration with fMRI is the stability of its detected effects60,61, we accordingly performed all data preprocessing and normalization in a highly reproducible and stringent fashion and used the more stable structural neuroimaging data to replicate our findings.

Dendritic spines are tiny protrusions along neuronal dendrites that participate in most excitatory synapses in the brain. The head of the spine structurally supports the postsynaptic density, which collects the essential machinery for postsynaptic neurotransmission36,62. These actin-rich structures are plastic and alter their shape in congruence with synaptic activity and plasticity, including long-term potentiation30. These facets of spine biology were recapitulated in our GSEA of spine-protein associations (Fig. 4f), which linked actin-related and postsynaptic elements to spine morphological attributes, matching known neurobiology. Also, the enrichment for synaptic signaling and neurotransmitter release is consistent with the hypothesis that multiple aspects of spine morphology have functional consequences. The dominant spine attributes contributing to the dendritic spine fits varied across the connectivity-related proteins (Fig. 6c). Notably, spine density, which reflects the capacity for excitatory neurotransmission and hence presumably links to connectivity, is not the only dominant attribute, but spine morphological attributes also dominate for certain proteins. Although our found associations between molecules and connectivity do not describe or depend on mesoscale mechanisms between spines and fMRI signals, one major theory of fMRI generation relates to local field potentials (LFP)63. LFPs are generated from synaptic transmission, which uses dendritic spines, so their involvement in large-scale signal generation is logical and plausible.

Applying GSEA to the protein-connectivity associations found enrichment for synaptic function, mitochondria-based energy metabolism and RNA biology (Figs. 6b–d and 7e). The enriched processes are consistent with normal synaptic function, which requires energy and local RNA translation. For instance, releasing and recycling synaptic vesicles at the axonal presynapse are energy-demanding mechanisms that are mediated by Ca2+ levels. Mitochondria localized to presynaptic terminals are positioned to buffer Ca2+ through the mitochondrial matrix and meet metabolic energy needs by producing adenosine triphosphate64. In contrast, postsynaptic mitochondria are typically positioned in the dendrite rather than localized inside the dendritic spine65. While the function of postsynaptic mitochondria is less well characterized, these organelles are commonly hypothesized to support the energy needs of multiple spines in proximity to the mitochondrial location in the dendrite66. Furthermore, synaptic plasticity partly depends on local protein synthesis, in which mRNA-binding proteins have a large role, especially localization and translation of mRNA following synaptic demand67.

Our analysis revealed several GPI-anchored proteins that are functionally similar and localized to the synapse, including LSAMP, LYNX1 and NRN1 (Supplementary Table 4). LSAMP is expressed widely through the adult brain, and human genetics studies as well as work in rodent models suggest that LSAMP influences emotional behavior, possibly contributing to mood disorders68. LSAMP is predicted to facilitate axon targeting to dendrites69 and is considered a putative therapeutic target for neuropsychiatric illness45. Similarly, LYNX1 localizes to synapses through its GPI anchor and interaction with nicotinic acetylcholine receptors. LYNX1 is hypothesized to maintain a balance between excitatory and inhibitory circuits in the mouse visual cortex of the adult brain70. Like the others, NRN1, or candidate plasticity gene 15 (CPG15), is a synaptic activity-regulated gene whose expression is experience-dependent in the adult brain. NRN1 anchors to the surface of synaptic structures, regulating the generation of dendritic spines and synapse maturation71. Recent findings in older adults indicated a beneficial role for NRN1 in synaptic preservation and maintenance72,73. NRN1 is a neurotrophic factor, and given the small size of its secreted form (~11 kDa), may be an attractive therapeutic target. Another set of functionally overlapping proteins between SFG and ITG converge at intracellular signaling pathways within dendritic spines. NAPB, or N-ethyl-maleimide-sensitive fusion attachment protein β, interacts with α-amino-3-hydroxy-5-methyl-4-isoxazole-propionate (AMPA) receptors and functions as a molecular chaperone to facilitate processing of AMPA receptors at the postsynapse46. PSD2, also known as the exchange factor for ARF6 (EFA6) isoform C, binds phospholipids and is localized to endo-lysosomal compartments at the postsynapse52. Past studies have identified PSD2 as a candidate risk gene for age-related memory disorders74,75. While all these proteins are involved in connectivity-relevant processes, additional studies are required to mechanistically investigate how fluctuations in these protein abundances explain interindividual differences in functional connectivity.

Our findings suggest a number of important factors to consider when trying to draw associations between postmortem microscale molecular data and macroscale neuroimaging data. First, our results indicate that the relationships between molecular abundances and brain connectivity have substantial regional specificity. Henceforth, measuring molecular levels and spine attributes in additional brain regions from the same individuals could help define the extent of this regional specificity in determining if different proteins are indeed involved in different brain regions. Second, the number of associations found might be limited by the time interval between the antemortem scans and postmortem brain collection. From this standpoint, the molecules we detected herein could be considered a minimum set and reflective of molecular abundances whose covariation with these brain regions occurs over a particular time scale. Finally, we face the common problem of assigning molecules from bulk molecular measurements to particular types of cells or synapses. For instance, RNA-related or mitochondrial proteins we discuss in the context of spines may be from nonsynaptic areas of the neuron or nonneuronal cells. Moreover, if the found proteins are indeed from spines or synapses, they may not be globally representative, as recent efforts suggest unique molecular signatures in different synapses76. Therefore, in addition to the need to explore regional specificity, it could be helpful to assay synapse-type specificity to investigate whether particular types of synapses are more relevant to fMRI coupling.

Overall, this study indicates that acquiring data across the major perspectives in human neuroscience from the same set of brains is foundational for understanding how human brain function is supported at multiple biophysical scales. While future research is necessary for fully determining the scope and components of multiscale brain synchrony, we have established a robustly defined initial set of molecules whose effects likely resonate across biophysical scales.

Methods

Cohort

Ninety-eight participants of the ROSMAP16 have all data types needed for this work, namely fMRI, structural MRI, genetic, dendritic spine morphometry, RNA-seq and proteomic data. All enrolled participants agreed to annual clinical evaluation and brain donation at death. Both studies were approved by an Institutional Review Board of Rush University Medical Center. Participants signed informed consent, an Anatomical Gift Act and a repository consent to allow their resources to be shared. The dendritic spine morphometry, RNA-seq and proteomic data were measured from postmortem tissue samples of the SFG and ITG. For each participant, tissue samples were drawn from either the left or the right hemisphere but not both sides. In total, 52% of participants had tissue samples drawn from the left hemisphere. The average age of participants at the time of MRI scan and death is 88 ± 6 years and 91 ± 6 years, respectively, with an average time interval between MRI scan and age at death of 3 ± 2 years. The average postmortem interval is 8.5 ± 4.6 h. In total, 77% of the participants are female, and the participants have 15 ± 3 years of education. Further demographic details on the participants were previously described16.

Neuroimaging

Structural scans were acquired using T1-weighted MRI (1.5T General Electric (GE)—1 mm3 resolution, repetition time (TR) = 6.3 ms, echo time (TE) = 2.8 ms, flip angle = 8°; 3T Siemens—1 mm3 resolution, TR = 2300 ms, TE = 2.98 ms, flip angle = 9° and 3T Philips—1 mm3 resolution, TR = 8.0 ms, TE = 3.7 ms, flip angle = 8°). Further details on scanner protocols can be found at https://www.radc.rush.edu/docs/var/scannerProtocols.htm (ref. 77). Nonuniformity correction, skull-stripping, spatial normalization to the MNI152 template and tissue segmentation were performed using advanced normalization tools. Freesurfer’s ‘recon-all’ function was applied to extract morphometric attributes.

Resting-state fMRI BOLD data were acquired on multiple scanners, which are as follows: 1.5T GE Signa scanner (5 mm3 resolution, TR/TE = 2,000/33 ms, flip angle 85°), 3T Siemens Magnetom TrioTim syngo (3.3 mm3 resolution, TR/TE = 3,000/30 ms, flip angle 80°) and 3T Philips Achieva Quasar TX (3.3 mm3 resolution, TR/TE = 3,000/30 ms, flip angle 80°). Further details on scanner protocols can be found at https://www.radc.rush.edu/docs/var/scannerProtocols.htm (ref. 78). Raw fMRI data were preprocessed using a well-validated pipeline of robust and reproducible fMRI processing tools. First, CuBIDS was used to identify and investigate scans with deviant parameters to reduce scan heterogeneity18. fmriprep (v.20.2.3)19 was then applied to realign and slice-time correct the fMRI volumes, with distortion correction further performed if accompanying fieldmap volumes were available. The resulting fMRI volumes were coregistered to their corresponding T1 volumes using FLIRT. Finally, the fMRI time series underwent confound regression using the eXtensible Connectivity Pipeline (XCP; https://github.com/PennLINC/xcp_d)61 (v.0.0.4 ‘xcp-abcd’) with the 36p+Despike model79. In brief, bandpass filtering between 0.01 and 0.08 Hz was applied, followed by despiking with AFNI80 and regression of 36 parameters from the time series81. Regressors included six motion attributes derived during realignment, a global signal and two physiological parameters, as well as their derivatives, squares of derivatives and quadratic terms. Additionally, mean framewise displacement was calculated for each participant and used to account for any residual effect of head motions in subsequent analysis.

Proteomic

Multiplex TMT-MS was used to generate proteomic data (7,788 proteins). Brain tissue homogenization, protein digestion, TMT peptide labeling, high-pH offline fractionation and liquid chromatography–tandem mass spectrometry (LC–MS/MS), as well as protein quantification, batch correction and data preprocessing, were performed as previously described23,24,25. Briefly, paired human tissue samples from SFG and ITG were obtained from 98 ROSMAP participants. Approximately 100 mg of tissue was homogenized in 8 M urea, 10 mM Tris and 100 mM NaH2PO4 (pH 8.5) buffer with Halt protease and phosphatase inhibitor cocktail (Thermo Fisher Scientific) using a Bullet Blender (Next Advance). Protein concentration was determined by bicinchoninic acid assay (Pierce), and 1D SDS–PAGE gels were run with Coomassie blue staining for quality control before protein digestion. Lysyl endopeptidase (Wako) at 1:100 (wt/wt) was added to each 100 μg sample for digestion overnight. Trypsin (Promega) was added at 1:50 (wt/wt), and digestion occurred for another 16 h. The peptide solutions were acidified and desalted. The samples were then loaded onto the column, washed and eluted. An equal amount of peptide from each sample was aliquoted and pooled as the global internal standard (GIS), which was split and labeled in each TMT batch. The eluates were then dried to completeness using a SpeedVac. Before TMT labeling, cases were randomized by covariates (age, sex, PMI, diagnosis, etc.) into 26 total batches. Peptides from each individual case and the GIS pooled standard or bridging sample (at least one per batch) were labeled using the TMT 11-plex kit (Thermo Fisher Scientific, 90406). For each batch, up to two TMT channels were used to label GIS standards, while the remaining TMT channels were used for samples after randomization. Next, high-pH offline fractionation was performed, and 96 individual equal-volume fractions were collected across the gradient and then pooled by concatenation into 24 fractions and finally dried with a SpeedVac. Fractions were resuspended in an equal volume of loading buffer and analyzed by LC–MS/MS. Peptide eluents were separated on a self-packed C18 (1.9 μm; Dr. Maisch) fused silica column (25 cm × 75 μM internal diameter; New Objective) by a Dionex UltiMate 3000 RSLCnano liquid chromatography system (Thermo Fisher Scientific). Peptides were monitored on an Orbitrap Fusion mass spectrometer (Thermo Fisher Scientific). Full MS scans were collected at a resolution of 120,000 (400–1,400 m/z range, 4 × 105 AGC, 50-ms maximum ion injection time). All HCD MS/MS spectra were acquired at a resolution of 60,000 (1.6 m/z isolation width, 35% collision energy, 5 × 104 AGC target, 50-ms maximum ion time). Dynamic exclusion was set to exclude previously sequenced peaks for 20 s within a 10-ppm isolation window. Peptide spectral matches (PSMs) were filtered to an FDR of less than 1% using the Percolator node. For database searching and protein quantification, raw MS data files were analyzed in Proteome Discover software suite (v2.3; Thermo Fisher Scientific), and MS/MS spectra were searched against the UniProtKB human proteome database. Following spectral alignment, peptides were assembled into proteins and filtered based on the combined probabilities of their constituent peptides to a final FDR of 1%. Reporter ions were quantified from MS2 scans using an integration tolerance of 20 ppm with the most confident centroid setting. Only PSMs with less than 50% isolation interference were used for quantification, and only unique and razor (that is, parsimonious) peptides were considered for quantification. Multiconsensus was performed to achieve parsimony across individual batches. In cases of redundancy, shared peptides were assigned to the protein sequence in adherence with the principles of parsimony. Finally, batch correction and data preprocessing were performed. A total of 10,426 high confidence, master proteins were identified across the 26 TMT batches, but only proteins quantified in >50% of samples were included in subsequent analyses. log2 abundances were normalized as a ratio divided by the central tendency of pooled standards. Batch correction was performed using a Tunable Approach for Median Polish of Ratio (https://github.com/edammer/TAMPOR). Multidimensional scaling plots were used to visualize batch contributions to variation before and after batch correction. Network connectivity was used to remove outliers, defined as samples that were greater than 3 s.d. away from the mean.

Gene expression

RNA was extracted using the Chemagic RNA tissue kit (PerkinElmer, CMG-1212). RNA was concentrated (Zymo, R1080), and RQN (RIN score) was calculated using Fragment Analyzer (Agilent, DNF-471). RNA concentration was determined using the Qubit broad-range RNA assay (Invitrogen, Q10211) according to the manufacturer’s instructions. In total, 500 ng total RNA was used for RNA-seq library generation, and rRNA was depleted with RiboGold (Illumina, 20020599). A Zephyr G3 NGS workstation (PerkinElmer) was used to generate TruSeq stranded sequencing libraries (Illumina, 20020599) with custom unique dual indexes (IDT) according to the manufacturer’s instructions with the following modifications. RNA was fragmented for 4 min at 85 °C. The first strand synthesis was extended to 50 min. Size selection post adapter ligation was modified to select larger fragments. According to the manufacturer’s instructions, library size and concentrations were determined using an NGS fragment assay (Agilent, DNF-473) and Qubit ds DNA assay (Invitrogen). The modified protocol yielded libraries with an average insert size of around 330–370 bp. Libraries were normalized for molarity and sequenced on a NovaSeq 6000 (Illumina) at 80–100 million reads, 2 × 150 bp paired-end. RNA-seq data processing was implemented using the following three parallel pipelines: an RNA-seq quality control (QC) pipeline, a gene/transcripts quantification pipeline and a 3′-UTR quantification pipeline. In the QC pipeline, paired-end RNA-seq data were first aligned by STAR (v2.6) to a human reference genome. The primary assembly of the reference genome fasta file and transcriptome annotation came from Gencode (Release 27 GRCh38). Picard tools were applied to the aligned BAM files to assess the quality of RNA-seq data. In the quantification pipeline, transcript raw counts were calculated by Kallisto (v0.46).

We preprocessed the RNA-seq data following the pipeline previously described82. In brief, we applied TMM normalization (using edgeR calcNormFactors) to the raw counts to estimate the effective library size of each participant. We then applied voom/limma to regress out confounds and convert the counts into log counts per million (log2(CPM)). Technical confounds included batch, study (ROS or MAP), RNA integrity number, postmortem interval, library size, percentages of aligned reads, coding and intergenic bases, ribosomal and untranslated region bases, duplicated reads, 3′ bias, 5′ over 3′ bias and coefficient of variation in coverage. Only genes with mean log2(CPM) > 2 were kept.

Dendritic spine imaging and processing

Golgi-Cox staining, using the FD Rapid Golgi Stain Kit (FD Neurotechnologies, PK401), was used to visualize dendrites and dendritic spines in postmortem samples from SFG and ITG. The flowing adjustments were made from the standard operating procedure in the manual from the kit. All steps were performed at room temperature. Solutions A (potassium dichromate and mercuric chloride) and B (potassium chromate) were combined 48 h before tissue submersion. In total, 2 ml of solution A + B were placed in wells of a 12-well plate (Thermo Fisher Scientific, 08-772-29). Frozen tissue blocks, approximately 10 × 10 × 10 mm, were dropped into the A + B solution in each well. The chromate solution was replaced after 24 h, and the tissue blocks remained in the solution for 21 days. Next, the tissue blocks were transferred to a new 12-well dish containing solution C. Solution C was replaced after 24 h. After a total of 72 h in solution C, each tissue block was sliced into 125 µm sections in solution C using a Leica Vibratome (VT1000 S). Free-floating sections were placed in a six-well dish (Thermo Fisher Scientific, 353046) that contained solution C in each well. Tissue sections were sequentially moved from solution D to solution E and to distilled water, similar to the manufacturer’s instructions. Slices were dehydrated with alcohols (70%, 90% and 100% ethanol) and cleared with xylenes (Thermo Fisher Scientific, X3P). Slices were then placed on glass slides (Thermo Fisher Scientific, 12-550-15) with a single slide spacer (Electron Microscopy Sciences, 70327-20S), sealed with Permount (Thermo Fisher Scientific, SP15-100) and coverslipped with 24 × 50 mm, thickness 0.13–0.17 mm, glass (Carolina Biological, 633153). Slides were dried in the dark for 7 days before microscopy imaging.

Dendrites were imaged by blinded experimenters using bright-field microscopy. Each case had multiple slides with multiple slices available for imaging. From each tissue slice, one to two pyramidal neurons from layers two or three were randomly selected to image for analysis. Between 8 and 12 neurons were sampled per individual. A single dendritic segment was imaged per neuron. While we used random site sampling, dendrite segments were imaged if the following criteria were met: (1) located centrally within the total tissue depth, (2) not obscured by staining debris and/or intersecting neighboring neurons and (3) fully impregnated. Additionally, only dendrites that were over 30 µm in length and had an approximate diameter of 1 µm were imaged. Tissue slices were visualized under 10× magnification to determine tissue quality. Once suitable dendrites were identified, tissue slices were visualized at 60× magnification using Type F immersion oil (Nikon, MXA22168). Z stacks were captured at a step size of 0.1 µm at ×60 using a Nikon Plan Apo ×60/1.40 numerical aperture (NA) oil-immersion objective in combination with a high-NA oil condenser (Nikon, MEL59500) on a Nikon Eclipse Ti2 inverted microscope with a Lumencor SOLA light engine and Hamamatsu ORCA-flash 4.0 digital camera. All images were 1,024 × 1,024 pixels.

Reconstructions of dendrites and dendritic spines were conducted by blinded experimenters using Neurolucida 360 (MBF Biosciences, v2.70.1). Nikon ND2 files were converted to 16-bit TIFF files using ImageJ and then imported to Neurolucida 360. Dendritic segments were traced using a user-guided semi-automated directional kernel algorithm. Initiation and termination points for dendrite reconstruction were determined with the following criteria: (1) ≥5 µm away from the distal tip of the dendrite, (2) consistent diameter, (3) level axis with limited smear in the z plane and (4) ≥20 µm in length. After the dendritic segment was traced, the experimenter verified that the points located on the dendrite were accurate in x, y and z planes and, if not, made manual adjustments to the trace. The diameter of the dendrite at each point was also verified to ensure the accuracy of the trace. Dendritic spines were traced using voxel clustering. The following parameters were used for spine identification: outer range, 7 µm; minimum height, 0.3 µm; detector sensitivity, 90–125% and minimum count, 8 voxels. The morphology of each dendritic spine was examined to verify that the axial smear did not cause misrepresentation. Merge and slice tools in Neurolucida 360 were used to correct inaccuracies. The position of every spine’s backbone point was verified by the blinded experimenter. If the backbone caused a misrepresentation, the dendritic spine was visualized from the z plane, and the experimenter’s re-oriented backbone points in the xy plane. Notably, repositioning in the xz or yz plane was performed while the spine was being viewed from the lateral angle. Morphometric analysis was conducted for each spine, and measurements categorized spines into thin, stubby, mushroom and filopodia classes. The following parameters were used for spine classifications: head-to-neck ratio, 1.1; length-to-head ratio, 2.5; mushroom head size, 0.35 µm and filopodium length, 3.0 µm. Spines with a head-to-neck ratio >1.1 and head diameter >0.35 µm were classified as mushroom. Spines were classified as filopodia or thin if the head-to-neck ratio was <1.1 and either (1) length-to-head ratio was >2.5 or (2) head size was <0.35 µm. Moreover, if the total length of the spine was >3.0 µm, the spine was classified as filopodia, and if the total length was <3.0 µm, then the spine was classified as thin. Dendrite and spine reconstructions were exported to Neurolucida Explorer (MBF Biosciences, v2.70.1), where data were collected for quantitative analysis. The dendritic spine measurements were exported and collected in Microsoft Excel. Spine density was calculated by determining the quantity of spines per 10 µm of dendrite length. Spine length was defined as the curvilinear backbone length from the insertion point to the most distal point of the spine head. Head diameter was defined as the breadth of the spine head at its widest cross-sectional point.

Dendritic spines display an enormous array of morphologies, including varying head diameters and neck lengths, which are inextricably linked with spine function. For example, spine head diameter can reflect the density of protein receptors at the postsynapse, which correlates with synapse strength36. The width and length of the spine neck influence the diffusion of signaling molecules and the degree to which individual synapses affect neuronal activity83. Measurements of spine head diameter and spine length as a ratio form the basis of classification into the common morphologic subclasses. Thin spines have short necks and small heads, whereas mushroom spines have longer necks and larger heads. Stubby spines are hypothesized to be transitional structures with a short neck and wide head. Subtle alterations in spine morphology and related biology can induce marked effects on connectivity patterns of neuronal circuits at microscale and downstream cognitive behavior30.

Our spine morphology and density measurements are consistent with similar studies assessing spine structure characteristics in human and nonhuman primates using confocal and light microscopy84,85,86,87. In a particularly elegant study, ref. 87 assayed fresh human brain tissue obtained from the neurosurgical operation and used iontophoretic microinjection of Lucifer yellow and high-resolution confocal microscopy. Our dendritic spine density and length measurements are highly consistent with details in ref. 87, which reported an average spine density of 1 per μm on layer III pyramidal neuron dendrites in the cingulate cortex of an 80-year-old male. The mean spine density in SFG is 1.3 per μm and 1.2 per μm in ITG (Supplementary Fig. 3c). The details in ref. 87 reported an average spine length of 1.4 μm on layer III dendrites in the cingulate cortex. The mean spine lengths are 1.54 μm in SFG and 1.46 μm in ITG (Supplementary Fig. 3d). However, spine volume measurements in ref. 87 averaged 0.36 μm3, whereas the mean spine volume is 0.26 μm3 in SFG and 0.27 μm3 in ITG (Supplementary Fig. 3e). This disconnect could be attributed to the lack of resolution using Golgi stain to measure the volume of the spine neck. The fluorescence microscopy approach by ref. 87 allows greater signal-to-noise ratios for more precise 3D measurement of the spine neck, which is approximately 100 nm3. However, bright-field microscopy of Golgi stain relies on light absorption, which may not provide sufficient signal-to-noise to accurately measure the volume of the spine neck.

Genetics

Genotyping was performed using the Infinium Global Screening Array, Affymetrix Genome-Wide HumanSNP Array 6.0 and Illumina OmniQuad Express platform88,89. The genotyping data underwent sample-level exclusions, including removing duplicated samples, those with a genotyping success rate below 95%, and samples with discordant gender information. At the probe level, additional filtering was applied based on specific quality-control criteria—a Hardy–Weinberg equilibrium P value threshold (<1 × 10−50), genotype call rate (<0.9) and misshape test (<1 × 10−9). For imputation on the Haplotype Reference Consortium r1.1, we used the Michigan Imputation Server, Minimac3 (v1.0.4) and Eagle (v2.3). Before imputation, the input data were prepared using the HRC-1000G-check-bim_ts.pl script (available at https://www.well.ox.ac.uk/~wrayner/tools/). Imputed genotypes with an information score greater than 0.3 were converted to a Plink binary file format using Plink 1.9 for downstream analysis.

Functional connectivity estimation

To estimate functional connectivity, we first used a functional atlas comprising 100 parcels (Schaefer2018_100Parcels_17Networks_order_FSLMNI152) generated from resting-state fMRI data of >1,000 participants20 to divide the brain into functionally homogenous regions. We then averaged the time series of voxels within each parcel and computed Pearson’s correlation between all pairs of parcels for each participant. To select the brain connection (that is, the parcel pair) most relevant to our molecular data, we examined the spatial overlap between each functional parcel and the brain areas at which we drew tissue samples. Parcels 25 (left hemisphere) and 74 (right hemisphere) have the greatest overlap with our SFG tissue samples, and parcels 14 (left hemisphere) and 65 (right hemisphere) have the greatest overlap with our ITG tissue samples. For associating with the molecular data, we only used the SFG–ITG connectivity estimate that matches the side of the brain from which we drew tissue samples for each participant, that is, the connectivity estimate between parcels 25 and 14 if we drew brain tissue samples from the left hemisphere for a given participant, and vice versa. We used the connectivity estimates of other parcel pairs for assessing regional specificity (‘Module-level association analysis’).

Structural covariation estimation

To estimate structural covariation, we used the nine regional morphological attributes (number of vertices, surface area, gray matter volume, average cortical thickness, standard deviation of cortical thickness, mean curvature, Gaussian curvature, fold index and curvature index) from Freesurfer cortical surface reconstruction with regions defined anatomically based on the DKT atlas22. Let Zi be the n × 9 morphological attribute matrix of region i with n being the number of participants. The standard approach90 estimates the structural covariation between regions i and j by correlating row k of Zi with row k of Zj for k = 1 to n. The assumption is that morphological attributes have a one-to-one correspondence across regions, that is, the fold index in region i covaries with the fold index in region j, the curvature index in region i covaries with the curvature index in region j, etc. However, the fold index of region i might also covary with, for example, the curvature index of region j. Therefore, to mitigate the one-to-one attribute correspondence assumption, we applied CCA to find combinations of morphological attributes that maximally covary between region pairs (Supplementary Fig. 1). Specifically, we applied CCA to Zi and Zj to find 9 × 9 projection matrices Ai and Aj that maximize their correlation, that is, max tr(AiTZiTZjAj) subject to AiTZiTZiAi = I and AjTZjTZjAj = I. Adopting a recent approach where cofluctuation in fMRI intensity at single time point resolution is used to estimate connectivity between two regions91, we used Sij = (ZiAi) ◦ (ZjAj) as estimates of structural covariation between regions i and j, where the symbol ◦ denotes element-wise product and each CCA dimension (that is, each column of Sij) captures a different mode of shape covariation. For associating with the molecular data, we split the participants based on the side of the brain from which we drew tissue samples and separately applied CCA to those two sets of participants. We focused our analysis on the structural covariation between ‘superior frontal’ and ‘inferior temporal’ in the DKT atlas, which best overlaps with where we sampled brain tissues and used other anatomical region pairs for assessing regional specificity (‘Module-level association analysis’).

Molecular module estimation

To determine the molecular systems of interacting proteins in each brain region, we clustered the proteins into co-abundance modules using a fully data-driven approach called SpeakEasy92, which takes in a similarity matrix and joins nodes into modules based on both local connectivity and global network structure. To estimate the degree of interaction between each pair of proteins, we correlated their abundance levels across participants. We repeated this estimation for all protein pairs to generate a protein correlation matrix for each region and applied SpeakEasy, which extracted ten modules for SFG and ten modules for ITG (Supplementary Table 2). Considering our goal is to model brain connectivity, which presumably arises from subprocesses related to synaptic communication, we applied Fisher’s exact test to annotate each module by GO terms and focused our analysis on modules (pMod) whose protein members are most enriched for synaptic communication terms. Using this criterion, we selected pMod6 for SFG and pMod8 for ITG, which happen to have the greatest protein overlap among all protein module pairs (OR = 13.16, P = 3.99 × 10−186). We also annotated the modules by brain cell types by applying GSEA33 with cell type-specific mean expression93 as scores and proteins within our modules as genesets. The selected protein modules also happen to be the most enriched for neurons among modules (Supplementary Table 2). To represent a protein module, we used the average abundance levels of its protein members. We applied the same procedures to the gene expression data, which extracted nine modules for SFG and nine modules for ITG (Supplementary Table 3). Based on synaptic GO term enrichment, we selected and focused our analysis on eMod5 for SFG and eMod4 for ITG, which also happen to have the greatest gene overlap among all expression module pairs (OR = 57.25, P = 4.94 × 10−323) and most enriched for neurons (Supplementary Table 3).

Proteomic characterization of dendritic spine attributes

To provide a simple characterization/quality check of the dendritic spine data, we associated each dendritic spine attribute with the protein data and examined its GO term enrichment to check if the dendritic spine attributes are enriched for the expected neuronal structures and functions. Specifically, we applied regression with each measured protein from a given brain region as the response and each spine attribute from the same brain region as a covariate of interest, while accounting for age of death, sex, PMI and year of education as confounding factors. We then applied GSEA to the set of t values of each dendritic spine attribute and declared significance at an α of 0.05 with Bonferroni correction for the number of GO terms (P < 5 × 10−6).

Module-level association analysis

Let Yij be a n × 1 vector corresponding to neuroimaging estimates of a brain attribute (functional connectivity or structural covariation between regions i and j) of n participants, Xi and Xj be n × 1 vectors corresponding to the average molecular values (protein abundance or gene expression level) of members within the synaptic modules of regions i and j and C be a n × d confound matrix. We modeled Yij as follows: Yij = α0 +  + Xiβi + Xjβj + Xi ◦ Xjβij + ϵ, with age at death, age of scan, sex, years of education, scanner, postmortem interval and side of brain molecular data as acquired and mean framewise displacement as confounds. Our primary question is whether Xiβi + Xjβj + Xi ◦ Xjβij can significantly explain variance in Yij across participants for i = SFG and j = ITG. Considering that the biophysical scale at which proteins reside might be far from region-level attributes, we tested whether cellular contextualization of these modules can better reveal their association with our brain attributes of interest. Accordingly, we extracted the dendritic spine component of the synaptic modules by using regression to fit the average molecular values of their members as responses with morphometric attributes of dendritic spines as covariates and used these components (that is, the fitted values) as Xi and Xj. To assess regional specificity, we fitted the same model with Yij set to brain attribute values of other region pairs (that is, non-SFG–ITG connections) and computed the percentage of other region pairs with lower P values. For associating with structural covariation, which has nine CCA dimensions, we separately fitted the model to each CCA dimension and applied the aggregated Cauchy association test (ACAT) to combine P values across the nine dimensions94. We opted to use ACAT over other P value combination tests because it provably provides optimal power in sparse settings94,95, which accentuates the CCA dimensions that have a stronger association with the synaptic modules, in contrast to, for example, Fisher’s method that weights all dimensions equally. In addition to the synaptic modules, we also tested the association between other modules and connectivity as an exploratory analysis. To aid interpretation as well as reduce the number of statistical tests, we used Hungarian clustering to first pair up SFG and ITG modules that represent similar systems based on ORs of their gene overlap. We then associated each module pair against functional connectivity and structural covariation.

Molecule-level association analysis

To find the specific molecules (proteins and genes) associated with our brain attributes of interest, we modeled Yij as follows: Yij = α0 +  + xmiβmi + Xjβj + xmi ◦ Xjβmij + ϵ, where xmi is a n × 1 vector of molecular values (abundance level of protein m or expression level of gene m in region i) of n participants. The other variables are as defined in the module-level association model. We used this model to avoid testing all pairwise combinations of molecules between SFG and ITG (7,7882 protein pairs or ~12,0002 gene pairs), which limits statistical power. We tested for molecules involved in SFG by setting i to SFG and j to ITG, and vice versa. We focused on testing the following three terms: xmiβmi, xmi ◦ Xjβmij, and xmiβmi + xmi ◦ Xjβmij, that is, effects that do not exclusively involve the synaptic modules. We declared significance at 0.05 with FDR correction over the combinations of molecules, regions and effects of interests. We considered a molecule as significant if any of the three effects is significant with FDR correction. Considering a molecule could be involved in multiple molecular processes, we again applied cellular contextualization by substituting the molecular values with their dendritic spine components and repeated the analysis. To test if cellular contextualization improves association strength, for each combination of effect and region, for example, the main effect of SFG, we compared the −log10(P) of an effect with versus without cellular contextualization using a permutation test. Let d be a q × 1 vector with elements corresponding to the differences in −log10(P) between with and without cellular contextualization for q molecules. For each permutation, we randomly selected q/2 molecules, flipped the sign of their −log10(P) differences and computed the average difference over molecules. We repeated this procedure 104 times and computed the proportion of times the nonpermuted average −log10(P) difference is smaller than the permuted average log10(P) differences, that is, a permutation-based P value. We further performed GSEA33 on GO terms and a connectivity-related geneset54 for result interpretation. We applied GSEA to the t values of all measured molecules (separately for proteins and genes) for each of the three effects on functional connectivity (separately for SFG and ITG) and declared significance at 0.05 with Bonferroni correction (P < 5 × 10−6). Using t values of all measured molecules control for background genes. We considered a GO term as significant if any of the three effects is significant. For xmiβmi + xmi ◦ Xjβmij, we only have an F value for each molecule. Therefore, we approximated a t value for each molecule by taking the square root of its F value and multiplying it by the sign of the mean t value of xmiβmi and xmi ◦ Xjβmij, that is, taking the sign of the more dominant effect. For structural covariation, we only have P values derived by aggregating the CCA dimensions. Therefore, for each molecule, we approximated its effect direction by averaging its t values on a given effect across the CCA dimensions and taking the sign of this average. We used signed −log10(P) as input to GSEA.

MetaXcan analysis

To compare our connectivity-associated proteins against previous imaging genetics studies, we used a functional mapping approach, called MetaXcan55, to convert SNP-level GWAS summary statistics to protein level. For each protein m, We first built a model to extract the genetic component of its protein abundance: Pm = ΣsϵSm wmsSs + εm, where Pm is a n × 1 vector containing the abundance levels of protein m; Ss is a n × 1 vector containing the dosage of SNP s; wms is the sth element of a lm × 1 model weight vector, wm, to be estimated; and Sm is the set of lm SNPs within ±1 Mb from the transcription start site of gene m that produces protein m. Following MetaXcan, we estimated wms using elastic net regression by applying GLMNET with its default settings, that is, tenfold cross-validation to set the sparsity parameter. We regressed out confounds including age of death, sex, postmortem interval, top three ancestry principal components and top ten protein principal components from the protein abundance measurements before fitting the models. Given wms, protein level GWAS z score, zm, of protein m can be estimated by the following: zm = ΣsϵSm wmsσs/σm × zs, where zs is the GWAS z score of SNP s, and σm and σs are the variance of protein m, and SNP s, respectively. We built models for SFG and ITG proteins separately and packaged the models into SQLite databases (https://doi.org/10.7303/syn52087434) so users can directly use them with the public MetaXcan software (https://github.com/hakyimlab/MetaXcan). We applied MetaXcan with our protein models to a GWAS on functional connectivity between SFG and ITG (rfMRI connectivity ICA100 edge 849 (ref. 6)) and examined which proteins found associated with SFG–ITG connectivity in our data show significant genetic effects based on zm.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.