Introduction

Diabetes is a chronic condition characterized by insufficient insulin (type 1 diabetes) or impaired insulin function (type 2 diabetes), resulting in elevated blood glucose levels [1]. Type 2 diabetes (T2D) is the most prevalent type of diabetes and poses a significant threat to human health. Approximately 463 million individuals worldwide were estimated to have diabetes as of 2019, according to the World Health Organization (WHO) [2]. The prevalence of diabetes has risen worldwide in recent years, largely because of increasingly sedentary lifestyles. T2D can lead to various complications, including cardiovascular diseases, neuropathy, diabetic nephropathy, and non-alcoholic fatty liver disease (NAFLD), and has garnered significant attention worldwide [3, 4].

NAFLD is a chronic metabolic disorder characterized by excessive hepatic fat accumulation unrelated to alcohol consumption. It has a global prevalence of 25.2% [5]. The exact cause of NAFLD is not fully understood, but it is believed to be strongly associated with T2D [6]. A study has revealed that over 22.51% of individuals diagnosed with T2D also exhibit NAFLD [5]. Furthermore, the relationship between T2D and NAFLD is bidirectional [7]. Individuals with T2D often present with metabolic dysfunction, which contributes to the pathogenesis of NAFLD [8]. In addition, NAFLD can adversely affect individuals with T2D, increasing their susceptibility to complications [9]. However, the association between NAFLD and T2D is not yet fully understood, and further research is needed to clarify it.

Liver enzyme and function tests, imaging examinations, and liver biopsies are commonly used diagnostic methods for NAFLD [10]. However, each of these approaches has inherent limitations. For instance, liver enzyme and function tests cannot directly diagnose NAFLD [6]. Similarly, imaging methods fail to provide intricate pathological information and may exhibit reduced sensitivity for detecting mild cases of NAFLD [11]. Moreover, a liver biopsy is invasive and can be subject to sampling errors [12, 13]. Therefore, further studies are needed to develop innovative diagnostic methods.

This study used bioinformatics and systems biology techniques to examine molecular processes shared by T2D and NAFLD that might serve as therapeutic targets. The GSE185011 and GSE89632 datasets were retrieved from the Gene Expression Omnibus (GEO) database to determine the biological relationships between T2D and NAFLD. Next, differentially expressed genes (DEGs) common to the two diseases were identified using GEO2R. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses of the common DEGs were performed using bioinformatics tools. A protein-protein interaction (PPI) network was used to search for hub genes. Finally, a type 2 diabetic mouse model with NAFLD (genetically diabetic leptin receptor-mutated (db/db) mice) was used to validate the predictions, and three genes were identified. The sequential workflow of this study is illustrated in Fig. 1.

Fig. 1: Schematic diagram of the overall workflow of this study.

The study retrieved datasets GSE185011 and GSE89632 from the GEO database to explore the relationship between type 2 diabetes (T2D) and non-alcoholic fatty liver disease (NAFLD). Common differentially expressed genes (DEGs) were identified using GEO2R. Gene Ontology (GO) and KEGG enrichment analyses were performed on these DEGs. A protein–protein interaction (PPI) network identified hub genes. Validation was done using a type 2 diabetic mouse model with NAFLD (db/db mice), identifying three key genes.

Materials and methods

Data extraction

We obtained data from the GEO database (https://www.ncbi.nlm.nih.gov/geo/) [14] to identify common genetic links between T2D and NAFLD. The diabetic dataset GSE185011 (GPL24676 platform) included peripheral blood samples from 5 healthy individuals and 5 patients with T2D. The NAFLD dataset GSE89632 (GPL14951 platform) included 24 healthy individuals and 19 patients with NAFLD (Table 1).

Table 1 A Summary of GEO-derived Datasets.

Identification of DEGs

The DEGs in peripheral blood samples from patients and the control group were identified using GEO2R [14], with adjusted P < 0.05 as the significance threshold. Genes with log2FC > 1 were considered upregulated, and genes with log2FC < -1 were considered downregulated. Venn software was used to identify the DEGs shared by the two datasets.
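The thresholding and overlap steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gene names and values are hypothetical toy data, the function names are invented for this example, and the study itself used GEO2R and Venn software rather than this code.

```python
from typing import Dict, Set, Tuple


def classify_deg(log2fc: float, adj_p: float,
                 fc_cut: float = 1.0, p_cut: float = 0.05) -> str:
    """Label a gene 'up', 'down', or 'ns' (not significant) using the stated cutoffs."""
    if adj_p < p_cut and log2fc > fc_cut:
        return "up"
    if adj_p < p_cut and log2fc < -fc_cut:
        return "down"
    return "ns"


def split_degs(table: Dict[str, Tuple[float, float]]) -> Tuple[Set[str], Set[str]]:
    """Return (upregulated, downregulated) gene sets from {gene: (log2FC, adjusted P)}."""
    up = {g for g, (fc, p) in table.items() if classify_deg(fc, p) == "up"}
    down = {g for g, (fc, p) in table.items() if classify_deg(fc, p) == "down"}
    return up, down


# Hypothetical toy tables; NOT values from GSE185011 or GSE89632.
t2d = {"GENE_A": (1.8, 0.01), "GENE_B": (-1.4, 0.03), "GENE_C": (2.2, 0.20)}
nafld = {"GENE_A": (1.2, 0.001), "GENE_B": (-2.0, 0.04), "GENE_C": (1.5, 0.01)}

t2d_up, t2d_down = split_degs(t2d)
nafld_up, nafld_down = split_degs(nafld)

# Common DEGs are those regulated in the same direction in both datasets.
common_up = t2d_up & nafld_up
common_down = t2d_down & nafld_down
```

In this toy example GENE_C passes the fold-change cutoff in both tables but fails the adjusted P cutoff in one, so it is excluded from the shared set.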

GO and KEGG pathway enrichment

GO and KEGG analyses were performed using the DAVID database [15] (https://david.ncifcrf.gov/) to investigate the possible functions of these DEGs, including biological process (BP), molecular function (MF), and cellular component (CC), as well as the pathways enriched in them. Statistical significance was set at adjusted P < 0.05.
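The idea behind an over-representation test can be illustrated with a plain one-sided hypergeometric calculation. This sketch is an assumption made for illustration: DAVID reports its own modified Fisher exact statistic, so the function below (a hypothetical name) does not reproduce DAVID's numbers, and the counts in the usage line are invented.

```python
from math import comb


def hypergeom_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """One-sided P(X >= k) for an over-representation test.

    N: genes in the background, K: background genes annotated to the term,
    n: genes in the query list, k: query genes annotated to the term.
    """
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


# Hypothetical counts: 5 of 53 query genes hit a term with 40 members
# in a background of 20000 genes.
p_value = hypergeom_enrichment_p(k=5, n=53, K=40, N=20000)
```

The resulting P values would still need multiple-testing adjustment across all tested terms before applying the adjusted P < 0.05 cutoff.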

PPI network construction and hub gene extraction

The STRING database (https://www.STRING-db.org) [16] was used to build a PPI network to examine the relationships among the DEGs shared by the two datasets. Cytoscape software (v3.9.0) (https://www.cytoscape.org/) was used to visualize and analyze the PPI network. Additionally, the CytoHubba plugin [17] in Cytoscape was used to select the ten hub genes.
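Hub selection amounts to ranking the nodes of the PPI network by a topological score and keeping the top entries. The text does not state which CytoHubba scoring method was used, so the sketch below assumes simple degree (number of interaction partners) purely for illustration; the node names and the function name are hypothetical.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def degree_ranking(edges: Iterable[Tuple[str, str]], top_n: int = 10) -> List[Tuple[str, int]]:
    """Rank nodes of an undirected network by degree (ties broken alphabetically)."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors[a].add(b)
        neighbors[b].add(a)
    ranked = sorted(neighbors, key=lambda g: (-len(neighbors[g]), g))
    return [(g, len(neighbors[g])) for g in ranked[:top_n]]


# Hypothetical toy network; NOT the 27-node network from the study.
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P4", "P5")]
top_hubs = degree_ranking(toy_edges, top_n=3)
```

Other scoring methods can give a different ordering on the same network, so the chosen method should be reported alongside the hub list.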

Animals

The Institutional Animal Care and Use Committee of the Huaqiao University School of Medicine approved this animal study. Leptin receptor-mutated (db/db) mice and their lean heterozygous littermates (db/m) were a gift from Professor Qian Chen (Xiamen University, China). Male db/db and db/m mice with a C57BL/6J genetic background were used in this study. The mice were divided into two groups (db/db and db/m) with at least three mice in each group. Neonatal db/db and db/m mice were reared under standard feeding conditions until 8 months of age and then photographed and analyzed. Their body and liver weights were recorded, blood glucose levels were measured, and complete liver specimens were excised and photographed for subsequent analyses.

Hematoxylin and eosin staining

Mouse livers were fixed in polyformaldehyde at 4 °C for 2 days and then embedded in paraffin. Paraffin-embedded tissues were sectioned at 5 μm thickness and subsequently stained with hematoxylin and eosin (H&E). Images were captured using a Leica camera.

Quantitative real-time polymerase chain reaction (qRT-PCR)

Total RNA was isolated from the liver using TRNzol Universal Reagent. For cDNA synthesis, 2 μg of the isolated total RNA was used as the template. RNA was reverse transcribed to cDNA using the ReverTra Ace qPCR RT Master Mix kit (Toyobo, Osaka, Japan). qRT-PCR analysis was conducted using a Stratagene™ Mx3005P qPCR Instrument to determine the expression levels of the target genes. The data acquired for each target gene were normalized using GAPDH as an internal reference gene. All primers were designed using the National Center for Biotechnology Information (NCBI) Primer Design Tool. Primers were synthesized by Sangon Biotech Company (Shanghai, China), and their details are presented in Table 2.
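Normalization to a reference gene is commonly computed with the 2^-ΔΔCt approach. The text does not name the quantification method, so the sketch below is an assumption offered for illustration only; the function name and the Ct values are hypothetical.

```python
from statistics import mean
from typing import Sequence


def relative_expression(target_ct: float, ref_ct: float,
                        control_target_ct: Sequence[float],
                        control_ref_ct: Sequence[float]) -> float:
    """Fold change of a target gene versus the control group (2^-ddCt).

    target_ct / ref_ct: Ct of the target and reference gene (e.g. GAPDH) in one sample.
    control_*_ct: paired Ct values from the control group (e.g. db/m mice).
    """
    delta_sample = target_ct - ref_ct
    delta_control = mean(t - r for t, r in zip(control_target_ct, control_ref_ct))
    return 2 ** -(delta_sample - delta_control)


# Hypothetical Ct values: the sample needs one extra cycle relative to controls,
# which corresponds to half the expression.
fold = relative_expression(26.0, 18.0, [25.0, 25.0], [18.0, 18.0])
```

A fold value below 1 indicates lower expression than in the control group, and a value above 1 indicates higher expression.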

Table 2 Primer sequences for qRT-PCR.

Statistical analysis

A moderated t-test was used to identify the DEGs, and statistical significance was set at an adjusted P < 0.05 using the Benjamini & Hochberg method [18]. The qRT-PCR data were analyzed using the GraphPad Prism software (version 8.0; La Jolla, CA, USA). All qRT-PCR data are expressed as the mean ± standard deviation (SD) from a minimum of three independent experiments. Statistical analyses for animal experiments were conducted using the Student’s t-test and statistical significance was set at P < 0.05.
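The Benjamini and Hochberg adjustment can be written out directly: sort the P values, scale each by the number of tests divided by its rank, and enforce monotonicity from the largest P value downward. The function below is a minimal sketch with a hypothetical name and toy inputs; GEO2R performs this step internally, so this code was not part of the study's pipeline.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest, keeping a running minimum
    # so that adjusted values never increase as raw P decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P values for four genes.
adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
significant = [a < 0.05 for a in adj]
```

Here only the first gene remains below 0.05 after adjustment, even though three raw P values were below that cutoff.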

Results

Common DEGs identification

First, 3856 DEGs were discovered in GSE89632; out of these, 2243 were upregulated and 1613 were downregulated. In GSE185011, 656 DEGs were identified; 206 were upregulated and 450 were downregulated. We identified 53 common DEGs (18 upregulated and 35 downregulated) between the T2D and NAFLD datasets. Figure 2 shows a cross-analysis comparison of the two datasets, and Table 3 lists all DEGs.

Fig. 2: Identification of differentially expressed genes (DEGs) in Type 2 diabetes (T2D) (GSE185011) and non-alcoholic fatty liver disease (NAFLD) (GSE89632) using GEO2R.

A Volcano plot of the DEGs in GSE89632. The negative log10-transformed adjusted P values (Y axis) are plotted against the average log2 fold changes (X axis) in gene expression. Identified DEGs are shown in red (log2FC > 1) and blue (log2FC < -1) with adjusted P < 0.05. B Heatmap of the DEGs in GSE89632. C Upregulated genes shared between GSE89632 and GSE185011. D Volcano plot of the DEGs in GSE185011. The negative log10-transformed adjusted P values (Y axis) are plotted against the average log2 fold changes (X axis) in gene expression. Identified DEGs are shown in red (log2FC > 1) and blue (log2FC < -1) with adjusted P < 0.05. E Heatmap of the DEGs in GSE185011. F Downregulated genes shared between GSE89632 and GSE185011.

Table 3 Differentially expressed genes (DEGs) identification.

GO and KEGG analysis

Figure 3 shows that the common DEGs were strongly linked to regulation of transcription in the BP category; cytosol, nuclear matrix, and endosome in the CC category; and RNA binding and ion channel binding in the MF category. Based on the KEGG pathway analysis, the most affected pathway was the ferroptosis signaling pathway.

Fig. 3: Functional analyses of common differentially expressed genes (DEGs).

Red, blue, green, and dark blue indicate biological process (BP), molecular function (MF), cellular component (CC), and KEGG pathway analyses, respectively.

PPI establishment and hub gene identification

Figure 4A shows the shared DEG-based PPI network with 27 nodes and 72 edges. The top ten DEGs identified were CD44, CASP3, FYN, KLF4, HNRNPM, HNRNPU, FUBP1, RUNX1, NOTCH3, and ANXA2 (Fig. 4B).

Fig. 4: Protein-protein interaction (PPI) network establishment and hub gene identification.

A PPI network of common genes. Genes in red and blue boxes represent upregulated genes and downregulated genes, respectively. B Ten most significant genes involved in the PPI network.

NAFLD in type 2 diabetic mouse model

To verify gene expression in the context of T2D accompanied by NAFLD, we used db/db mice as an experimental model. These mice, which are characterized by leptin receptor deficiency, serve as a robust type 2 diabetic mouse model with a profound fatty liver phenotype (Fig. 5A). In 8-month-old db/db mice, we observed a significant increase in both body size and weight compared to the control group of db/m mice (Fig. 5B, C). Blood glucose measurements demonstrated a pronounced elevation in db/db mice (Fig. 5D). Furthermore, the liver images displayed notable hepatomegaly, characterized by a pale appearance and extensive lipid accumulation. Quantitative evaluation of the liver weight confirmed a substantial increase (Fig. 5E, F). Histological analysis of H&E-stained liver sections revealed prominent hepatic steatosis and disruption of normal liver tissue architecture in db/db mice (Fig. 5G). Taken together, these results substantiate the successful establishment of a mouse model of T2D accompanied by NAFLD.

Fig. 5: Non-alcoholic fatty liver disease (NAFLD) in diabetic db/db mice.

A Schematic illustration of the experimental procedure. db/m and db/db mice were fed normal diet until the age of 8 months (M). B Images comparing db/m and db/db mice. Blue arrows indicate db/m mice, red arrows indicate db/db mice. C Body weight of db/m and db/db mice. D Blood glucose analysis of db/m and db/db mice. E Images comparing the livers of db/m and db/db mice. F Liver weight of db/m and db/db mice. Statistical analyses in C, D, and F were performed using Student’s t-test. *P < 0.05; ***P < 0.001. G H&E staining of liver sections from db/m and db/db mice. Scale bars, 100 μm (upper panels) and 50 μm (lower panels). Each dot represents one mouse; n ≥ 3 in each group.

The expression of the targeted gene was validated through qRT-PCR analysis in a type 2 diabetic mouse model with NAFLD

The qRT-PCR results showed that the mRNA expression of FYN, HNRNPU, and FUBP1 in the livers of db/db mice, compared with db/m mice, followed the same trend as that of the corresponding hub genes in the PPI analysis: the expression levels of all three genes were significantly downregulated in db/db mice (Fig. 6). These findings suggest that the expression of the differentially expressed genes identified through bioinformatics analysis aligns with the expected trend in a type 2 diabetic mouse model with NAFLD.

Fig. 6: The mRNA levels of FYN, HNRNPU, and FUBP1 in db/db and db/m mouse livers.

A–C qRT-PCR analysis of FYN, HNRNPU, and FUBP1 in the liver of db/db (n = 3) and db/m (n = 3) mice. Statistical analyses in A–C were performed using Student’s t-test. **P < 0.01; ***P < 0.001.

Discussion

Diabetes and NAFLD are two chronic metabolic disorders that have become significant threats to human health [3, 4, 19]. The prevalence of diabetes is increasing globally due to rising obesity rates and sedentary lifestyles, while NAFLD develops when fat accumulates within the liver [20]. Unraveling the molecular pathways that drive these illnesses is essential for developing successful therapies. In this study, we used bioinformatics and systems biology methods to explore the common molecular processes in T2D and NAFLD.

Our study revealed 53 DEGs common to T2D and NAFLD. These DEGs were associated with various biological processes, including signal transduction, transcriptional regulation, cell proliferation, gene expression, and apoptosis. To gain insight into the pathways involved, we conducted KEGG enrichment analyses. The results indicated that the ferroptosis signaling pathway was most significantly affected. Ferroptosis is a type of iron-dependent regulated cell death characterized by the accumulation of lipid peroxides [21]. Several studies have suggested that ferroptosis is involved in the dysfunction and death of pancreatic β-cells in T2D [22, 23]. This process contributes to the loss of insulin-producing cells, exacerbating glucose intolerance and diabetes progression.

To validate these predictions, we established a type 2 diabetic mouse model with NAFLD. Additionally, we measured the mRNA expression levels of FYN, HNRNPU, and FUBP1 in the liver tissues of db/db and db/m mice. The qRT-PCR results confirmed that these genes were differentially expressed, which is consistent with the results of the PPI analysis.

The Fyn protein [24], produced by the FYN gene, is involved in various biological processes such as cell proliferation, differentiation, migration, and immune responses [25, 26]. The link between the FYN gene and diabetes has been investigated in several studies. Under high-glucose conditions, Fyn was activated and promoted the activation of Rho-associated coiled-coil containing protein kinase, leading to F-actin reorganization and subsequent podocyte damage [27]. In mice predisposed to prediabetes by a high-fat diet, Fyn kinase expression and activation in the spinal cord increased, resulting in heightened sensitivity to touch and a diminished response to heat [28].

The association between the FYN gene and NAFLD has been investigated in a study [27]. Fyn kinase may influence intestinal epithelial permeability, a critical factor implicated in the development of NAFLD. When this barrier is compromised, it allows more bacterial endotoxins to enter, which can lead to liver inflammation and the onset of steatohepatitis [29].

The HNRNPU gene encodes a multifunctional protein that plays a vital role in RNA binding, gene transcription regulation, chromatin organization, and the cellular response to DNA damage [30, 31]. This protein is crucial for maintaining cellular functions, and its dysregulation is associated with various diseases, including cancer and diabetes [32, 33]. HNRNPU, along with related heterogeneous nuclear ribonucleoproteins (hnRNPs), significantly modulates gene expression related to insulin regulation, oxidative stress, and kidney function, which are particularly relevant in the context of diabetes [34]. These proteins contribute to protective effects against common diabetes complications, such as hypertension and kidney injury, positioning them as potential therapeutic targets for managing diabetic conditions.

Although the connection between the HNRNPU gene and NAFLD has not been extensively researched, studies on hnRNPs in liver diseases, including NAFLD, provide important insights. A study examining gene expression in liver tissues identified differential expression of several genes, including those encoding hnRNPs, among patients with NAFLD [35]. These findings suggest a potential role for hnRNPs in the pathogenesis of NAFLD.

The FUBP1 gene encodes Far Upstream Element-Binding Protein 1 (Fubp1) [36], which is instrumental in regulating gene expression, particularly that of the c-Myc oncogene [37]. This protein plays critical roles in various cellular processes, including cell proliferation, apoptosis, and differentiation [36]. Notably, FUBP1 upregulates the hexokinase genes HK1 and HK2, which encode key enzymes in the glycolysis pathway. Glycolysis is essential for glucose metabolism, making its regulation directly relevant to diabetes, as disruptions in this pathway can lead to insulin resistance and impaired glucose homeostasis [38].

The regulation of the c-Myc oncogene by FUBP1 is important in the context of liver diseases. Dysregulation of this process can contribute to NAFLD by affecting liver cell turnover and survival [39]. While direct evidence linking FUBP1 to diabetes and NAFLD is not well established, its roles in gene regulation, metabolism, and cell survival suggest that it could indirectly influence the development and progression of these conditions.

This study was preliminary and had several limitations. Further experimental studies are warranted to explore the specific roles and regulatory mechanisms of these genes and their pathways in T2D and NAFLD pathogenesis.

Conclusion

This study provides valuable insights into the molecular mechanisms underlying the development of T2D and NAFLD. The identified hub genes and pathways present promising prospects as therapeutic targets to address these prevalent metabolic disorders.