Entropy (Basel). 2023 Aug 15;25(8):1214.
doi: 10.3390/e25081214.

A Machine Learning Approach to Simulate Gene Expression and Infer Gene Regulatory Networks

Francesco Zito et al. Entropy (Basel). 2023.

Abstract

The ability to simulate gene expression and infer gene regulatory networks has potential applications in many fields, including medicine, agriculture, and environmental science. In recent years, machine learning approaches to these two tasks have gained significant attention as a promising area of research. By simulating gene expression, we can gain insights into the complex mechanisms that control it and into how they are affected by environmental factors. This knowledge can be used to develop new treatments for genetic diseases, improve crop yields, and better understand the evolution of species. In this article, we present a novel method capable of simulating the regulation of gene expression for a group of genes and their mutual interactions. Our framework enables us to simulate how gene expression is regulated in response to alterations or perturbations that affect the expression of a gene. We use both artificial and real benchmarks to empirically evaluate the effectiveness of our methodology. Furthermore, we compare our method with existing ones to understand its advantages and disadvantages. We also outline ideas for future improvements to the method. Overall, our approach has the potential to improve gene expression simulation and gene regulatory network inference, possibly leading to significant advancements in genetics.

Keywords: complex network; gene regulatory network; machine learning; metaheuristic; reverse engineering; time-series forecasting.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1. An example of a gene regulatory network that includes gene regulation information.
Figure 2. Process to infer a gene regulatory network.
Figure 3. Visual representation of an agent.
Figure 4. Representation of the perturbation functions considered. The two perturbation functions (a,b) share the same parameters φb, φd, and φw, which denote the initial value of the i-th gene to be perturbed, the overall duration of the perturbation, and the width of the perturbation, respectively. Additionally, the trapezium perturbation function (b) requires another parameter φp, which represents the number of time steps for which the peak value is maintained.
Figure 5. Function to transform a regulatory value into a probability value.
Figure 6. Regulation of expression of eight genes using the dataset with ID 17.
Figure 7. Regulation of expression of ten genes using the dataset with ID 10.
Figure 8. Comparing gene expression regulation performed by different models on the SOS DNA Repair dataset. The solid lines represent the actual values of gene expression for the two selected genes, while the dashed lines are the predictions made by the models.
Figure 9. SOS DNA Repair [39].
Figure 10. Trapezium perturbation function on the gene lexA in the SOS DNA Repair dataset.
Figure 11. Results obtained by our methodology taking into account the two types of perturbations: instant perturbation function and trapezium perturbation function. The dataset ID is an identifier that represents the dataset used in that experiment. The full list of datasets is reported in Table 3.
Figure 12. Comparison between our approach (labeled as “Our”) and the state-of-the-art method for the SOS DNA Repair dataset, with the results sourced from [11].
Figure 13. Comparison of the performance of our approach (labeled as “Our”) with the state-of-the-art method for DREAM4 datasets. The values represent the average of the Area Under the Curve (AUC) obtained for each instance, as per [11], where the results from other methods are used for comparison.
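
The averaged AUC reported for Figure 13 can be illustrated with a minimal sketch. It assumes the AUC is the area under the ROC curve computed over candidate edges, where each edge has an inferred confidence score and a ground-truth label (1 if the edge exists in the reference network, 0 otherwise). The page does not state which AUC variant or which implementation the authors used, so the function names `roc_auc` and `mean_auc` and the toy data below are hypothetical.

```python
from typing import List, Sequence, Tuple


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Rank-based ROC AUC: the probability that a true edge receives a
    higher score than a non-edge, with ties counted as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one true edge and one non-edge")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auc(instances: List[Tuple[Sequence[float], Sequence[int]]]) -> float:
    """Average the per-instance AUC over several network instances."""
    return sum(roc_auc(s, y) for s, y in instances) / len(instances)


# Toy data (assumed, not from the paper): two small instances.
instance_a = ([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
instance_b = ([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
average = mean_auc([instance_a, instance_b])
```

The pairwise comparison is quadratic in the number of edges, which is acceptable for networks of roughly ten genes; a sort-based version would be preferable for larger networks.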

References

    1. Gout J.F., Kahn D., Duret L., Paramecium Post-Genomics Consortium. The relationship among gene expression, the evolution of gene dosage, and the rate of protein evolution. PLoS Genet. 2010;6:e1000944. doi: 10.1371/annotation/c55d5089-ba2f-449d-8696-2bc8395978db.
    2. Karlebach G., Shamir R. Modeling and analysis of gene regulatory networks. Nat. Rev. Mol. Cell Biol. 2008;9:770–780. doi: 10.1038/nrm2503.
    3. Shu H., Zhou J., Lian Q., Li H., Zhao D., Zeng J., Ma J. Modeling gene regulatory networks using neural network architectures. Nat. Comput. Sci. 2021;1:491–501. doi: 10.1038/s43588-021-00099-8.
    4. Aubin-Frankowski P.C., Vert J.P. Gene regulation inference from single-cell RNA-seq data with linear differential equations and velocity inference. Bioinformatics. 2020;36:4774–4780. doi: 10.1093/bioinformatics/btaa576.
    5. Pratapa A., Jalihal A.P., Law J.N., Bharadwaj A., Murali T.M. Benchmarking algorithms for gene regulatory network inference from single-cell transcriptomic data. Nat. Methods. 2020;17:147–154. doi: 10.1038/s41592-019-0690-6.

Grants and funding

This research received no external funding.
