J Chem Theory Comput. 2021 Oct 12;17(10):6214-6224. doi: 10.1021/acs.jctc.1c00492. Epub 2021 Sep 13.

Machine-Learned Molecular Surface and Its Application to Implicit Solvent Simulations

Haixin Wei et al. J Chem Theory Comput.

Abstract

Implicit solvent models, such as Poisson-Boltzmann models, play important roles in computational studies of biomolecules. A vital step in almost all implicit solvent models is to determine the solvent-solute interface, and the solvent excluded surface (SES) is the most widely used interface definition in these models. However, classical algorithms used for computing SES are geometry-based, so that they are neither suitable for parallel implementations nor convenient for obtaining surface derivatives. To address these limitations, we explored a machine learning strategy to obtain a level set formulation for the SES. The training process was conducted in three steps, eventually leading to a model with over 95% agreement with the classical SES. Visualization of tested molecular surfaces shows that the machine-learned SES overlaps with the classical SES in almost all situations. Further analyses show that the machine-learned SES is highly stable with respect to rotation of the tested molecules. Our timing analysis shows that the machine-learned SES is roughly 2.5 times as efficient as the classical SES routine implemented in Amber/PBSA on a tested central processing unit (CPU) platform. We expect further performance gain on massively parallel platforms such as graphics processing units (GPUs) given the ease of converting the machine-learned SES to a parallel procedure. We also implemented the machine-learned SES in the Amber/PBSA program to study its performance on reaction field energy calculation. The analysis shows that the two sets of reaction field energies are highly consistent, with a 1% deviation on average. Given its level set formulation, we expect the machine-learned SES to be applied in molecular simulations that require either surface derivatives or high efficiency on parallel computing platforms.
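To make the level set idea concrete, the sketch below shows a small feed-forward network of the kind described in Figures 1 and 2 (hidden layers of 200 neurons) evaluated on a 0.95 Å grid, with the thresholded output implicitly defining the surface. This is only an illustration, not the authors' code: the input features (signed distances to the nearest atoms), the weight initialization, the activation functions, and the 0.5 threshold are assumptions made here for demonstration, and the network is left untrained.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(n_in, hidden=(200,), n_out=1):
    # Random, untrained weights; the 200-neuron hidden layer follows Figure 2.
    sizes = (n_in, *hidden, n_out)
    return [(rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp_forward(params, x):
    # Forward pass: tanh hidden layers, sigmoid output in (0, 1).
    for W, b in params[:-1]:
        x = np.tanh(x @ W + b)
    W, b = params[-1]
    return 1.0 / (1.0 + np.exp(-(x @ W + b)))

def grid_features(grid_points, atom_xyz, atom_radii, k=2):
    # Assumed descriptor (not the paper's): signed distances to the k nearest atomic vdW surfaces.
    d = np.linalg.norm(grid_points[:, None, :] - atom_xyz[None, :, :], axis=-1)
    signed = d - atom_radii[None, :]
    return np.sort(signed, axis=1)[:, :k]

# Toy system: two overlapping "atoms" and a coarse grid (0.95 Å spacing, as in the paper).
atoms = np.array([[0.0, 0.0, 0.0], [2.5, 0.0, 0.0]])
radii = np.array([1.7, 1.7])
axis = np.arange(-3.0, 5.5, 0.95)
grid = np.array(np.meshgrid(axis, axis, axis)).reshape(3, -1).T

params = init_mlp(n_in=2)
level = mlp_forward(params, grid_features(grid, atoms, radii)).ravel()
inside = level > 0.5  # thresholding the learned level set defines the surface implicitly
print(f"{inside.sum()} of {grid.shape[0]} grid points classified as interior")
```

Because the surface is defined by a smooth learned function of the grid-point features rather than by geometric constructions, the same evaluation can be batched trivially across grid points, which is the property that makes parallel implementations and surface derivatives straightforward.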

Conflict of interest statement

The authors declare no competing financial interest.

Figures

Figure 1.
Structure of an artificial neural network with one hidden layer. Here, each circular node represents an artificial neuron, and an arrow represents a connection from the output of one artificial neuron to the input of another.
Figure 2.
Incremental training of the model. The ANN model was set up with different numbers of hidden layers; each hidden layer contains 200 neurons. The model was trained with the first training data set. The analysis was also repeated with two additional training sets, and the corresponding figures are shown in Figure S2, Supporting Information.
Figure 3.
Superimposed rendering of the machine-learned SES (blue) and the classical SES (red) of representative molecules. (a) PDB ID: 1enh, all-α protein; (b) PDB ID: 1pgb, all-β protein; (c) PDB ID: 1shg, α/β protein; (d) PDB ID: 1w0u, protein/protein complex; (e) PDB ID: 3czw, RNA duplex; and (f) PDB ID: 3fdt, protein/DNA complex.
Figure 4.
Superimposed rendering of the machine-learned SES (blue) and the classical SES (red) of the GC complex. Starting from the hydrogen-bonded structure, the two molecules are manually pulled apart by 0.5 Å at each step. The standalone figures of the two surfaces are included as Figures S8 and S9 in the Supporting Information.
Figure 5.
Detailed views of the SES as predicted by the first and second versions of the machine-learned SES model. The PDB ID of the tested molecule is 1shg. Panels (a) and (c) are from the first version of the model, showing the exterior and interior views, respectively. Panels (b) and (d) are from the second version of the model, showing the exterior and interior views, respectively.
Figure 6.
Timing comparison of the machine-learned SES and the classical SES. Both sets of data were collected on a single core of an Intel Xeon E5-4620 CPU. The grid spacing was set to 0.95 Å, as used in the collection of training data. The regression line between the two sets of timing data is y = 0.4114x.
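Assuming y in the regression is the machine-learned SES time and x the classical SES time, the slope corresponds to a speedup of roughly

    1 / 0.4114 ≈ 2.43,

consistent with the "roughly 2.5 times as efficient" figure quoted in the abstract.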
Figure 7.
PB reaction field energies with different molecular surfaces. (a) Correlation between energies from the machine-learned SES and the classical SES. (b) Correlation between energies from the density function and the classical SES. (c) Energy differences for the data in panel (a). (d) Energy differences for the data in panel (b). The lines in panels (a) and (b) show the diagonal y = x as a reference.
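As one way to make the "1% deviation on average" from the abstract concrete, the snippet below computes a simple mean relative deviation between two sets of reaction field energies. Both the metric and the example energies are assumptions for illustration only; they are not the paper's data or its exact error measure.

```python
import numpy as np

def mean_relative_deviation(e_ml, e_classical):
    # Mean |E_ml - E_classical| / |E_classical| over a set of molecules, in percent.
    e_ml = np.asarray(e_ml, dtype=float)
    e_classical = np.asarray(e_classical, dtype=float)
    return 100.0 * np.mean(np.abs(e_ml - e_classical) / np.abs(e_classical))

# Hypothetical reaction field energies (kcal/mol) for three molecules, illustration only.
print(f"{mean_relative_deviation([-1205.0, -893.5, -2410.0], [-1190.0, -900.0, -2400.0]):.2f}%")
```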

Cited by

  • A Concise Review of Biomolecule Visualization.
    Li H, Wei X. Curr Issues Mol Biol. 2024 Feb 2;46(2):1318-1334. doi: 10.3390/cimb46020084. PMID: 38392202. Review.
  • AmberTools.
    Case DA, Aktulga HM, Belfon K, Cerutti DS, Cisneros GA, Cruzeiro VWD, Forouzesh N, Giese TJ, Götz AW, Gohlke H, Izadi S, Kasavajhala K, Kaymak MC, King E, Kurtzman T, Lee TS, Li P, Liu J, Luchko T, Luo R, Manathunga M, Machado MR, Nguyen HM, O'Hearn KA, Onufriev AV, Pan F, Pantano S, Qi R, Rahnamoun A, Risheh A, Schott-Verdugo S, Shajan A, Swails J, Wang J, Wei H, Wu X, Wu Y, Zhang S, Zhao S, Zhu Q, Cheatham TE 3rd, Roe DR, Roitberg A, Simmerling C, York DM, Nagan MC, Merz KM Jr. J Chem Inf Model. 2023 Oct 23;63(20):6183-6191. doi: 10.1021/acs.jcim.3c01153. Epub 2023 Oct 8. PMID: 37805934.
  • Improving the Accuracy of Physics-Based Hydration-Free Energy Predictions by Machine Learning the Remaining Error Relative to the Experiment.
    Bass L, Elder LH, Folescu DE, Forouzesh N, Tolokh IS, Karpatne A, Onufriev AV. J Chem Theory Comput. 2024 Jan 9;20(1):396-410. doi: 10.1021/acs.jctc.3c00981. Epub 2023 Dec 27. PMID: 38149593.
  • Grid-Robust Efficient Neural Interface Model for Universal Molecule Surface Construction from Point Clouds.
    Wu Y, Wei H, Zhu Q, Luo R. J Phys Chem Lett. 2023 Oct 12;14(40):9034-9041. doi: 10.1021/acs.jpclett.3c02176. Epub 2023 Oct 2. PMID: 37782231.
