A general optimization protocol for molecular property prediction using a deep learning network
Journal
Briefings in Bioinformatics
Journal Volume
23
Journal Issue
1
Date Issued
2022
Author(s)
Chen J.-H
Abstract
The key to generating the best deep learning model for predicting molecular property is to test and apply various optimization methods. While individual optimization methods from different past works outside the pharmaceutical domain each succeeded in improving the model performance, better improvement may be achieved when specific combinations of these methods and practices are applied. In this work, three high-performance optimization methods in the literature that have been shown to dramatically improve model performance from other fields are used and discussed, eventually resulting in a general procedure for generating optimized CNN models on different properties of molecules. The three techniques are the dynamic batch size strategy for different enumeration ratios of the SMILES representation of compounds, Bayesian optimization for selecting the hyperparameters of a model and feature learning using chemical features obtained by a feedforward neural network, which are concatenated with the learned molecular feature vector. A total of seven different molecular properties (water solubility, lipophilicity, hydration energy, electronic properties, blood-brain barrier permeability and inhibition) are used. We demonstrate how each of the three techniques can affect the model and how the best model can generally benefit from using Bayesian optimization combined with dynamic batch size tuning. © The Author(s) 2021. Published by Oxford University Press.
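The dynamic batch size idea named in the abstract can be illustrated with a minimal sketch. The abstract does not state the rule the authors use, so the proportional scaling below, the cap, and the names `dynamic_batch_size` and `steps_per_epoch` are all assumptions made for illustration. The sketch only shows why tying the batch size to the SMILES enumeration ratio keeps the number of gradient steps per epoch roughly constant as the augmented dataset grows.

```python
import math


def dynamic_batch_size(base_batch_size, enumeration_ratio, max_batch_size=1024):
    """Hypothetical rule (not taken from the paper): scale the batch size
    linearly with the SMILES enumeration ratio, up to a fixed cap."""
    if base_batch_size < 1 or enumeration_ratio < 1:
        raise ValueError("base_batch_size and enumeration_ratio must be >= 1")
    return min(max_batch_size, base_batch_size * enumeration_ratio)


def steps_per_epoch(n_molecules, enumeration_ratio, batch_size):
    """Number of gradient steps needed to see every enumerated SMILES once."""
    n_samples = n_molecules * enumeration_ratio
    return math.ceil(n_samples / batch_size)


if __name__ == "__main__":
    # Assumed toy setting: 1000 molecules, base batch size of 32.
    for ratio in (1, 4, 16):
        bs = dynamic_batch_size(32, ratio)
        print(ratio, bs, steps_per_epoch(1000, ratio, bs))
```

Under this assumed rule, the step count per epoch stays the same for every enumeration ratio until the cap is reached, after which it starts to grow again.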
Subjects
CNN
deep learning
drug discovery
optimization
article
blood brain barrier
facial expression
feed forward neural network
hydration
lipophilicity
prediction
water solubility
Other Subjects
Bayes theorem; solubility; Deep Learning; Neural Networks, Computer
Type
journal article
