Different molecular enumeration influences in deep learning: An example using aqueous solubility

Chen J.-H;Tseng Y.J.

Title:	Different molecular enumeration influences in deep learning: An example using aqueous solubility
Authors:	Chen J.-H; Yufeng Jane Tseng
Keywords:	biological sciences;cheminformatics;drug discovery;medicinal chemistry;article;attention;conformation;convolutional neural network;deep learning;drug solubility;prediction;water solubility;algorithm;chemistry;solubility;water;Algorithms;Deep Learning;Neural Networks, Computer;Solubility;Water
Issue date:	2021
Volume:	22
Issue:	3
Source publication:	Briefings in Bioinformatics
Abstract:	Aqueous solubility is a key property driving many chemical and biological phenomena, and it affects experimental and computational attempts to assess those phenomena. Accurate prediction of solubility is essential and challenging, even with modern computational algorithms. Fingerprint-based, feature-based and molecular graph-based representations have all been used with different deep learning methods for aqueous solubility prediction. It has been clearly demonstrated that different molecular representations affect model prediction and explainability. In this work, we reviewed different representations and focused on using graph and line notations for modeling. In general, one canonical chemical structure is used to represent one molecule when computing its properties. We carefully examined the commonly used simplified molecular-input line-entry system (SMILES) notation for representing a single molecule and proposed using the full enumeration of SMILES strings to achieve better accuracy. A convolutional neural network (CNN) was used. The full enumeration of SMILES can improve the representation of a molecule and describe the molecule from all possible angles. This CNN model can be very robust when dealing with large datasets, since no additional explicit chemistry knowledge is necessary to predict the solubility. Also, it has traditionally been hard to use a neural network to explain the contribution of chemical substructures to a single property. We demonstrated the use of attention in the decoding network to detect the part of a molecule that is relevant to solubility, which can be used to explain the contribution from the CNN. © The Author(s) 2020. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
URI:	https://www.scopus.com/inward/record.uri?eid=2-s2.0-85107088611&doi=10.1093%2fbib%2fbbaa092&partnerID=40&md5=cce160b7885a742cb0ac414a121cf1a4 https://scholars.lib.ntu.edu.tw/handle/123456789/607497
ISSN:	14675463
DOI:	10.1093/bib/bbaa092
Appears in collections:	Department of Computer Science and Information Engineering
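
The abstract contrasts a single canonical SMILES string with the full enumeration of SMILES strings for one molecule. The sketch below illustrates that idea only; it is not the authors' implementation, and the record does not say which toolkit they used. It assumes a deliberately restricted case: an acyclic molecule with single bonds only, given as a list of atom symbols and a list of bonded index pairs, with no ring closures, aromaticity, charges or stereochemistry. The function name `enumerate_smiles` and its input format are hypothetical.

```python
from itertools import permutations, product


def enumerate_smiles(atoms, bonds):
    """Return all distinct SMILES strings for an acyclic, single-bonded molecule.

    atoms: list of element symbols, e.g. ["C", "C", "O"]
    bonds: list of (i, j) index pairs into atoms
    Assumption: the bond graph is a tree (no rings).
    """
    # Build an adjacency list from the bond pairs.
    neighbors = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    def write(atom, parent):
        """All SMILES fragments for the subtree rooted at atom."""
        children = [n for n in neighbors[atom] if n != parent]
        if not children:
            return [atoms[atom]]
        results = []
        # Try every order in which the child branches can be written.
        for order in permutations(children):
            # Combine every fragment choice of each child in that order.
            for parts in product(*(write(c, atom) for c in order)):
                # All but the last branch go in parentheses.
                branches = "".join(f"({p})" for p in parts[:-1])
                results.append(atoms[atom] + branches + parts[-1])
        return results

    found = set()
    # Start the traversal from every atom in turn.
    for root in range(len(atoms)):
        found.update(write(root, None))
    return sorted(found)


# Ethanol: C-C-O
ethanol = enumerate_smiles(["C", "C", "O"], [(0, 1), (1, 2)])
print(ethanol)  # ['C(C)O', 'C(O)C', 'CCO', 'OCC']
```

All four strings describe the same molecule, which is the sense in which enumeration describes a molecule "from all possible angles". For real molecules with rings, aromatic atoms or stereocentres, a cheminformatics toolkit would be needed in place of this simplified traversal.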
