https://scholars.lib.ntu.edu.tw/handle/123456789/607497
Title: | Different molecular enumeration influences in deep learning: An example using aqueous solubility | Authors: | Chen J.-H; YUFENG JANE TSENG |
Keywords: | biological sciences;cheminformatics;drug discovery;medicinal chemistry;article;attention;conformation;convolutional neural network;deep learning;drug solubility;prediction;water solubility;algorithm;chemistry;solubility;water;Algorithms;Deep Learning;Neural Networks, Computer;Solubility;Water | Issue Date: | 2021 | Volume: | 22 | Issue: | 3 | Source: | Briefings in Bioinformatics | Abstract: | Aqueous solubility is the key property driving many chemical and biological phenomena and impacts experimental and computational attempts to assess those phenomena. Accurate prediction of solubility is essential and challenging, even with modern computational algorithms. Fingerprint-based, feature-based and molecular graph-based representations have all been used with different deep learning methods for aqueous solubility prediction. It has been clearly demonstrated that different molecular representations impact the model prediction and explainability. In this work, we reviewed different representations and also focused on using graph and line notations for modeling. In general, one canonical chemical structure is used to represent one molecule when computing its properties. We carefully examined the commonly used simplified molecular-input line-entry specification (SMILES) notation representing a single molecule and proposed to use the full enumerations in SMILES to achieve better accuracy. A convolutional neural network (CNN) was used. The full enumeration of SMILES can improve the presentation of a molecule and describe the molecule with all possible angles. This CNN model can be very robust when dealing with large datasets since no additional explicit chemistry knowledge is necessary to predict the solubility. Also, traditionally it is hard to use a neural network to explain the contribution of chemical substructures to a single property. We demonstrated the use of attention in the decoding network to detect the part of a molecule that is relevant to solubility, which can be used to explain the contribution from the CNN. © 2020 The Author(s) 2020. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com. |
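The abstract's idea of "full enumeration" (many valid SMILES strings for one molecule, depending on the starting atom and branch order) can be illustrated with a small sketch. This is not the authors' code: the function name `enumerate_smiles` and the atom/bond input format are hypothetical, and the sketch assumes an acyclic molecule with single bonds only, so rings, aromaticity, charges, and stereochemistry are out of scope. A cheminformatics toolkit would normally be used for real molecules.

```python
from itertools import permutations


def enumerate_smiles(atoms, bonds):
    """Return all SMILES strings for an acyclic, single-bonded molecule.

    atoms: list of element symbols, e.g. ["C", "C", "O"] (hypothetical format)
    bonds: list of (i, j) index pairs describing single bonds
    """
    neighbors = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    def write(atom, parent):
        # Every string that starts at `atom` and covers its subtree.
        children = [n for n in neighbors[atom] if n != parent]
        if not children:
            return {atoms[atom]}
        results = set()
        for order in permutations(children):
            partials = [atoms[atom]]
            for i, child in enumerate(order):
                is_last = i == len(order) - 1
                # All children but the last are written as parenthesised branches.
                partials = [
                    p + (s if is_last else "(" + s + ")")
                    for p in partials
                    for s in write(child, atom)
                ]
            results.update(partials)
        return results

    all_smiles = set()
    for root in range(len(atoms)):  # try every atom as the starting point
        all_smiles |= write(root, None)
    return sorted(all_smiles)


# Ethanol: C-C-O
ethanol = enumerate_smiles(["C", "C", "O"], [(0, 1), (1, 2)])
print(ethanol)  # ['C(C)O', 'C(O)C', 'CCO', 'OCC']
```

Each of the four strings denotes the same molecule; a model trained on all of them sees the structure written from every starting atom rather than from a single canonical form.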
URI: | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85107088611&doi=10.1093%2fbib%2fbbaa092&partnerID=40&md5=cce160b7885a742cb0ac414a121cf1a4 https://scholars.lib.ntu.edu.tw/handle/123456789/607497 |
ISSN: | 14675463 | DOI: | 10.1093/bib/bbaa092 |
Appears in Collections: | Department of Computer Science and Information Engineering |