Nfor, Oswald Ndi; PEI-MING HUANG; Wu, Ming-Fang; KE-CHENG CHEN; Chou, Ying-Hsiang; MONG-WEI LIN; Zhong, Ji-Han; SHUENN-WEN KUO; Lee, Yu-Kwang; CHIH-HUNG HSU; JANG-MING LEE; Liaw, Yung-Po
2025-05-14
2025-05-14
2025-12
https://scholars.lib.ntu.edu.tw/handle/123456789/729274
Background: Esophageal cancer (EC) presents a significant public health challenge globally, particularly in regions with high alcohol consumption. Its etiology is multifactorial, involving both genetic predispositions and lifestyle factors.
Methods: This study aimed to develop a personalized risk prediction model for EC by integrating genetic polymorphisms (rs671 and rs1229984) with virtually generated alcohol consumption data, using artificial intelligence and machine learning techniques. We analyzed data from 86,845 individuals, including 763 diagnosed EC patients, sourced from the Taiwan Biobank. Eight machine learning models were employed: Bayesian Network, Decision Tree, Ensemble, Gradient Boosting, Logistic Regression, LASSO, Random Forest, and Support Vector Machine (SVM). A distinctive aspect of our approach was the virtual generation of alcohol consumption data, which allowed us to evaluate each individual's risk profile under both consuming and non-consuming scenarios.
Results: Individuals with the genotypes rs671 = AG and rs1229984 = CC exhibited the highest probabilities of developing EC, with values ranging from 0.2041 to 0.9181. Notably, abstaining from alcohol could decrease their risk by approximately 16.29-49.58%, that is, a potential risk reduction of nearly 50% for individuals with high-risk genotypes who move from consumption to abstinence. The Ensemble model performed best, achieving an area under the curve (AUC) of 0.9577 and a sensitivity of 0.9211.
Conclusion: Overall, our findings highlight the importance of integrating virtually generated alcohol data for more precise personalized risk assessments for EC.
en
Cancers; Esophagus; Personalized medicine; Predictive medicine; Risk assessment
[SDGs]SDG3
Personalized prediction of esophageal cancer risk based on virtually generated alcohol data.
journal article
10.1186/s12967-025-06383-9
40156023
2-s2.0-105001384900
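
The scenario comparison described in the abstract (scoring the same genotype once as a drinker and once as an abstainer) can be illustrated with a minimal sketch. This is not the study's model: the logistic form, the coefficient values, and the names `COEF`, `predict_risk`, `relative_risk_reduction` and `counterfactual_profile` are all hypothetical, and the sketch assumes the reported reduction is a relative change in predicted probability, which the abstract does not state explicitly.

```python
import math

# Hypothetical coefficients, chosen only for illustration.
# They are NOT taken from the study or fitted to any data.
COEF = {
    "intercept": -3.0,
    "rs671_AG": 1.2,
    "rs1229984_CC": 0.8,
    "alcohol": 1.5,
}


def predict_risk(rs671_ag: bool, rs1229984_cc: bool, drinks: bool) -> float:
    """Return a logistic-model probability for one genotype/alcohol profile."""
    z = (
        COEF["intercept"]
        + COEF["rs671_AG"] * rs671_ag
        + COEF["rs1229984_CC"] * rs1229984_cc
        + COEF["alcohol"] * drinks
    )
    return 1.0 / (1.0 + math.exp(-z))


def relative_risk_reduction(p_drinking: float, p_abstaining: float) -> float:
    """Fraction by which risk falls when moving from drinking to abstaining."""
    if p_drinking <= 0.0:
        raise ValueError("p_drinking must be positive")
    return (p_drinking - p_abstaining) / p_drinking


def counterfactual_profile(rs671_ag: bool, rs1229984_cc: bool) -> dict:
    """Score one genotype under both virtually generated alcohol scenarios."""
    p_drinking = predict_risk(rs671_ag, rs1229984_cc, drinks=True)
    p_abstaining = predict_risk(rs671_ag, rs1229984_cc, drinks=False)
    return {
        "drinking": p_drinking,
        "abstaining": p_abstaining,
        "reduction": relative_risk_reduction(p_drinking, p_abstaining),
    }


if __name__ == "__main__":
    profile = counterfactual_profile(rs671_ag=True, rs1229984_cc=True)
    for name, value in profile.items():
        print(f"{name}: {value:.4f}")
```

Any fitted classifier could stand in for `predict_risk`; the point of the sketch is only that the alcohol indicator is set to both values for the same individual and the two predicted probabilities are compared.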