Evaluating algorithmic fairness of machine learning models in predicting underweight, overweight, and adiposity across socioeconomic and caste groups in India: evidence from the longitudinal ageing study in India

JOHN TAYU LEEHsu, Sheng HuiSheng HuiHsuLi, Vincent Cheng-ShengVincent Cheng-ShengLiAnindya, KanyaKanyaAnindyaChen, Meng-HuanMeng-HuanChenCHARLOTTE WANGShen, Toby Kai-BoToby Kai-BoShenLiu, Valerie Tzu NingValerie Tzu NingLiuChen, Hsiao-HuiHsiao-HuiChenAtun, RifatRifatAtunKuo, Po-Chih2025-12-302025-12-302025-11-26https://scholars.lib.ntu.edu.tw/handle/123456789/734827Machine learning (ML) models are increasingly applied to predict body mass index (BMI) and related outcomes, yet their fairness across socioeconomic and caste groups remains uncertain, particularly in contexts of structural inequality. Using nationally representative data from more than 55,000 adults aged 45 years and older in the Longitudinal Ageing Study in India (LASI), we evaluated the accuracy and fairness of multiple ML algorithms-including Random Forest, XGBoost, Gradient Boosting, LightGBM, Deep Neural Networks, and Deep Cross Networks-alongside logistic regression for predicting underweight, overweight, and central adiposity. Models were trained on 80% of the data and tested on 20%, with performance assessed using AUROC, accuracy, sensitivity, specificity, and precision. Fairness was evaluated through subgroup analyses across socioeconomic and caste groups and equity-based metrics such as Equalized Odds and Demographic Parity. Feature importance was examined using SHAP values, and bias-mitigation methods were implemented at pre-processing, in-processing, and post-processing stages. Tree-based models, particularly LightGBM and Gradient Boosting, achieved the highest AUROC values (0.79-0.84). Incorporating socioeconomic and health-related variables improved prediction, but fairness gaps persisted: performance declined for scheduled tribes and lower socioeconomic groups. SHAP analyses identified grip strength, gender, and residence as key drivers of prediction differences. Among mitigation strategies, Reject Option Classification and Equalized Odds Post-processing moderately reduced subgroup disparities but sometimes decreased overall performance, whereas other approaches yielded minimal gains. ML models can effectively predict obesity and adiposity risk in India, but addressing bias is essential for equitable application. Continued refinement of fairness-aware ML methods is needed to support inclusive and effective public-health decision-making.enEvaluating algorithmic fairness of machine learning models in predicting underweight, overweight, and adiposity across socioeconomic and caste groups in India: evidence from the longitudinal ageing study in Indiajournal article10.1371/journal.pdig.000095141296741