Title: Unsatisfactory reproducibility of interstitial inflammation scoring in allograft kidney biopsy
Authors: Huang, Shun-Chen; Lin, Yi-Jia; Wen, Mei-Chin; Lin, Wei-Chou; Fang, Pei-Wei; Liang, Peir-In; Chuang, Hao-Wen; Chien, Hui-Ping; Chen, Tai-Di
Type: journal article
Date issued: 2023-05-01
Record date: 2023-11-02
ISSN: 2045-2322
DOI: 10.1038/s41598-023-33908-3
PMID: 37127772
Scopus ID: 2-s2.0-85158066375
Scopus URL: https://api.elsevier.com/content/abstract/scopus_id/85158066375
Handle: https://scholars.lib.ntu.edu.tw/handle/123456789/636765
Language: en
Subject: [SDGs] SDG3

Abstract:

Interstitial inflammation scoring is incorporated into the Banff Classification of Renal Allograft Pathology and is essential for the diagnosis of T-cell mediated rejection. However, its reproducibility, including inter-rater and intra-rater reliability, has not been carefully investigated. In this study, eight renal pathologists from different hospitals independently scored 45 kidney allograft biopsies with varying extents of interstitial inflammation. Inter-rater and intra-rater reliabilities were assessed with kappa statistics and conditional agreement probabilities. Individual pathologists' scoring patterns were examined with chi-squared tests and proportions tests.

The mean pairwise kappa values for inter-rater reliability were 0.27, 0.30, and 0.26 for the Banff i score, ti score, and i-IFTA, respectively. No rater pair performed consistently better or worse than the others on all three scorings. After the scores were dichotomized into two groups (none/mild and moderate/severe inflammation), the averaged conditional agreements ranged from 47.1% to 50.0%. The score distributions differed across raters, and some pathologists persistently scored higher or lower than others.

Given the important role of interstitial inflammation scoring in the diagnosis of T-cell mediated rejection, transplant practitioners should be aware of the possible clinical implications of this far-from-optimal reproducibility.
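
The agreement measures named in the abstract could be computed along the following lines. This is a minimal sketch, not the authors' analysis code. It assumes unweighted Cohen's kappa averaged over all rater pairs, scores coded as integers 0 to 3 with 2 and above treated as moderate/severe, and conditional agreement defined as the probability that a second rater assigns a category given that the first rater did. The record does not state which kappa variant or which definition of conditional agreement was used, and all function names here are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two raters scoring the same cases."""
    if len(a) != len(b) or not a:
        raise ValueError("raters must score the same non-empty set of cases")
    n = len(a)
    # Observed agreement: share of cases given the same score by both raters.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each rater's marginal score frequencies.
    freq_a, freq_b = Counter(a), Counter(b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


def mean_pairwise_kappa(scores_by_rater):
    """Average Cohen's kappa over every unordered pair of raters."""
    pairs = list(combinations(scores_by_rater, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


def dichotomize(scores, threshold=2):
    """Map scores to 0 (none/mild) or 1 (moderate/severe); threshold is assumed."""
    return [int(s >= threshold) for s in scores]


def conditional_agreement(a, b, category=1):
    """P(rater b assigns `category` | rater a assigned `category`), or None."""
    given = [y for x, y in zip(a, b) if x == category]
    if not given:
        return None
    return sum(y == category for y in given) / len(given)


# Hypothetical scores for illustration only; not data from the study.
rater_a = [0, 1, 2, 3, 1, 0]
rater_b = [0, 1, 2, 3, 0, 1]
rater_c = [1, 1, 2, 2, 1, 0]

kappa_ab = cohen_kappa(rater_a, rater_b)
mean_kappa = mean_pairwise_kappa([rater_a, rater_b, rater_c])
agreement_ab = conditional_agreement(dichotomize(rater_a), dichotomize(rater_b))
```

Averaging over rater pairs mirrors the "mean pairwise kappa" wording of the abstract. A weighted kappa, which gives partial credit for near-misses on an ordinal scale, would be a reasonable alternative and would yield different values.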