Machine learning to improve HIV screening using routine data in Kenya (journal article)

ID:

39362CCCA6CACEEA8E332C190DC3E34E
Authors:

Friedman, Jonathan D.^[1]; Mwangi, Jonathan M.^[2]; Muthoka, Kennedy J.^[3]; Otieno, Benedette A.^[3]; Odhiambo, Jacob O.^[3]; Miruka, Frederick O.^[2]; Nyagah, Lilly M.^[4]; Mwele, Pascal M.^[3]; Obat, Edmon O.^[5]; Omoro, Gonza O.^[6]; Ndisha, Margaret M.^[2]; Kimanga, Davies O.^[2]
Language:

English
Journal:

JOURNAL OF THE INTERNATIONAL AIDS SOCIETY, ISSN: 1758-2652, 2025, Vol. 28, No. 4 (Apr)
Disease classification:

AIDS
Keywords:

artificial intelligence electronic health records HIV infection diagnosis Kenya machine learning routine data
Abstract:

Introduction: Optimal use of HIV testing resources accelerates progress towards ending HIV as a global threat. In Kenya, current testing practices yield a 2.8% positivity rate for new diagnoses reported through the national HIV electronic medical record (EMR) system. Increasingly, researchers have explored the potential for machine learning to improve the identification of people with undiagnosed HIV for referral for HIV testing. However, few studies have used routinely collected programme data as the basis for implementing a real-time clinical decision support system to improve HIV screening. In this study, we applied machine learning to routine programme data from Kenya's EMR to predict the probability that an individual seeking care is undiagnosed HIV positive and should be prioritized for testing.

Methods: We combined de-identified individual-level EMR data from 167,509 individuals without a previous HIV diagnosis who were tested between June and November 2022. We included demographics, clinical histories and HIV-relevant behavioural practices with open-source data that describes population-level behavioural practices as other variables in the model. We used multiple imputations to address high rates of missing data, selecting the optimal technique based on out-of-sample error. We generated a stratified 60-20-20 train-validate-test split to assess model generalizability. We trained four machine learning algorithms including logistic regression, Random Forest, AdaBoost and XGBoost. Models were evaluated using Area Under the Precision-Recall Curve (AUCPR), a metric that is well-suited to cases of class imbalance such as this, in which there are far more negative test results than positive.

Results: All model types demonstrated predictive performance on the test set with AUCPR that exceeded the current positivity rate. XGBoost generated the greatest AUCPR, 10.5 times greater than the rate of positive test results.

Conclusions: Our study demonstrated that machine learning applied to routine HIV testing data may be used as a clinical decision support tool to refer persons for HIV testing. The resulting model could be integrated in the screening form of an EMR and used as a real-time decision support tool to inform testing decisions. Although issues of data quality and missing data remained, these challenges could be addressed using sound data preparation techniques.
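
Illustrative note: the evaluation design described in the abstract (a stratified 60-20-20 split, with AUCPR compared against the positivity rate) can be sketched in a few lines of Python. This is a minimal sketch, not the authors' code. The labels and scores below are synthetic, the function names `stratified_split` and `average_precision` are hypothetical, and average precision is used here as one common step-wise estimate of AUCPR. The abstract does not say which estimator the study used.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.6, 0.2, 0.2), seed=0):
    """Return train/validate/test index lists that preserve the class ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, validate, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_validate = round(fractions[1] * len(indices))
        train += indices[:n_train]
        validate += indices[n_train:n_train + n_validate]
        test += indices[n_train + n_validate:]  # remainder goes to test
    return train, validate, test


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve.

    Tied scores are ranked in sort order, so this assumes mostly
    distinct scores.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    true_pos = 0
    area = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_pos += 1
            area += true_pos / rank  # precision at each new recall step
    return area / total_pos


# Synthetic, imbalanced data (assumption: roughly 3% positives).
rng = random.Random(42)
labels = [1 if rng.random() < 0.03 else 0 for _ in range(10_000)]
# Hypothetical model scores: positives tend to score higher.
scores = [rng.gauss(1.5 if y else 0.0, 1.0) for y in labels]

train, validate, test = stratified_split(labels)
test_labels = [labels[i] for i in test]
test_scores = [scores[i] for i in test]

# A model that ranks at random has an expected AUCPR equal to the
# positivity rate, which is why that rate serves as the baseline.
baseline = sum(test_labels) / len(test_labels)
aucpr = average_precision(test_labels, test_scores)
print(f"baseline={baseline:.3f} AUCPR={aucpr:.3f} ratio={aucpr / baseline:.1f}")
```

The ratio printed at the end corresponds to the kind of comparison reported in the Results ("10.5 times greater than the rate of positive test results"), although the numbers produced here come from synthetic data and carry no clinical meaning.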
Recommended citation
GB/T 7714:

Friedman Jonathan D., Mwangi Jonathan M., Muthoka Kennedy J., et al. Machine learning to improve HIV screening using routine data in Kenya[J]. JOURNAL OF THE INTERNATIONAL AIDS SOCIETY, 2025, 28(4).
APA:

Friedman Jonathan D., Mwangi Jonathan M., Muthoka Kennedy J., Otieno Benedette A., & Kimanga Davies O. (2025). Machine learning to improve HIV screening using routine data in Kenya. JOURNAL OF THE INTERNATIONAL AIDS SOCIETY, 28(4).
MLA:

Friedman Jonathan D., et al. "Machine learning to improve HIV screening using routine data in Kenya." JOURNAL OF THE INTERNATIONAL AIDS SOCIETY 28.4 (2025).
Data source: Clarivate Web of Science Core Collection
Date added:

2025/5/14 11:58:41
Date updated:

2025/5/14 11:58:41

浏览次数：13  下载次数：0

Prints: 0