|
Volume 13, Issue 5 (May 2026), Pages: 235-245
----------------------------------------------
Original Research Paper
AI-based cardiovascular risk stratification using population health data: An intelligent risk assessment agent (IRAA)
Author(s):
Oualid Ali *
Affiliation(s):
College of Arts and Science, Applied Science University, Manama, Kingdom of Bahrain
* Corresponding Author.
Corresponding author's ORCID profile: https://orcid.org/0009-0001-1433-017X
Digital Object Identifier (DOI)
https://doi.org/10.21833/ijaas.2026.05.022
Abstract
Cardiovascular disease remains one of the leading causes of death worldwide, so identifying artificial intelligence (AI) approaches that can support early diagnosis and prevention is of great importance. Although many machine learning (ML) studies report high classification accuracy from cardiovascular risk predictors, these results can be misleading because of strong class imbalance and the population-screening nature of the data. In this study, we developed an explainable AI-based intelligent risk assessment agent (IRAA) that focuses on cardiovascular risk categorization rather than binary diagnosis. We used a large population-based health survey dataset to systematically evaluate different ML models within an imbalance- and risk-sensitive assessment framework. The proposed system achieved a stable ROC-AUC of approximately 0.83 and a PR-AUC of around 0.31, identifying more than 63% of heart disease cases within the top 25% of risk groups and nearly 78% within the top 30%. These results demonstrate the model’s potential for early screening and case prioritization rather than final clinical decision-making. To improve transparency and user trust, SHAP-based explanations were integrated into a conversational IRAA interface, enabling doctors and users to understand how demographic, lifestyle, and comorbidity factors contribute to an individual’s risk assessment. This functionality helps bridge the gap between complex predictive models and user understanding. The findings highlight the limitations of accuracy-focused evaluation and support a shift toward explainable, risk-aware AI-based cardiovascular screening at the population level.
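The abstract's two evaluation ideas, that accuracy is misleading under class imbalance and that screening value is better measured by how many cases land in the highest-risk group, can be illustrated with a minimal sketch. The function names `capture_rate` and `majority_baseline_accuracy` and the ten-person toy data are hypothetical, introduced only for illustration; they are not the study's code, data, or results.

```python
def capture_rate(y_true, scores, top_fraction):
    """Share of all positive cases found among the highest-scoring
    `top_fraction` of individuals (a simple cumulative-gain measure)."""
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    total_positives = sum(y_true)
    if total_positives == 0:
        raise ValueError("y_true contains no positive cases")
    # Number of individuals flagged as the top-risk group (at least one).
    n_top = max(1, int(round(len(scores) * top_fraction)))
    # Rank individuals from highest to lowest predicted risk.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    positives_in_top = sum(label for _, label in ranked[:n_top])
    return positives_in_top / total_positives


def majority_baseline_accuracy(y_true):
    """Accuracy of a trivial model that labels everyone as negative."""
    return 1 - sum(y_true) / len(y_true)


# Hypothetical toy data: 2 cases among 10 people (assumed, not from the study).
toy_labels = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
toy_scores = [0.90, 0.80, 0.10, 0.70, 0.20, 0.30, 0.05, 0.40, 0.15, 0.25]

if __name__ == "__main__":
    print("Always-negative accuracy:", majority_baseline_accuracy(toy_labels))
    print("Cases captured in top 30%:", capture_rate(toy_labels, toy_scores, 0.30))
```

In this toy setting the always-negative model scores well on accuracy while finding no cases at all, whereas the capture rate directly reports how many cases a risk-ranked screening list would reach. This is the kind of contrast the abstract draws between accuracy-focused and risk-aware evaluation.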
© 2026 The Authors. Published by IASE.
This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0/).
Keywords
Cardiovascular risk assessment, Explainable artificial intelligence, Risk stratification, Preventive health screening, Machine learning in healthcare
Article history
Received 19 January 2026, Received in revised form 20 May 2026, Accepted 24 May 2026
Data availability
The dataset used in this study was derived from the 2021 Behavioral Risk Factor Surveillance System (BRFSS) published by the Centers for Disease Control and Prevention (CDC). The processed dataset analyzed during the current study is publicly available through the Kaggle repository (https://www.kaggle.com/datasets/alphiree/cardiovascular-diseases-risk-prediction-dataset). The original BRFSS dataset is available from the CDC repository (https://www.cdc.gov/brfss/).
Funding
Funding was received for this study from the Applied Science University, Manama, Kingdom of Bahrain.
Acknowledgment
The author gratefully acknowledges the support provided by Applied Science University, Manama, Kingdom of Bahrain, for funding and supporting this research work.
Compliance with ethical standards
Conflict of interest: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Citation:
Ali O (2026). AI-based cardiovascular risk stratification using population health data: An intelligent risk assessment agent (IRAA). International Journal of Advanced and Applied Sciences, 13(5): 235-245
|