Information Window as a Methodology for Assessing the Safety and Effectiveness Balance of Artificial Intelligence Medical Systems
UDC: 004.8:614.253:004.056
Publication Language: Ukrainian
Stuc. intelekt. 2025; 30(4):10-23
Abstract: The article examines methodological approaches to assessing the safety and clinical efficacy of artificial intelligence (AI) systems in healthcare. It compares the “benefit–risk” concept as applied in pharmacology and in medical AI systems in order to adapt the concept of the therapeutic window to information technologies. A model of the information window is proposed as a formalized tool for evaluating the balance between expected benefits and potential risks: the information window is the range of decision-making parameters within which an AI system maintains an optimal balance between safety and efficacy. The model takes into account current regulatory requirements, technical performance indicators, ethical aspects, and social implications. The article substantiates the need for phased testing of AI systems, analogous to clinical trials of pharmaceuticals, and outlines the prospects for developing international standards that define acceptable benefit–risk indices. It emphasizes the importance of continuous auditing, model updating, and mechanisms of accountability. The proposed approach contributes to the development of unified standards for the safe and effective implementation of AI in medical practice.
Keywords: artificial intelligence, safety, efficacy, therapeutic window, information window, modern regulatory requirements.
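Illustrative sketch: the abstract describes the information window as a range of decision-making parameters but gives no formula, so the following is only one possible reading and not the authors' model. It assumes that the single decision parameter is a classifier's score threshold, that benefit is measured by sensitivity, and that risk is measured by the false-positive rate. The function names, the toy data, and the acceptability limits (`min_sensitivity`, `max_false_positive_rate`) are all hypothetical.

```python
from typing import List, Optional, Tuple


def rates_at_threshold(
    scores: List[float], labels: List[int], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, false-positive rate) when score >= threshold is called positive."""
    true_pos = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    false_pos = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    positives = sum(labels)
    negatives = len(labels) - positives
    return true_pos / positives, false_pos / negatives


def information_window(
    scores: List[float],
    labels: List[int],
    thresholds: List[float],
    min_sensitivity: float,
    max_false_positive_rate: float,
) -> Optional[Tuple[float, float]]:
    """Return the (lowest, highest) threshold meeting both limits, or None if no threshold does.

    Assumed analogy: the lower bound plays the role of the minimum effective dose
    (risk becomes acceptable), the upper bound the role of the minimum toxic dose
    (benefit is lost beyond it).
    """
    acceptable = []
    for t in thresholds:
        sensitivity, fpr = rates_at_threshold(scores, labels, t)
        if sensitivity >= min_sensitivity and fpr <= max_false_positive_rate:
            acceptable.append(t)
    if not acceptable:
        return None
    # Both rates are non-increasing in the threshold, so the acceptable set is contiguous.
    return min(acceptable), max(acceptable)


# Hypothetical toy data: model scores and ground-truth labels (1 = disease present).
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
thresholds = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]

window = information_window(
    scores, labels, thresholds, min_sensitivity=0.75, max_false_positive_rate=0.25
)
print(window)  # prints (0.35, 0.65)
```

Under these assumptions, stricter limits narrow the window, and limits that cannot be met together leave no window at all (the function returns `None`). This loosely mirrors a drug whose toxic dose lies below its effective dose.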