Technical Approach to Converting Medical Information System (MIS) “Esculap” Archives for Artificial Intelligence Tasks: Experience of the State Institution of Science “Center of Innovative Healthcare Technologies” of the State Administrative Department
UDC: 004.4:004.8:004.72
Publication Language: Ukrainian
Stuc. intelekt. 2026; 31(1):117-126
Abstract: The digital transformation of healthcare requires converting accumulated medical data into formats suitable for analysis. A significant portion of medical information in Ukraine is stored in archives of legacy systems, including the medical information system (MIS) “Esculap”. These archives use the dBase format (.DBF and .FPT files), which prevents their direct use in modern artificial intelligence and machine learning applications. This paper describes a technical approach to converting such archives. An algorithm was developed in Python using the dbfread, pandas, and numpy libraries. The proposed method enables extraction and systematization of depersonalized patient data, diagnoses, and treatment histories. Particular attention is given to resolving text encoding issues and processing MEMO fields stored in .FPT files. The study transforms relational tables in legacy formats into CSV files compatible with contemporary analytical tools. The resulting datasets can be used for machine learning tasks, neural network development, and statistical research in public health. The proposed approach was tested on archival data of the State Institution of Science “Center of Innovative Healthcare Technologies” of the State Administrative Department. The developed algorithm ensures accurate preparation of longitudinal medical data for further morbidity analysis. The software tool enables annual updates of research databases by converting newly generated archival records into a structured, analysis-ready format. The practical implementation of the algorithm included an inventory of 195 archival files, identification of complete .DBF/.FPT pairs, and validation of MEMO field integrity. The conversion process accounted for the specifics of Windows-1251 encoding, the presence of corrupted or incomplete tables, and the risk of automatic data type alteration when CSV files are opened in spreadsheet editors. This approach minimized information loss and preserved logical links between patients, clinical episodes, and textual medical records.
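Three of the steps named in the abstract (the inventory of .DBF/.FPT pairs, Windows-1251 decoding, and CSV export) can be illustrated with a minimal sketch. It is not the authors' algorithm: the function names `inventory_pairs`, `decode_cp1251`, and `write_csv` are hypothetical, pairing by file name alone is an assumption, and only the Python standard library is used, so the record parsing that the paper performs with dbfread is left out.

```python
import csv
from pathlib import Path


def inventory_pairs(folder):
    """Group archive files by base name and classify DBF/FPT completeness.

    Assumption: a table and its memo file share a base name, compared
    case-insensitively because legacy archives often mix upper and lower case.
    """
    dbf, fpt = {}, {}
    for path in Path(folder).iterdir():
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        key = path.stem.lower()
        if suffix == ".dbf":
            dbf[key] = path
        elif suffix == ".fpt":
            fpt[key] = path
    return {
        # Tables whose memo file is present.
        "complete": sorted(dbf.keys() & fpt.keys()),
        # Tables without a memo file; a table with no MEMO fields
        # legitimately has none, so this is not an error by itself.
        "dbf_only": sorted(dbf.keys() - fpt.keys()),
        # Memo files whose table is missing.
        "fpt_only": sorted(fpt.keys() - dbf.keys()),
    }


def decode_cp1251(raw):
    """Decode Windows-1251 bytes and trim trailing space or NUL padding."""
    return raw.decode("cp1251", errors="replace").rstrip(" \x00")


def write_csv(rows, fieldnames, out_path):
    """Write rows as UTF-8 CSV (with BOM) and quote every field."""
    with open(out_path, "w", newline="", encoding="utf-8-sig") as handle:
        writer = csv.DictWriter(
            handle, fieldnames=fieldnames, quoting=csv.QUOTE_ALL
        )
        writer.writeheader()
        writer.writerows(rows)
```

The byte-order mark written by `utf-8-sig` helps spreadsheet editors recognize the file as UTF-8. Quoting every field keeps the CSV unambiguous but does not by itself stop a spreadsheet editor from reinterpreting values such as codes with leading zeros; importing those columns explicitly as text is still needed.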
Keywords: medical information systems, MIS “Esculap”, data conversion, Python, depersonalization, dBase format, public health, artificial intelligence, machine learning