A data-centric approach to building AI models for determining the credit rating of fintech company clients based on open banking

https://doi.org/10.15407/jai2025.01.132

Fostyak M.¹, Demkiv L.¹

¹ Ivan Franko Lviv National University

lidia.demkiv@gmail.com

https://orcid.org/0000-0003-4670-5834 https://orcid.org/0009-0002-0185-6364

UDC: 004.8
Publication Language: English
Stuc. intelekt. 2025; 30(1):132-140

Abstract: This paper applies a data-centric approach to training and testing models that assess a client’s credit rating from transaction data obtained through open banking. Synthetic data were created with well-defined logic rules, and additional synthetic datasets were generated with generative adversarial networks (GANs). A classical machine learning model (Random Forest) and neural network models with identical architectures implemented in TensorFlow and PyTorch were trained and tested. The adaptation of the models to data updates is examined. Data clustering was used to identify behaviour patterns among clients who did not repay their loans, and the quality of the generated data was evaluated. Classification results are reported by model type, amount of data, and whether client behaviour patterns were taken into account. Random Forest models achieved the highest accuracy regardless of the dataset type, whereas the accuracy of the neural networks depended significantly on the structure of the datasets.

Keywords: machine learning, neural networks, synthetic data, generative adversarial networks, open banking, credit rating.
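
The rule-based generation of synthetic data mentioned in the abstract can be illustrated with a minimal sketch. The feature names, value ranges, and the labelling rule below are hypothetical assumptions chosen for illustration only; they are not the rules used by the authors.

```python
import random
from dataclasses import dataclass


@dataclass
class ClientRecord:
    # All fields are hypothetical aggregates of open banking transactions.
    monthly_income: float    # total incoming transactions per month
    monthly_spending: float  # total outgoing transactions per month
    overdraft_count: int     # months with a negative balance in the last year
    repaid: bool             # label assigned by the logic rule below


def label_by_rule(income: float, spending: float, overdrafts: int) -> bool:
    """Assumed rule: a client repays if spending stays below 90% of
    income and overdrafts are rare (at most two in the last year)."""
    return spending < 0.9 * income and overdrafts <= 2


def generate_clients(n: int, seed: int = 0) -> list[ClientRecord]:
    """Draw n synthetic clients and label each one with the logic rule."""
    rng = random.Random(seed)  # seeded generator for reproducible datasets
    records = []
    for _ in range(n):
        income = rng.uniform(500.0, 5000.0)
        spending = income * rng.uniform(0.4, 1.3)
        overdrafts = rng.randint(0, 6)
        label = label_by_rule(income, spending, overdrafts)
        records.append(ClientRecord(income, spending, overdrafts, label))
    return records


if __name__ == "__main__":
    clients = generate_clients(1000)
    share = sum(c.repaid for c in clients) / len(clients)
    print(f"share of clients labelled as repaying: {share:.2f}")
```

A dataset produced this way could then serve as labelled input for a classifier such as Random Forest or for a GAN that generates further samples; those steps are outside this sketch.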

Artificial intelligence

Scientific journal
