Artificial intelligence

Scientific journal

ISSN 2710-1673

ONLINE: ISSN 2710-1681

A data-centric approach to building AI models for determining the credit rating of fintech company clients based on open banking

Fostyak M.1, Demkiv L.1
1 Ivan Franko Lviv National University
lidia.demkiv@gmail.com

UDC: 004.8
Publication Language: English
Stuc. intelekt. 2025; 30(1):132-140

Abstract: This paper applies a data-centric approach to training and testing models that assess a client’s credit rating from transaction data obtained through open banking. Synthetic data were created using well-defined logic rules, and additional synthetic datasets were generated using generative adversarial networks (GANs). A classical machine learning model (Random Forest) and neural network models with identical architectures, implemented in TensorFlow and PyTorch, were trained and tested. The adaptation of the models to data updates is examined. Data clustering was used to identify behaviour patterns among clients who did not repay their loans, and the quality of the generated data was evaluated. Classification results are reported by model type, amount of data, and whether client behaviour patterns were taken into account. Random Forest models achieved the highest accuracy regardless of the dataset type, whereas the accuracy of the neural networks depended significantly on the structure of the datasets.

Keywords: machine learning, neural networks, synthetic data, generative adversarial networks, open banking, credit rating.
