Automated Online Detection of Harmful and Dangerous Content in Social Networks: a Systematic Review

https://doi.org/10.15407/jai2025.04.108

Ostrovska O.¹, Lyashkevych V.¹

¹ Ivan Franko Lviv National University

oksana.ostrovska@lnu.edu.ua; vasyl.liashkevych@lnu.edu.ua

https://orcid.org/0009-0006-3376-8448 https://orcid.org/0000-0003-2810-6061

UDC: 004.93
Publication Language: English
Stuc. intelekt. 2025; 30(4):108-116

Abstract: The proliferation of social networks has transformed global communication, yet it has also facilitated the rapid and widespread dissemination of harmful content. This content, encompassing disinformation, hate speech, extremism, and cyberbullying, poses significant threats to individual well-being, social cohesion, and democratic processes. The sheer volume and velocity of user-generated data render manual moderation untenable, which makes social networks an attractive environment for fraudsters and other malicious actors to spread harmful and dangerous content and necessitates the development of sophisticated automated detection systems. This review systematically analyzes the scientific literature to map the landscape of existing solutions and identify critical gaps for future research. The literature review was conducted following a formal protocol. Key scholarly databases, including IEEE Xplore, ACM Digital Library, Scopus, and arXiv, were searched using a comprehensive set of keywords related to malicious content, social media, and detection methodologies. The review focused on peer-reviewed articles, conference proceedings, and scholarly books published in the last five years, supplemented by foundational works. The analysis reveals a clear evolution in detection methodologies, from traditional machine learning (ML) models reliant on manual feature engineering to advanced deep learning architectures. A taxonomy of harmful content is established, clarifying the distinctions between phenomena such as disinformation, extremism, and cyberbullying. The review critically examines the dominant detection paradigms: content-based methods using Natural Language Processing (NLP), structure-based methods leveraging Graph Neural Networks (GNNs) to analyze propagation patterns, and emerging hybrid and multi-modal approaches.
Despite significant progress, all current methods face fundamental limitations, including a critical lack of contextual understanding, susceptibility to algorithmic bias, vulnerability to adversarial attacks, and a pervasive lack of transparency and explainability. Current methodologies for harmful content detection, while increasingly sophisticated, remain largely reactive and operate on static snapshots of data. They are insufficient for the early, robust, and context-aware identification of evolving threats. A significant research gap exists for a new generation of systems that deeply integrate dynamic content analysis and network analysis. Future research should focus on developing proactive solutions founded on temporal graph learning and Complex Event Processing (CEP), with explainability integrated by design, to effectively model, detect, and mitigate harmful scenarios as they unfold in real time.

Keywords: harmful content detection, social network analysis, natural language processing, graph neural networks, information diffusion, content moderation.


Artificial intelligence

Scientific journal
