Artificial intelligence

Scientific journal

ISSN 2710-1673

ONLINE: ISSN 2710-1681


Authorship attribution system

Marchenko O.¹, Nykonenko A.², Rossada T.², Melnikov E.²
¹ National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”
² Taras Shevchenko National University of Kyiv

UDC: 68Т50
Publication Language: Ukrainian
Stuc. intelekt. 2016; 21(2):77-85

Abstract: A new, effective system for identifying and verifying the authorship of texts has been developed. The system is based on machine learning. The novelty of the proposed model lies in its profile of author attributes, which, combined with a support vector machine (SVM) classifier, yields very high accuracy.

Keywords: authorship identification, machine learning, support vector machine
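
The abstract describes the approach only in outline, so the following is a minimal sketch of the general idea rather than the authors' system. It assumes a small, hypothetical attribute profile (mean word length, mean sentence length, comma rate, and the relative frequencies of a few function words) and a linear SVM trained by batch subgradient descent on the regularised hinge loss. The feature set, the function-word list, the hyperparameters, and the names `profile`, `train_linear_svm`, and `predict` are all illustrative assumptions.

```python
import re
from collections import Counter

# Hypothetical function-word list; the paper's actual attribute profile is not given here.
FUNCTION_WORDS = ("the", "of", "and", "to", "a", "in", "that", "it")


def profile(text):
    """Map a text to a small stylometric feature vector (illustrative features only)."""
    words = re.findall(r"[a-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = max(len(words), 1)
    counts = Counter(words)
    features = [
        sum(len(w) for w in words) / n_words,    # mean word length
        len(words) / max(len(sentences), 1),     # mean sentence length in words
        text.count(",") / n_words,               # commas per word
    ]
    features += [counts[w] / n_words for w in FUNCTION_WORDS]
    return features


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Batch subgradient descent on the L2-regularised hinge loss; labels are +1 or -1."""
    n = len(X)
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]          # gradient of the regulariser
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1:        # sample violates the margin
                for j, xj in enumerate(xi):
                    grad_w[j] -= yi * xj / n
                grad_b -= yi / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 (attributed to the candidate author) or -1 (not attributed)."""
    return 1 if dot(w, x) + b >= 0 else -1
```

For authorship verification, one would compute `profile` for texts known to be by the candidate author (label +1) and for texts by other authors (label -1), train on those vectors, and call `predict` on the profile of the disputed text. Identification among several authors could be handled with one such classifier per author. In practice the features would likely need standardising, since they are on different scales.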

