Artificial intelligence

Scientific journal

ISSN 2710-1673

ONLINE: ISSN 2710-1681



Determination of the attributes of authorship of natural texts

Shynkarenko V.1, Demidovich I.1
1 Dnipropetrovsk National University of Railway Transport named after Academician V. Lazaryan


UDC: 004.93+519.25
Publication Language: Ukrainian
Stuc. intelekt. 2018; 23(3): 27-35

Abstract: The possibility of determining the authorship of natural language texts and their fragments was explored using minimum-distance classification in a pattern (feature) space. Each text is represented as a point in n-dimensional Euclidean space whose coordinates are measured features from statistical analysis, recurrence analysis, and complexity indicators. The method of recurrence analysis of time series was adapted to the analysis of natural language texts. Some individual features were not effective enough for authorship determination; in 85% of cases at least one of the methods made it possible to establish authorship; the modified recurrence analysis method was as effective as statistical and complexity analysis.
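The classification idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three features (mean word length, type-token ratio, and a recurrence rate over the word-length series), the use of per-author centroids as reference points, and all function names are assumptions chosen only for illustration.

```python
import math
from collections import defaultdict


def recurrence_rate(series, eps=0):
    """Fraction of ordered pairs (i, j), i != j, with |s_i - s_j| <= eps.

    A simple recurrence measure; treating word lengths as the
    "time series" of a text is an assumption of this sketch.
    """
    n = len(series)
    if n < 2:
        return 0.0
    hits = sum(
        1
        for i in range(n)
        for j in range(n)
        if i != j and abs(series[i] - series[j]) <= eps
    )
    return hits / (n * (n - 1))


def features(text):
    """Map a non-empty text to a point in 3-dimensional Euclidean space."""
    words = text.lower().split()
    lengths = [len(w) for w in words]
    mean_len = sum(lengths) / len(lengths)      # statistical feature
    type_token = len(set(words)) / len(words)   # crude complexity indicator
    return (mean_len, type_token, recurrence_rate(lengths))


def centroids(labeled_vectors):
    """Average the feature vectors of each author's known texts."""
    groups = defaultdict(list)
    for author, vec in labeled_vectors:
        groups[author].append(vec)
    return {
        author: tuple(sum(col) / len(vecs) for col in zip(*vecs))
        for author, vecs in groups.items()
    }


def classify(vec, cents):
    """Return the author whose centroid is nearest in Euclidean distance."""
    return min(cents, key=lambda author: math.dist(vec, cents[author]))
```

A typical use would compute `features` for texts of known authorship, pass the labeled vectors to `centroids`, and call `classify` on the feature vector of a disputed text. In practice the features would likely need rescaling to comparable ranges, since unscaled Euclidean distance is dominated by the feature with the largest spread.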


