Artificial intelligence

Scientific journal

ISSN 2710-1673

ONLINE: ISSN 2710-1681


Automated search of named entities in unmarked Ukrainian texts

Glybovets A.¹
¹ National University «Kyiv-Mohyla Academy»

UDC: 681.3
Publication Language: Ukrainian
Stuc. intelekt. 2017; 22(2):45-51

Abstract: The paper describes an algorithm, designed and implemented by the author, for finding named entities in Ukrainian-language texts. A software tool built on this algorithm lets the user mark up named entities and the relations between them in a graphical interface. The tool is implemented as a web application. With its help, an annotated NER corpus of 122 texts was created. The corpus covers three kinds of entities (persons, organizations, and geographical objects) and contains 2,731 named entities.

Keywords: named entities, natural language processing, named entity extraction.

