Semantic relatedness calculation method
UDC: 68T50
Publication Language: Ukrainian
Stuc. intelekt. 2016; 21(3):32-42
Abstract: This work addresses the problem of computing semantic relatedness from text corpora. We first give a brief overview of existing approaches to the problem and of the main benchmark corpora. We then describe our own method and the main hypotheses on which it is based. The paper presents more than 70 hypotheses that can be used to compute semantic relatedness, together with a new, high-performance relatedness measure based on machine learning. The model can flexibly switch between subsets of hypotheses and demonstrates high efficiency on different benchmark sets.
Keywords: semantic relatedness, distributional semantics, machine learning
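
As a rough illustration of what corpus-based (distributional) relatedness means, the sketch below scores two words by the cosine of their co-occurrence vectors. This is a generic textbook baseline, not the method proposed in the paper; the toy corpus, the window size, and the function names are assumptions made only for this example.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` tokens of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse count vectors (0.0 if either is empty)."""
    dot = sum(count * v[word] for word, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def relatedness(vectors, a, b):
    """Relatedness of words a and b; words absent from the corpus score 0.0."""
    return cosine(vectors.get(a, Counter()), vectors.get(b, Counter()))


# Hypothetical toy corpus, used only to show the calls.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the car needs fuel",
]
vectors = cooccurrence_vectors(corpus)
cat_dog = relatedness(vectors, "cat", "dog")
cat_car = relatedness(vectors, "cat", "car")
```

On this toy corpus "cat" shares more context words with "dog" than with "car", so `cat_dog` comes out higher than `cat_car`. A realistic setup would use a large corpus and compare the resulting scores with human judgments from a benchmark set.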