Comparison of distributed language models on medium-resourced languages

Makrai Márton: Comparison of distributed language models on medium-resourced languages.


Abstract

word2vec and GloVe are the two most successful open-source tools for computing distributed language models from gigaword corpora. word2vec implements the neural-network-style architectures skip-gram and CBOW, learning parameters using each word as a training sample, while GloVe factorizes the co-occurrence matrix (or, more precisely, a matrix of conditional probabilities) as a whole. In the present work, we compare the two systems on two tasks: a Hungarian equivalent of a popular word analogy task, and word translation between European languages, including medium-resourced ones such as Hungarian, Lithuanian, and Slovenian.
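
The contrast drawn in the abstract, between per-word training samples and a co-occurrence matrix treated as a whole, can be illustrated with a minimal sketch. It is not taken from the paper or from either tool: the function names, the toy sentence, and the fixed symmetric window are assumptions made for illustration, and the real systems add refinements (sampling, weighting) that are omitted here.

```python
from collections import Counter, defaultdict


def skipgram_pairs(tokens, window=2):
    """Yield (center, context) pairs one at a time.

    This is the per-sample view: each pair could drive one update step.
    (Hypothetical helper, not part of word2vec.)
    """
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield center, tokens[j]


def cooccurrence_counts(tokens, window=2):
    """Aggregate the same pairs into one global count table.

    This is the whole-matrix view: the corpus is summarized before fitting.
    (Hypothetical helper, not part of GloVe.)
    """
    return Counter(skipgram_pairs(tokens, window))


def conditional_probs(counts):
    """Turn raw counts into P(context | center) by normalizing each row."""
    row_totals = defaultdict(int)
    for (center, _context), n in counts.items():
        row_totals[center] += n
    return {pair: n / row_totals[pair[0]] for pair, n in counts.items()}


# Toy corpus, assumed for illustration only.
tokens = "the dog barks the dog runs".split()
pairs = list(skipgram_pairs(tokens, window=1))
counts = cooccurrence_counts(tokens, window=1)
probs = conditional_probs(counts)
```

Both views are built from the same window pairs; the difference lies in whether a model consumes them one by one or only after they have been summed into a table.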

Item Type: Conference or Workshop Item
Journal or Publication Title: Magyar Számítógépes Nyelvészeti Konferencia
Date: 2015
Volume: 11
ISBN: 978-963-306-359-0
Page Range: pp. 22-33
Event Title: Magyar Számítógépes Nyelvészeti Konferencia (11.) (2015) (Szeged)
Related URLs: http://acta.bibl.u-szeged.hu/58552/
Uncontrolled Keywords: Linguistics - computer applications
Additional Information: Bibliography: p. 33; abstract in English
Date Deposited: 2019. Jun. 28. 08:09
Last Modified: 2022. Nov. 08. 11:49
URI: http://acta.bibl.u-szeged.hu/id/eprint/58918
