Utilizing word embeddings for part-of-speech tagging

Berend, Gábor: Utilizing word embeddings for part-of-speech tagging. Magyar Számítógépes Nyelvészeti Konferencia, (10). pp. 59-67. (2016)

Abstract

In this paper, we illustrate the power of distributed word representations for the part-of-speech tagging of Hungarian texts. We trained CRF models for POS tagging whose features were derived from the sparse coding of the word embeddings of Hungarian words. We show that, by relying on such a representation, it is possible to achieve reliable performance without creating language-specific features. We evaluated our models on all the subsections of the Szeged Treebank, using both the MSD and the universal morphology tag sets. Furthermore, we also report results for inter-subcorpora experiments.
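
The abstract does not spell out the feature templates, so the following is only a minimal sketch of one plausible way to turn sparse codes of word embeddings into symbolic CRF features. It assumes that each word already has a sparse coefficient vector (the toy values in CODES are invented for illustration), that every nonzero coefficient becomes one feature naming its index and sign, and that features are collected from a one-token window. The function names sparse_code_features and token_features are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence


def sparse_code_features(code: Sequence[float], prefix: str = "") -> List[str]:
    """Turn a sparse coefficient vector into symbolic features.

    Every nonzero coefficient yields one feature naming its index and sign.
    Discarding the magnitudes is an assumption of this sketch.
    """
    features = []
    for index, value in enumerate(code):
        if value > 0:
            features.append(f"{prefix}P{index}")
        elif value < 0:
            features.append(f"{prefix}N{index}")
    return features


def token_features(
    tokens: Sequence[str],
    codes: Dict[str, Sequence[float]],
    position: int,
) -> List[str]:
    """Collect sparse-code features for a token and its direct neighbours."""
    features = []
    for offset in (-1, 0, 1):
        neighbour = position + offset
        if 0 <= neighbour < len(tokens):
            # Words without a known sparse code contribute no features.
            code = codes.get(tokens[neighbour].lower())
            if code is not None:
                features.extend(
                    sparse_code_features(code, prefix=f"{offset:+d}:")
                )
    return features


# Toy sparse codes (invented values, four dictionary atoms).
CODES = {
    "a": [0.0, 0.7, 0.0, -0.2],
    "kutya": [0.5, 0.0, 0.0, 0.0],
    "ugat": [0.0, 0.0, -0.9, 0.3],
}
TOKENS = ["A", "kutya", "ugat"]

if __name__ == "__main__":
    for i, token in enumerate(TOKENS):
        print(token, token_features(TOKENS, CODES, i))
```

Feature lists of this kind could then be passed, one list per token, to any CRF toolkit that accepts string-valued attributes. Because the features depend only on the embeddings, no language-specific feature engineering is involved.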

Item Type: Article
Event Title: Magyar Számítógépes Nyelvészeti Konferencia (12.) (2016) (Szeged)
Journal or Publication Title: Magyar Számítógépes Nyelvészeti Konferencia
Date: 2016
Volume: 10
Page Range: pp. 59-67
ISBN: 978-963-306-450-4
Uncontrolled Keywords: Linguistics - computer applications
Additional Information: Bibliography: pp. 66-67; abstract in English
Date Deposited: 2019. Jul. 01. 08:51
Last Modified: 2019. Jul. 01. 08:51
URI: http://acta.bibl.u-szeged.hu/id/eprint/58962
