Mitigating the knowledge acquisition bottleneck for Hungarian word sense disambiguation using multilingual transformers

Berend, Gábor: Mitigating the knowledge acquisition bottleneck for Hungarian word sense disambiguation using multilingual transformers. In: Magyar Számítógépes Nyelvészeti Konferencia, (17). pp. 77-89. (2021)

Article, study, work
msznykonf_017_077-089.pdf


Abstract

A major hurdle in training all-words word sense disambiguation (WSD) systems for new domains and/or languages is the limited availability of sense-annotated training corpora, whose construction is extremely costly and labor-intensive. In this paper, we investigate the use of multilingual transformer-based language models for cross-lingual WSD in the zero-shot setting. Our empirical results suggest that, by relying on the multilingual abilities of pre-trained language models, we can infer reliable sense labels for Hungarian text in the all-words WSD setting using only sense-annotated training data in English.

Item Type: Article
Heading title: Semantics
Journal or Publication Title: Magyar Számítógépes Nyelvészeti Konferencia
Date: 2021
Volume: 17
ISBN: 978-963-306-781-9
Page Range: pp. 77-89
Language: English
Event Title: Magyar számítógépes nyelvészeti konferencia (17.) (2021) (Szeged)
Related URLs: http://acta.bibl.u-szeged.hu/73340/
Uncontrolled Keywords: Linguistics - computer applications
Additional Information: Bibliography: pp. 86-89 and in the footnotes; abstract in English
Subjects: 01. Natural sciences
01. Natural sciences > 01.02. Computer and information sciences
06. Humanities
06. Humanities > 06.02. Languages and Literature
Date Deposited: 2021. Sep. 28. 10:55
Last Modified: 2021. Sep. 28. 10:55
URI: http://acta.bibl.u-szeged.hu/id/eprint/73359
