Analysing the semantic content of static Hungarian embedding spaces

Ficsor, Tamás and Berend, Gábor: Analysing the semantic content of static Hungarian embedding spaces. In: Magyar Számítógépes Nyelvészeti Konferencia, (17). pp. 91-105. (2021)

Article, study, work
msznykonf_017_091-105.pdf

Abstract

Word embeddings can encode semantic features and have achieved many recent successes in solving NLP tasks. Although word embeddings perform well on several downstream tasks, there is no straightforward way to extract lexical information from them. We propose a transformation that amplifies desired semantic features in the basis of the embedding space. We generate these semantic features with a distantly supervised approach, which makes them applicable to Hungarian embedding spaces. We propose using the Hellinger distance to perform the transformation to an interpretable embedding space. Furthermore, we extend our research to sparse word representations, since sparse representations are considered to be highly interpretable.
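
The abstract names the Hellinger distance without defining it. As a minimal sketch, assuming the standard definition for two discrete probability distributions p and q, H(p, q) = sqrt(sum_i (sqrt(p_i) - sqrt(q_i))^2) / sqrt(2), the computation could look like the following. The function names `normalize` and `hellinger_distance` are illustrative and do not come from the paper, and this record does not state how the paper derives distributions from embedding coordinates, so that step is left out.

```python
import math


def normalize(weights):
    """Scale non-negative weights so that they sum to 1 (hypothetical helper)."""
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


def hellinger_distance(p, q):
    """Standard Hellinger distance between two discrete distributions.

    Both inputs are assumed to be probability vectors of equal length.
    The result lies in [0, 1]: 0 for identical distributions, 1 for
    distributions with disjoint support.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    # Sum of squared differences between the square roots of the probabilities.
    squared = sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))
    return math.sqrt(squared) / math.sqrt(2)
```

Calling `hellinger_distance(normalize([3, 1]), normalize([1, 3]))` compares two distributions built from raw counts; the measure is symmetric, so swapping the arguments gives the same value.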

Item Type: Article
Heading title: Semantics
Journal or Publication Title: Magyar Számítógépes Nyelvészeti Konferencia
Date: 2021
Volume: 17
ISBN: 978-963-306-781-9
Page Range: pp. 91-105
Language: English
Event Title: Magyar számítógépes nyelvészeti konferencia (17.) (2021) (Szeged)
Related URLs: http://acta.bibl.u-szeged.hu/73340/
Uncontrolled Keywords: Linguistics - computer applications
Additional Information: Bibliography: pp. 103-105 and in the footnotes; summary in English
Subjects: 01. Natural sciences
01. Natural sciences > 01.02. Computer and information sciences
06. Humanities
06. Humanities > 06.02. Languages and Literature
Date Deposited: 28 Sep 2021 11:12
Last Modified: 28 Sep 2021 11:12
URI: http://acta.bibl.u-szeged.hu/id/eprint/73360
