Lightweight diacritics restoration for V4 languages

Csanády Bálint and Lukács András: Lightweight diacritics restoration for V4 languages.

Diacritics restoration has become a ubiquitous task in the Latin-alphabet-based, English-dominated language environment of the Internet. In this article, we describe a small-footprint, 1D convolution-based approach that works at the character level. The model even runs locally in a web browser and surpasses the performance of similarly sized models. We evaluate our model on the languages of the Visegrád Group, with emphasis on Hungarian.
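
As an illustration of the task framing only, and not of the authors' model, the sketch below shows how character-level input/target pairs for diacritics restoration could be derived from plain text. The function names `strip_diacritics` and `char_level_pairs` are hypothetical, and the approach assumes that Unicode canonical decomposition is enough to separate base letters from their marks. Letters with no canonical decomposition, such as Polish ł, pass through unchanged and would need a separate mapping.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    # Category "Mn" (nonspacing mark) covers the combining accents
    # produced by decomposing letters such as á, ő, ű, č, ž.
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def char_level_pairs(text: str) -> list[tuple[str, str]]:
    """Pair each stripped input character with its accented target character.

    Assumption: the text is normalized to NFC first, so that every accented
    letter is a single code point and input and target stay aligned.
    """
    composed = unicodedata.normalize("NFC", text)
    return [(strip_diacritics(ch), ch) for ch in composed]


if __name__ == "__main__":
    print(strip_diacritics("árvíztűrő tükörfúrógép"))
    print(char_level_pairs("tűrő"))
```

A character-level model of the kind the abstract describes would read the stripped characters and predict, for each position, the accented character on the target side of these pairs.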

Item Type: Conference or Workshop Item
Heading title: Poster, laptop demonstration
Journal or Publication Title: Magyar Számítógépes Nyelvészeti Konferencia
Date: 2022
Volume: 18
ISBN: 978-963-306-848-9
Page Range: pp. 549-559
Language: English
Place of Publication: Szeged
Event Title: Magyar számítógépes nyelvészeti konferencia (18.) (2022) (Szeged)
Uncontrolled Keywords: Linguistics - computer applications
Additional Information: Bibliography: pp. 558-559 and in the footnotes; illustrated; summary in English
Subjects: 01. Natural sciences
01. Natural sciences > 01.02. Computer and information sciences
06. Humanities
06. Humanities > 06.02. Languages and Literature
Date Deposited: 2022. May. 25. 13:59
Last Modified: 2022. Nov. 08. 11:49
