Information extraction from Wikipedia using pattern learning

Miháltz Márton: Information extraction from Wikipedia using pattern learning. In: Acta cybernetica, (19) 4. pp. 677-694. (2010)

Abstract

In this paper we present solutions for the crucial task of extracting structured information from massive free-text resources, such as Wikipedia, to populate semantic databases serving upcoming Semantic Web technologies. We demonstrate both a verb frame-based approach, which uses deep natural language processing techniques with extraction patterns developed by human knowledge experts, and machine learning methods that use shallow linguistic processing. We also propose a method for learning verb frame-based extraction patterns automatically from labeled data. We show that labeled training data can be produced with only minimal human effort by utilizing existing semantic resources and the special characteristics of Wikipedia. Custom solutions for named entity recognition are also possible in this scenario. We present an evaluation and comparison of the different approaches for several relations.

Item type:	Article, study, work
Journal or publication title:	Acta cybernetica
Date:	2010
Volume:	19
Number:	4
ISSN:	0324-721X
Pages:	pp. 677-694
Language:	English
Place of publication:	Szeged
Conference name:	Conference on Hungarian Computational Linguistics (7.) (2010) (Szeged)
URL of containing work:	http://acta.bibl.u-szeged.hu/38530/
Keywords:	Computer science, Linguistics - computer applications
Notes:	Bibliography: pp. 692-694; summary in English
Subjects:	01. Natural sciences; 01. Natural sciences > 01.02. Computer and information sciences; 06. Humanities; 06. Humanities > 06.02. Languages and literature
Date deposited:	15 Oct 2016 12:24
Last modified:	17 Jun 2022 11:10
URI:	http://acta.bibl.u-szeged.hu/id/eprint/12888
