Extracting human protein information from MEDLINE using a full-sentence parser

Busa-Fekete Róbert and Kocsor András: Extracting human protein information from MEDLINE using a full-sentence parser. In: Acta cybernetica, (18) 3. pp. 391-402. (2008)

Article, study, work


Today, a fair number of systems are available for processing biological data. Developing effective systems is important because they can support both the research and the everyday work of biologists. Biological databases are large both in size and in number, so data processing technologies are required for the fast and effective management of content stored in databases such as MEDLINE. One possible approach to content management is to apply natural language processing methods. With our approach we aim to learn more about the interactions of human genes using full-sentence parsing. Given a sentence, the syntactic parser assigns to it a syntactic structure, which consists of a set of labelled links connecting pairs of words. The parser also produces a constituent representation of the sentence (showing noun phrases, verb phrases, and so on). Here we show experimentally that the biological interactions of genes can be predicted using the syntactic information of each abstract. Hence, it is worth developing an information extraction (IE) system that retrieves information about gene interactions using only the syntactic information contained in these texts. Our IE system can handle certain types of gene interactions with the help of machine learning (ML) methodologies (Hidden Markov Models, Artificial Neural Networks, Decision Trees, Support Vector Machines). The experiments and practical usage show that our system can provide a useful, intuitive guide for biological researchers in their investigations and in the design of their experiments.
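
As a rough illustration of the "set of labelled links connecting pairs of words" mentioned in the abstract, the sketch below stores such links as a small graph and collects the link labels on the shortest path between two gene mentions. It is only a hypothetical example: the `Link` type, the `label_path` helper, the labels `S` and `O`, and the sample sentence are all assumptions made here, and the abstract does not say that the authors derive their features this way.

```python
from collections import deque
from typing import NamedTuple


class Link(NamedTuple):
    """One labelled link between two word positions in a sentence."""
    left: int    # index of the left word
    right: int   # index of the right word
    label: str   # link label (the labels used below are hypothetical)


def label_path(links, start, goal):
    """Return the link labels on a shortest path from word `start` to word `goal`.

    Runs a breadth-first search over the undirected graph formed by the links.
    Returns None if the two words are not connected.
    """
    neighbours = {}
    for link in links:
        neighbours.setdefault(link.left, []).append((link.right, link.label))
        neighbours.setdefault(link.right, []).append((link.left, link.label))

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        word, labels = queue.popleft()
        if word == goal:
            return labels
        for nxt, label in neighbours.get(word, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, labels + [label]))
    return None


# Hypothetical sentence with hand-written links (not real parser output).
words = ["GENE_A", "activates", "GENE_B"]
links = [Link(0, 1, "S"), Link(1, 2, "O")]
path = label_path(links, 0, 2)
```

A label sequence like `path` is one plausible kind of purely syntactic input that could be handed to a classifier; whether the system described in the article uses anything similar is not stated in this record.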

Item Type: Article
Journal or Publication Title: Acta cybernetica
Date: 2008
Volume: 18
Number: 3
ISSN: 0324-721X
Page Range: pp. 391-402
Language: English
Place of Publication: Szeged
Event Title: Conference for PhD Students in Computer Science (5.) (2006) (Szeged)
Related URLs: http://acta.bibl.u-szeged.hu/38525/
Uncontrolled Keywords: Computer science, Cybernetics
Additional Information: Bibliography: pp. 401-402; summary in English
Subjects: 01. Natural sciences
01. Natural sciences > 01.02. Computer and information sciences
Date Deposited: 2016. Oct. 15. 12:25
Last Modified: 2022. Jun. 16. 14:40
URI: http://acta.bibl.u-szeged.hu/id/eprint/12826
