Mining high utility itemsets in massive transactional datasets

Thi, Vu Duc and Nguyen, Huy Duc: Mining high utility itemsets in massive transactional datasets. Acta cybernetica, (20) 2. pp. 341-346. (2011)

Article, study, work
actacyb_20_2_2011_6.pdf

Download (359kB)

Abstract

High Utility Itemset mining finds the itemsets in a transaction database whose utility exceeds a user-specified threshold. Existing High Utility Itemset mining algorithms run into several problems when applied to massive transactional datasets. One major problem is high memory dependency: the very large data structure they build is assumed to fit in the computer's main memory. This paper proposes a new disk-based High Utility Itemset mining algorithm, which achieves its efficiency by applying three new ideas. First, transactional data is converted into a new database layout called Transactional Array, which avoids repeated scans of the database during the mining phase. Second, for each frequent item, a relatively small independent tree is built to summarize co-occurrences. Finally, a simple, non-recursive mining process reduces memory requirements because only minimal candidate generation and counting are needed. We have tested our algorithm on several very large transactional databases, and the results show that it works efficiently.
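
To make the problem statement concrete, the following is a minimal brute-force sketch of what "high utility" means. It is not the disk-based algorithm proposed in the paper and does not use the Transactional Array layout or per-item trees. It assumes the commonly used definition in which the utility of an itemset in a transaction is the sum of quantity times unit profit over its items, added up over all transactions that contain the whole itemset. The toy data, the names itemset_utility and high_utility_itemsets, and the use of >= for the threshold comparison are illustrative assumptions.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"a": 1, "b": 1, "c": 3},
]

# Hypothetical per-unit profit of each item.
unit_profit = {"a": 5, "b": 2, "c": 1}


def itemset_utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over every transaction containing the whole itemset."""
    total = 0
    for transaction in transactions:
        if all(item in transaction for item in itemset):
            total += sum(transaction[item] * unit_profit[item] for item in itemset)
    return total


def high_utility_itemsets(transactions, unit_profit, min_utility):
    """Enumerate every itemset and keep those whose utility reaches min_utility.

    Exhaustive enumeration is exponential in the number of items; it only
    illustrates the definition and is unsuitable for large datasets.
    The >= comparison is an assumption about how the threshold is applied.
    """
    items = sorted({item for transaction in transactions for item in transaction})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, transactions, unit_profit)
            if utility >= min_utility:
                result[itemset] = utility
    return result


found = high_utility_itemsets(transactions, unit_profit, min_utility=15)
print(found)
```

On this toy data, the itemset ("a", "b") has utility 16: 1*5 + 2*2 = 9 from the first transaction plus 1*5 + 1*2 = 7 from the third. Every candidate itemset triggers a full pass over the transactions here, which is the kind of repeated scanning and candidate counting the abstract says the proposed algorithm is designed to keep to a minimum.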

Item Type: Article
Journal or Publication Title: Acta cybernetica
Date: 2011
Volume: 20
Number: 2
Page Range: pp. 341-346
ISSN: 0324-721X
Language: English
Heading title: Regular papers
DOI: https://doi.org/10.14232/actacyb.20.2.2011.6
Uncontrolled Keywords: Natural sciences, Computer science
Additional Information: Bibliography: p. 346; Abstract
Date Deposited: 2016. Oct. 15. 12:24
Last Modified: 2018. Jun. 05. 14:22
URI: http://acta.bibl.u-szeged.hu/id/eprint/12913
