Review Papers | Computer Science & Engineering | India | Volume 8 Issue 1, January 2019
A Comparative Analysis of Various Algorithms for High Utility Itemset Mining
Abstract: Frequent pattern mining has been an important topic since the concept of frequent itemsets was first introduced by Agrawal et al. Given a dataset of transactions, frequent pattern mining finds the itemsets whose support (i.e., the percentage of transactions containing the itemset) is no less than a given minimum support threshold. However, frequent pattern mining considers neither the number of occurrences of an item in a transaction nor the importance of an item. Itemsets with more occurrences or greater importance may be more interesting to users, since they may bring more profit. In light of this, high utility itemset mining has been studied [9, 15, 42, 35]. In high utility itemset mining, the term utility refers to the importance of an itemset, e.g., the total profit the itemset brings. An itemset is a High Utility Itemset (HUI) if its utility is no less than a given minimum utility threshold. High utility itemset mining focuses on the utility values in the dataset, which are usually related to profits for the business and are therefore of direct interest to business owners. For example, supermarkets use frequent itemset mining to find items that customers usually buy together, so as to make recommendations to customers. With high utility itemset mining, supermarkets can recommend not only the items people usually buy together, but also the items that will lead to more profit for the store. Most frequent pattern mining algorithms prune itemsets at an early stage based on the popular Apriori property: every sub-pattern of a frequent pattern must itself be frequent (also called the downward closure property). However, this property does not hold in high utility itemset mining, which makes mining high utility itemsets more challenging. The state-of-the-art approaches achieve good performance when the dataset is relatively small. However, the volume of data can grow faster than expected, to the point that a single machine may not be able to handle it.
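The definitions in the abstract can be illustrated with a minimal sketch. The toy transactions, unit profits, threshold, and function names below are hypothetical and are not taken from the paper. The sketch uses a common formulation in which an item's utility in a transaction is its quantity times its unit profit, and it enumerates itemsets by brute force rather than with any of the algorithms the paper compares.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 5},
    {"a": 2, "c": 1},
    {"b": 4, "c": 3},
    {"a": 1, "b": 2, "c": 1},
]
# Hypothetical unit profit per item.
unit_profit = {"a": 5, "b": 1, "c": 2}


def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    hits = sum(1 for t in transactions if all(i in t for i in itemset))
    return hits / len(transactions)


def utility(itemset, transactions, unit_profit):
    """Sum of quantity * unit profit over the transactions containing the itemset."""
    total = 0
    for t in transactions:
        if all(i in t for i in itemset):
            total += sum(t[i] * unit_profit[i] for i in itemset)
    return total


def high_utility_itemsets(transactions, unit_profit, min_utility):
    """Brute-force enumeration of all itemsets; suitable for toy data only."""
    items = sorted(unit_profit)
    result = {}
    for k in range(1, len(items) + 1):
        for itemset in combinations(items, k):
            u = utility(itemset, transactions, unit_profit)
            if u >= min_utility:
                result[itemset] = u
    return result


if __name__ == "__main__":
    for itemset, u in high_utility_itemsets(transactions, unit_profit, 15).items():
        print(itemset, u, support(itemset, transactions))
```

On this toy data with a minimum utility of 15, the itemset {a, b} qualifies with utility 17, while its subset {b} has utility 11 and does not. This is a small instance of downward closure failing for utility, even though {b} has higher support (0.75) than {a, b} (0.5).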
Keywords: RUP/FRUP-GROWTH algorithm, HUI, data mining, apriori, big data
Edition: Volume 8 Issue 1, January 2019
Pages: 2097 - 2100