Impact Statement:
This article contributes a utility-based targeted pattern discovery model for artificial intelligence and data science. To the best of our knowledge, it is the first article to propose a realistic utility-based solution for discovering targeted patterns, rather than all patterns, in real-world datasets. The designed TargetUM method can serve as a benchmark for target-based utility mining. The proposed method addresses several challenges and achieves state-of-the-art performance on massive datasets. TargetUM can provide acceptable querying performance. The targeted utility mining problem formulation and the efficient algorithm contribute to artificial intelligence systems in many applications, such as market basket analysis, risk prediction, smart retail, and intrusion detection.
Abstract:
Traditional high-utility itemset mining (HUIM) aims to determine all high-utility itemsets (HUIs) that satisfy the minimum utility threshold in transaction databases. However, in most applications, not all HUIs are interesting because only specific parts are required. Thus, targeted mining based on user preferences is more important than traditional mining tasks. This article is the first to propose a target-based HUIM problem and to provide a clear formulation of the targeted utility mining task in a quantitative transaction database. A tree-based algorithm, Target-based high-Utility iteMset querying (TargetUM), is proposed. The algorithm uses a lexicographic querying tree and three effective pruning strategies to improve mining efficiency. We conducted experiments on several real and synthetic databases, and the results demonstrate that TargetUM is complete and correct and that its performance is satisfactory. Finally, owing to the lexicographic querying tree, the database no longer needs to be scanned repeatedly for multiple queries.
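To make the task concrete, the following is a minimal brute-force sketch of a targeted high-utility query. It is not the TargetUM algorithm: it uses no lexicographic querying tree and no pruning, and it rescans the database for every candidate. It assumes that a targeted query asks for all itemsets that contain a user-specified target itemset and whose utility (quantity times unit utility, summed over the transactions containing the itemset) meets the minimum utility threshold. The toy database, the unit utilities, and the function names are hypothetical.

```python
from itertools import combinations

# Hypothetical quantitative transaction database: item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 2, "d": 1},
    {"a": 1, "b": 1, "d": 2},
]
# Hypothetical unit utilities (for example, profit per unit).
unit_utility = {"a": 5, "b": 2, "c": 1, "d": 4}


def itemset_utility(itemset, database, utilities):
    """Sum the utility of `itemset` over every transaction that contains it."""
    total = 0
    for tx in database:
        if all(item in tx for item in itemset):
            total += sum(tx[item] * utilities[item] for item in itemset)
    return total


def targeted_huis(target, min_util, database, utilities):
    """Brute force: enumerate every superset of `target` and keep those
    whose utility is at least `min_util` (assumed query semantics)."""
    target = frozenset(target)
    items = sorted({item for tx in database for item in tx})
    others = [item for item in items if item not in target]
    result = {}
    for k in range(len(others) + 1):
        for extra in combinations(others, k):
            candidate = target | frozenset(extra)
            utility = itemset_utility(candidate, database, utilities)
            if utility >= min_util:
                result[tuple(sorted(candidate))] = utility
    return result


result = targeted_huis({"a"}, 15, transactions, unit_utility)
for itemset, utility in sorted(result.items()):
    print(itemset, utility)
```

Because this sketch enumerates every superset of the target, its cost grows exponentially with the number of items, which is the kind of overhead that a tree structure and pruning strategies are meant to avoid.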
Published in: IEEE Transactions on Artificial Intelligence (Volume: 4, Issue: 4, August 2023)