UGMINE: utility-based graph mining

Applied Intelligence

Abstract

Frequent pattern mining extracts the most frequent patterns from databases. In many cases, however, frequency-based frameworks cannot adequately represent users’ interests: in business decision-making, for example, not all patterns are equally important. To address this limitation, utility has been incorporated into the mining of transactional and sequential databases. A graph is a relatively complex but highly useful data structure. Although frequency-based graph mining has many real-life applications, it shares the limitations of other frequency-based frameworks. To the best of our knowledge, no complete framework has been developed for mining utility-based patterns from graphs. In this work, we propose such a framework and present a complete algorithm, named UGMINE, for high utility subgraph mining. We introduce a pruning technique, named RMU pruning, to effectively prune the candidate pattern search space, which grows exponentially. We conduct experiments on various datasets to analyze the performance of the algorithm. Our experimental results show that UGMINE is effective at extracting high utility subgraph patterns.
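To make the contrast between frequency and utility concrete, the following minimal sketch uses the transactional setting mentioned above, where the utility of an itemset in a transaction is commonly taken as the sum of quantity times unit profit over its items. It does not implement UGMINE or RMU pruning, and it makes no claim about how the paper defines utility on graphs. The toy database, the profit table, and the function names `support` and `utility` are all hypothetical.

```python
from typing import Dict, FrozenSet, List

# Hypothetical toy database: each transaction maps an item to its purchased quantity.
transactions: List[Dict[str, int]] = [
    {"bread": 2, "milk": 1},
    {"bread": 1, "milk": 2},
    {"bread": 3},
    {"caviar": 1, "milk": 1},
]

# Hypothetical unit profit per item (the "external utility").
unit_profit: Dict[str, int] = {"bread": 1, "milk": 2, "caviar": 50}


def support(itemset: FrozenSet[str], db: List[Dict[str, int]]) -> int:
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in db if itemset.issubset(t))


def utility(itemset: FrozenSet[str], db: List[Dict[str, int]],
            profit: Dict[str, int]) -> int:
    """Sum of quantity * unit profit over all transactions containing the itemset."""
    total = 0
    for t in db:
        if itemset.issubset(t):
            total += sum(t[item] * profit[item] for item in itemset)
    return total


bread = frozenset({"bread"})
caviar = frozenset({"caviar"})

# A frequency-based miner ranks bread above caviar ...
print(support(bread, transactions), support(caviar, transactions))
# ... while a utility-based miner ranks caviar above bread.
print(utility(bread, transactions, unit_profit),
      utility(caviar, transactions, unit_profit))
```

In this toy data, the frequent itemset carries little profit while the rare one carries most of it, which is the kind of mismatch that motivates utility-based mining.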

Notes

  1. https://pubchem.ncbi.nlm.nih.gov/

Acknowledgements

We would like to express our deep gratitude to the anonymous reviewers of this article. We believe their useful comments have played a significant role in improving the quality of this work, which was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) and the University of Manitoba.

Author information

Corresponding author

Correspondence to Chowdhury Farhan Ahmed.

Cite this article

Alam, M.T., Roy, A., Ahmed, C.F. et al. UGMINE: utility-based graph mining. Appl Intell 53, 49–68 (2023). https://doi.org/10.1007/s10489-022-03385-8
