Extracting Non-redundant Correlated Purchase Behaviors by Utility Measure

Gan, Wensheng; Lin, Jerry Chun-Wei; Fournier-Viger, Philippe; Chao, Han-Chieh

doi:10.1007/978-3-319-64283-3_32


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10440)

Included in the following conference series:

International Conference on Big Data Analytics and Knowledge Discovery


Abstract

In the high-utility itemset mining (HUIM) model, a low-utility pattern combined with a very high-utility pattern may be considered valuable even if the resulting behavior is not highly correlated. A more intelligent system that provides non-redundant and correlated behaviors based on a utility measure is therefore desirable. In this paper, we present a novel method, called extracting non-redundant correlated purchase behaviors by utility measure, to identify high-quality patterns, which can lead to higher recall and better precision. In the proposed projection-based approach, an efficient projection mechanism and a sorted downward closure property are developed to reduce the database size. Two pruning strategies are further developed to discover the desired patterns efficiently and effectively. An extensive experimental study showed that the proposed algorithm considerably outperforms existing HUIM algorithms.
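The problem described in the abstract can be illustrated with a small sketch. It is not the paper's projection-based algorithm: it is a brute-force enumeration over a hypothetical toy database, and it assumes the Kulczynski measure as the correlation measure, which the abstract does not name. The item names, unit profits, thresholds, and function names are all invented for illustration.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"diamond": 1, "bread": 2},
    {"bread": 3, "milk": 1},
    {"bread": 1, "milk": 2},
    {"bread": 2, "milk": 1},
    {"diamond": 1},
]
# Hypothetical unit profit of each item.
unit_profit = {"diamond": 100, "bread": 1, "milk": 2}


def support(itemset, db):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in db if all(i in t for i in itemset))


def utility(itemset, db, profit):
    """Sum of quantity * unit profit over the transactions containing the itemset."""
    return sum(
        sum(t[i] * profit[i] for i in itemset)
        for t in db
        if all(i in t for i in itemset)
    )


def kulc(itemset, db):
    """Kulczynski measure (assumed here): mean of sup(X) / sup({i}) for i in X."""
    sup_x = support(itemset, db)
    return sum(sup_x / support([i], db) for i in itemset) / len(itemset)


def correlated_high_utility(db, profit, min_util, min_cor):
    """Brute-force search for itemsets (size >= 2) meeting both thresholds."""
    items = sorted({i for t in db for i in t})
    found = []
    for k in range(2, len(items) + 1):
        for x in combinations(items, k):
            if support(x, db) == 0:
                continue  # itemset never occurs, nothing to report
            u, c = utility(x, db, profit), kulc(x, db)
            if u >= min_util and c >= min_cor:
                found.append((x, u, c))
    return found


# {bread, diamond} has high utility (102) only because of the diamond,
# but its correlation is weak (0.375), so it is filtered out.
print(correlated_high_utility(transactions, unit_profit, min_util=10, min_cor=0.5))
```

With these made-up numbers, a utility threshold alone would report {bread, diamond} because a single expensive item dominates its utility, while the added correlation threshold keeps only {bread, milk}, whose items are actually bought together in most transactions.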



Acknowledgments

This research was partially supported by the National Natural Science Foundation of China (NSFC) under grant No. 61503092 and by the Tencent Project under grant CCF-Tencent IAGR20160115.

Author information

Authors and Affiliations

School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen, China
Wensheng Gan, Jerry Chun-Wei Lin & Han-Chieh Chao
School of Natural Sciences and Humanities, Harbin Institute of Technology (Shenzhen), Shenzhen, China
Philippe Fournier-Viger
Department of Computer Science and Information Engineering, National Dong Hwa University, Hualien, Taiwan
Han-Chieh Chao


Corresponding author

Correspondence to Jerry Chun-Wei Lin.

Editor information

Editors and Affiliations

LIAS/ISAE-ENSMA, Chasseneuil, France
Ladjel Bellatreche
University of Texas at Arlington, Arlington, Texas, USA
Sharma Chakravarthy


About this paper

Cite this paper

Gan, W., Lin, J.CW., Fournier-Viger, P., Chao, HC. (2017). Extracting Non-redundant Correlated Purchase Behaviors by Utility Measure. In: Bellatreche, L., Chakravarthy, S. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2017. Lecture Notes in Computer Science, vol 10440. Springer, Cham. https://doi.org/10.1007/978-3-319-64283-3_32


DOI: https://doi.org/10.1007/978-3-319-64283-3_32
Published: 03 August 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-64282-6
Online ISBN: 978-3-319-64283-3
eBook Packages: Computer Science, Computer Science (R0)
