On Mining Proportional Fault-Tolerant Frequent Itemsets

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8421)

Abstract

Mining robust frequent itemsets has attracted much attention because of its wide applicability to noisy data. In this paper, we study the problem of mining proportional fault-tolerant frequent itemsets in a large transactional database. A fault-tolerant frequent itemset tolerates a small number of errors in each item and in each supporting transaction. The problem is challenging because the anti-monotone property does not hold for candidate generation, and fault-tolerant support counting is known to be NP-hard. We propose techniques that substantially speed up the state-of-the-art algorithm for the problem. We also develop an efficient heuristic method that solves an approximate version of the problem. Our experimental results show that the proposed speedup techniques are effective. In addition, our heuristic algorithm is much faster than the exact algorithms while keeping the error acceptable.
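
To make the notion of "a small number of errors in each item and in each supporting transaction" concrete, the following is a minimal sketch of how a candidate supporting set might be verified. It assumes the common proportional formulation, in which each transaction may miss at most a fraction of the itemset's items and each item may be absent from at most a fraction of the supporting transactions. The function name `is_ft_supporting_set` and the parameters `row_eps` and `col_eps` are hypothetical and are not taken from the paper. The sketch only checks a given set of transactions; it does not search for the largest such set, which is the part the abstract describes as NP-hard.

```python
from typing import FrozenSet, Sequence


def is_ft_supporting_set(
    itemset: FrozenSet[str],
    transactions: Sequence[FrozenSet[str]],
    row_eps: float,
    col_eps: float,
) -> bool:
    """Check whether `transactions` tolerably supports `itemset`.

    Assumed (hypothetical) proportional constraints:
      * each transaction misses at most row_eps * |itemset| items;
      * each item is absent from at most col_eps * |transactions| transactions.
    """
    if not itemset or not transactions:
        return False

    k = len(itemset)
    n = len(transactions)

    # Transaction-wise constraint: bound the items missing from each transaction.
    for t in transactions:
        if len(itemset - t) > row_eps * k:
            return False

    # Item-wise constraint: bound the transactions that lack each item.
    for item in itemset:
        absent = sum(1 for t in transactions if item not in t)
        if absent > col_eps * n:
            return False

    return True


# Illustrative data: three transactions each miss one item of {a, b, c}.
X = frozenset({"a", "b", "c"})
T = [
    frozenset({"a", "b"}),
    frozenset({"b", "c"}),
    frozenset({"a", "c"}),
    frozenset({"a", "b", "c"}),
]
```

With these illustrative values, `is_ft_supporting_set(X, T, 0.34, 0.25)` accepts the set, while setting either tolerance to zero rejects it, since no exact match of all four transactions exists.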

This work was fully supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China [Project No. CityU 122512].

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Liu, S., Poon, C.K. (2014). On Mining Proportional Fault-Tolerant Frequent Itemsets. In: Bhowmick, S.S., Dyreson, C.E., Jensen, C.S., Lee, M.L., Muliantara, A., Thalheim, B. (eds) Database Systems for Advanced Applications. DASFAA 2014. Lecture Notes in Computer Science, vol 8421. Springer, Cham. https://doi.org/10.1007/978-3-319-05810-8_23

  • DOI: https://doi.org/10.1007/978-3-319-05810-8_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-05809-2

  • Online ISBN: 978-3-319-05810-8

  • eBook Packages: Computer Science, Computer Science (R0)