DOI: 10.1145/2351476.2351500
short-paper

Mining probabilistic datasets vertically

Published: 08 August 2012

Abstract

As frequent pattern mining plays an important role in various real-life applications, it has been the subject of numerous studies. Most of these studies mine transactional datasets of precise data. However, there are situations in which data are uncertain. Over the past few years, Apriori-based, tree-based, and hyperlinked-array-structure-based mining algorithms have been proposed to mine frequent patterns from these probabilistic datasets of uncertain data. These algorithms view the datasets "horizontally" as collections of transactions, each of which records the set of items contained in that transaction. In this paper, we consider an alternative representation in which probabilistic datasets of uncertain data are viewed "vertically" as collections of vectors. The vector for each item indicates which transactions contain that item. We also propose an algorithm called U-VIPER to mine these probabilistic datasets "vertically" for frequent patterns.
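
To make the "vertical" view concrete, the following is a minimal sketch rather than the U-VIPER algorithm itself, whose internals the abstract does not describe. It assumes that each transaction maps items to existential probabilities, that an item's vector stores one probability per transaction (0.0 when the item is absent), and that an itemset is scored by expected support under item independence, a measure commonly used in uncertain-data mining. The function names `to_vertical` and `expected_support` and the sample data are hypothetical.

```python
from collections import defaultdict


def to_vertical(transactions):
    """Convert horizontal {item: probability} transactions into per-item vectors.

    Assumption: position i of an item's vector holds the probability that the
    item occurs in transaction i, or 0.0 if the item is absent from it.
    """
    n = len(transactions)
    vectors = defaultdict(lambda: [0.0] * n)
    for tid, txn in enumerate(transactions):
        for item, prob in txn.items():
            vectors[item][tid] = prob
    return dict(vectors)


def expected_support(vectors, itemset):
    """Sum over transactions of the product of the items' probabilities.

    Assumption: items within a transaction are independent, so the probability
    that the whole itemset occurs is the product of the per-item values.
    """
    n = len(next(iter(vectors.values())))
    total = 0.0
    for tid in range(n):
        p = 1.0
        for item in itemset:
            p *= vectors[item][tid]
        total += p
    return total


# Hypothetical probabilistic dataset with three transactions.
transactions = [
    {"a": 0.9, "b": 0.5},
    {"a": 0.6, "c": 0.8},
    {"b": 1.0, "c": 0.4},
]
vectors = to_vertical(transactions)
```

With vectors in hand, the support of a larger pattern can be computed by combining the vectors of its items position by position, without rescanning the transactions; this is the general appeal of vertical mining, independent of how U-VIPER realizes it.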

Published In

IDEAS '12: Proceedings of the 16th International Database Engineering & Applications Symposium
August 2012
261 pages
ISBN:9781450312349
DOI:10.1145/2351476

Sponsors

  • Charles University
  • BytePress
  • Concordia University

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. data mining
  2. frequent patterns
  3. probabilistic databases
  4. vertical data representation
  5. vertical mining

Qualifiers

  • Short-paper

Conference

IDEAS '12
Sponsor:
  • Charles University
  • Concordia University

Acceptance Rates

Overall Acceptance Rate 74 of 210 submissions, 35%

Cited By

  • (2022) Q-Eclat: Vertical Mining of Interesting Quantitative Patterns. Proceedings of the 26th International Database Engineered Applications Symposium. DOI: 10.1145/3548785.3548808 (25-33). Online publication date: 22-Aug-2022
  • (2021) Explainable Data Analytics for Disease and Healthcare Informatics. Proceedings of the 25th International Database Engineering & Applications Symposium. DOI: 10.1145/3472163.3472175 (65-74). Online publication date: 14-Jul-2021
  • (2021) UP-tree & UP-Mine. Engineering Applications of Artificial Intelligence, 106:C. DOI: 10.1016/j.engappai.2021.104477. Online publication date: 1-Nov-2021
  • (2020) Vertical Data Mining from Relational Data and Its Application to COVID-19 Data. Big Data Analyses, Services, and Smart Data. DOI: 10.1007/978-981-15-8731-3_8 (106-116). Online publication date: 11-Sep-2020
  • (2017) Social Media Mining. Proceedings of the 21st International Database Engineering & Applications Symposium. DOI: 10.1145/3105831.3105854 (20-29). Online publication date: 12-Jul-2017
  • (2015) Frequent Subgraph Mining from Streams of Uncertain Data. Proceedings of the Eighth International C* Conference on Computer Science & Software Engineering. DOI: 10.1145/2790798.2790799 (18-27). Online publication date: 13-Jul-2015
  • (2014) A machine learning approach for stock price prediction. Proceedings of the 18th International Database Engineering & Applications Symposium. DOI: 10.1145/2628194.2628211 (274-277). Online publication date: 7-Jul-2014
  • (2014) Uncertain Frequent Pattern Mining. Frequent Pattern Mining. DOI: 10.1007/978-3-319-07821-2_14 (339-367). Online publication date: 30-Aug-2014
  • (2013) Stream mining of frequent sets with limited memory. Proceedings of the 28th Annual ACM Symposium on Applied Computing. DOI: 10.1145/2480362.2480398 (173-175). Online publication date: 18-Mar-2013
  • (2013) Stream Mining of Frequent Patterns from Delayed Batches of Uncertain Data. Proceedings of the 15th International Conference on Data Warehousing and Knowledge Discovery - Volume 8057. DOI: 10.1007/978-3-642-40131-2_18 (209-221). Online publication date: 26-Aug-2013