Finding Patterns in Large Star Schemas at the Right Aggregation Level

Silva, Andreia; Antunes, Cláudia

doi:10.1007/978-3-642-34620-0_30

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7647)

Included in the following conference series:

International Conference on Modeling Decisions for Artificial Intelligence

Abstract

Many stand-alone algorithms exist for mining different types of patterns in traditional databases. However, mining databases with larger and more complex tables both effectively and efficiently remains a challenge in data mining. Streaming techniques are a promising way to handle large amounts of data, since they are designed to avoid multiple scans and to optimize memory usage. In this paper we present in detail an algorithm, based on streaming techniques, for finding frequent patterns in large databases that follow a star schema. It can mine traditional star schemas as well as stars with degenerate dimensions. It can also aggregate the rows of the fact table that relate to the same business fact, and therefore find patterns at the right business level. Experimental results show that the algorithm is accurate and performs better than the traditional approach.
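The idea of aggregating fact rows that belong to the same business fact can be illustrated with a small sketch. This is not the streaming algorithm described in the paper: it is a plain in-memory count, shown only to make the aggregation level concrete. All table names, column names, and the helper functions `aggregate_business_facts` and `frequent_itemsets` are hypothetical, and `order_id` is assumed to play the role of the degenerate dimension.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical dimension tables: key -> attribute values.
products = {
    1: {"category": "dairy"},
    2: {"category": "bakery"},
    3: {"category": "produce"},
}
customers = {
    10: {"segment": "retail"},
    11: {"segment": "business"},
}
dimensions = {"product": products, "customer": customers}

# Hypothetical fact table: one row per product sold. "order_id" is the
# assumed degenerate dimension identifying the business fact (the order).
fact_rows = [
    {"order_id": "A", "product": 1, "customer": 10},
    {"order_id": "A", "product": 2, "customer": 10},
    {"order_id": "B", "product": 1, "customer": 11},
    {"order_id": "B", "product": 2, "customer": 11},
    {"order_id": "C", "product": 3, "customer": 10},
]


def aggregate_business_facts(rows, dims, fact_key):
    """Merge all fact rows sharing `fact_key` into one set of items.

    Each item is a string "dimension.attribute=value" obtained by
    following the row's foreign keys into the dimension tables.
    """
    transactions = defaultdict(set)
    for row in rows:
        items = transactions[row[fact_key]]
        for dim_name, table in dims.items():
            for attr, value in table[row[dim_name]].items():
                items.add(f"{dim_name}.{attr}={value}")
    return dict(transactions)


def frequent_itemsets(transactions, min_support, max_size=2):
    """Count itemsets of up to `max_size` items by brute force and keep
    those present in at least `min_support` (a fraction) of transactions."""
    counts = defaultdict(int)
    for items in transactions.values():
        ordered = sorted(items)
        for size in range(1, max_size + 1):
            for combo in combinations(ordered, size):
                counts[combo] += 1
    threshold = min_support * len(transactions)
    return {combo: n for combo, n in counts.items() if n >= threshold}


transactions = aggregate_business_facts(fact_rows, dimensions, "order_id")
patterns = frequent_itemsets(transactions, min_support=0.6)
for itemset, count in sorted(patterns.items()):
    print(itemset, count)
```

In this toy data no single fact row contains both the dairy and the bakery category, so the pair can only be found once the rows of each order are merged. That is the sense in which the aggregation level determines which patterns are visible.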

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Instituto Superior Técnico, Technical University of Lisbon, Lisbon, Portugal
Andreia Silva & Cláudia Antunes

Editor information

Editors and Affiliations

IIIA-CSIC, Campus UAB s/n, 08193, Bellaterra, Catalonia, Spain
Vicenç Torra
Toho Gakuen, 3-1-10, Naka, 186-0004, Kunitachi, Tokyo, Japan
Yasuo Narukawa
Universitat de Girona, Campus Montilivi, building EPS-4, 17071, Girona, Spain
Beatriz López & Mateu Villaret

About this paper

Cite this paper

Silva, A., Antunes, C. (2012). Finding Patterns in Large Star Schemas at the Right Aggregation Level. In: Torra, V., Narukawa, Y., López, B., Villaret, M. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2012. Lecture Notes in Computer Science, vol 7647. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34620-0_30

DOI: https://doi.org/10.1007/978-3-642-34620-0_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34619-4
Online ISBN: 978-3-642-34620-0
eBook Packages: Computer Science, Computer Science (R0)
