Abstract
Calendar-based pattern mining aims to identify patterns that hold on specific calendar partitions, for example every Monday, the first working day of each month, or every holiday. Providing flexible mining capabilities for calendar-based partitions is especially challenging in a data stream scenario, because the calendar partitions of interest are not known a priori and only a subset of the detailed data is available at any point in time. We show how a data warehouse approach can be applied to this problem. The data warehouse keeps track of the frequent itemsets that hold on different partitions of the original stream and has low storage requirements. Nevertheless, it allows sets of patterns to be derived that are complete and precise. We demonstrate the effectiveness of our approach through a series of experiments.
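To make the idea of querying calendar partitions over stored frequent itemsets more concrete, the following is a minimal sketch in Python. It is not the DWFIST implementation and does not reproduce how the paper obtains complete and precise pattern sets. The per-day partitioning, the `warehouse` dictionary, and the functions `frequent_itemsets`, `load_day`, and `query` are all assumptions made for illustration. Each day of the stream is summarized by its transaction count and its frequent itemsets, the detailed transactions are then discarded, and a calendar partition such as "every Monday" is answered by combining the stored summaries.

```python
from collections import Counter
from datetime import date
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force count of itemsets up to max_size items.

    Keeps only itemsets whose count reaches min_support (a fraction)
    of the number of transactions. Illustrative only, not scalable.
    """
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            counts.update(combinations(items, k))
    threshold = min_support * len(transactions)
    return {s: c for s, c in counts.items() if c >= threshold}


# Hypothetical warehouse: one entry per day holding
# (number of transactions, frequent itemsets with their counts).
warehouse = {}


def load_day(day, transactions, min_support):
    """Summarize one day of the stream; the raw transactions are not kept."""
    warehouse[day] = (len(transactions),
                      frequent_itemsets(transactions, min_support))


def query(calendar_predicate, min_support):
    """Itemsets whose stored counts alone prove they are frequent on the
    union of all days selected by calendar_predicate.

    Stored counts are lower bounds: an itemset that was infrequent on
    some day contributes nothing for that day.
    """
    total = 0
    counts = Counter()
    for day, (n, itemsets) in warehouse.items():
        if calendar_predicate(day):
            total += n
            counts.update(itemsets)  # adds the stored counts
    threshold = min_support * total
    return {s: c for s, c in counts.items() if c >= threshold}


# Toy stream: two Mondays and one Tuesday (3 September 2007 was a Monday).
load_day(date(2007, 9, 3), [["a", "b"], ["a", "b"], ["a", "c"], ["b"]], 0.5)
load_day(date(2007, 9, 4), [["c"], ["c", "d"]], 0.5)
load_day(date(2007, 9, 10), [["a", "b"], ["a"], ["c"], ["a", "b"]], 0.5)

# Calendar partition "every Monday" (weekday() == 0 is Monday).
mondays = query(lambda d: d.weekday() == 0, 0.5)
```

In this simplified setting, any itemset that is frequent on the union of the selected days must be frequent on at least one of those days when the same support threshold is used, so the stored itemsets always contain every candidate. The summed counts, however, are only lower bounds, so this sketch reports the itemsets that are guaranteed to be frequent and may miss others.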
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Monteiro, R.S., Zimbrão, G., Schwarz, H., Mitschang, B., de Souza, J.M. (2007). DWFIST: Leveraging Calendar-Based Pattern Mining in Data Streams. In: Song, I.Y., Eder, J., Nguyen, T.M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2007. Lecture Notes in Computer Science, vol 4654. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74553-2_41
DOI: https://doi.org/10.1007/978-3-540-74553-2_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74552-5
Online ISBN: 978-3-540-74553-2
eBook Packages: Computer Science, Computer Science (R0)