Abstract
This paper proposes an approach for representing and analyzing the content of workflow logs in a data warehouse (DWH). Analyzing workflow logs raises one major problem: the underlying workflow model typically contains loops (frequently interleaving ones), often implemented with goto statements. These structures increase the number of possible execution paths significantly, in theory without bound. A naive DWH implementation would represent all possible execution paths by means of a dimension; however, this would lead to a huge or even infinite number of dimension elements. We present a novel approach for analyzing workflow logs that include loops and goto statements.
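To illustrate why loops make a path dimension impractical, the following minimal sketch counts the distinct execution paths through a small workflow graph. The graph, the node names, and the function `count_paths` are hypothetical and chosen only for illustration; they are not taken from the paper. The count is bounded by a maximum number of transitions, since without such a bound the number of paths through a cyclic workflow is infinite.

```python
from collections import defaultdict

# Hypothetical workflow (an assumption, not from the paper):
# A -> B -> C -> D -> E, plus two goto-style back edges C -> B and D -> B.
EDGES = {
    "A": ["B"],
    "B": ["C"],
    "C": ["B", "D"],  # C -> B is a loop (goto) edge
    "D": ["B", "E"],  # D -> B is a second loop edge
    "E": [],          # terminal activity
}


def count_paths(edges, start, end, max_steps):
    """Count distinct paths from start to end that use at most max_steps transitions.

    Assumes `end` has no outgoing edges, so every arrival is a complete path.
    """
    # reach[node] = number of distinct paths of the current length ending in node
    reach = {start: 1}
    total = 0
    for _ in range(max_steps):
        nxt = defaultdict(int)
        for node, ways in reach.items():
            for succ in edges[node]:
                nxt[succ] += ways
        total += nxt.get(end, 0)
        reach = nxt
    return total


if __name__ == "__main__":
    for bound in (4, 6, 10, 20, 40):
        print(bound, count_paths(EDGES, "A", "E", bound))
```

In this sketch, the number of complete paths keeps growing as the bound on transitions is raised, whereas the same workflow without the two back edges has exactly one path. A dimension that enumerates every path would therefore need a new element for each additional loop iteration pattern observed in the log.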
Notes
1. PHP PEG was developed by Hamish Friedlander. Available at: https://github.com/hafriedlander/php-peg.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Koncilia, C., Pichler, H., Wrembel, R. (2015). A Generic Data Warehouse Architecture for Analyzing Workflow Logs. In: Morzy, T., Valduriez, P., Bellatreche, L. (eds) Advances in Databases and Information Systems. ADBIS 2015. Lecture Notes in Computer Science, vol 9282. Springer, Cham. https://doi.org/10.1007/978-3-319-23135-8_8
DOI: https://doi.org/10.1007/978-3-319-23135-8_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23134-1
Online ISBN: 978-3-319-23135-8
eBook Packages: Computer Science (R0)