Abstract
The discovery of predictive models for process performance is an emerging topic. It is particularly difficult for complex and flexible processes, whose behaviour tends to change over time depending on context factors. We address this setting with a predictive-clustering approach, in which different context-related execution scenarios are equipped with separate prediction models. Recent methods for the discovery of both Predictive Clustering Trees (PCTs) and state-aware process performance predictors can be reused in the approach, provided that the input log is first converted into a suitable propositional form, based on the identification of an optimal subset of features for log traces. To make the approach more robust and parameter-free, we also introduce an ensemble-based clustering method, where multiple PCTs are learnt (each using a different, randomly selected subset of features) and integrated into an overall model. Several tests on real-life logs confirmed the validity of the approach.
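The ensemble idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several simplifying assumptions: each ensemble member is a one-level variance-reduction stump standing in for a full PCT, the members are combined by a simple co-association (majority agreement) rule, and each final cluster gets a constant mean predictor rather than a state-aware performance model. All function names, parameters, and the toy data are hypothetical.

```python
import random
from statistics import mean, pvariance

def best_stump(rows, targets, features):
    """Return (score, feature, threshold) minimising weighted target variance."""
    best = None
    for f in features:
        # Every distinct value except the largest is a candidate threshold,
        # so both branches are guaranteed to be non-empty.
        for t in sorted({r[f] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, targets) if r[f] <= t]
            right = [y for r, y in zip(rows, targets) if r[f] > t]
            score = len(left) * pvariance(left) + len(right) * pvariance(right)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best  # None when no feature in the subset can split the data

def base_partition(rows, targets, features):
    """Cluster labels from one stump: 0 = left branch, 1 = right branch."""
    split = best_stump(rows, targets, features)
    if split is None:
        return [0] * len(rows)
    _, f, t = split
    return [0 if r[f] <= t else 1 for r in rows]

def consensus_clusters(partitions, n, threshold=0.5):
    """Merge items that share a cluster in more than `threshold` of the partitions."""
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            agree = sum(p[i] == p[j] for p in partitions) / len(partitions)
            if agree > threshold:
                parent[find(i)] = find(j)
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]

def fit_ensemble(rows, targets, n_members=15, subset_size=2, seed=0):
    """Learn stumps on random feature subsets and merge them into one clustering."""
    rng = random.Random(seed)
    all_features = list(range(len(rows[0])))
    partitions = [
        base_partition(rows, targets, rng.sample(all_features, subset_size))
        for _ in range(n_members)
    ]
    labels = consensus_clusters(partitions, len(rows))
    # Assumption: one constant predictor (mean target) per consensus cluster.
    models = {c: mean(y for y, l in zip(targets, labels) if l == c)
              for c in set(labels)}
    return labels, models

# Hypothetical propositional encoding of four traces (three numeric context
# features each) with a performance target such as remaining time.
rows = [[1, 0, 3], [2, 1, 4], [8, 5, 9], [9, 6, 10]]
targets = [10.0, 12.0, 100.0, 98.0]
labels, models = fit_ensemble(rows, targets)
print(labels, models)
```

In this toy data every feature separates the two execution scenarios, so all ensemble members agree. On real logs the members would disagree, and the agreement threshold then governs how aggressively clusters are merged.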
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Folino, F., Guarascio, M., Pontieri, L. (2013). Context-Aware Predictions on Business Processes: An Ensemble-Based Solution. In: Appice, A., Ceci, M., Loglisci, C., Manco, G., Masciari, E., Ras, Z.W. (eds) New Frontiers in Mining Complex Patterns. NFMCP 2012. Lecture Notes in Computer Science, vol 7765. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37382-4_15
DOI: https://doi.org/10.1007/978-3-642-37382-4_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37381-7
Online ISBN: 978-3-642-37382-4
eBook Packages: Computer Science (R0)