Abstract
In modern enterprises, business processes (BPs) are realized over a mix of workflows, IT systems, Web services, and direct collaborations of people. Accordingly, process data (i.e., BP execution data such as logs containing events, interaction messages, and other process artifacts) are scattered across several systems and data sources and increasingly show all the typical properties of Big Data. Understanding process execution data is challenging because key business insights remain hidden in the interactions among process entities: most objects are interconnected, forming complex, heterogeneous, but often semi-structured networks. In the context of business processes, we consider the Big Data problem as a massive number of interconnected data islands from personal, shared, and business data. We present a framework to model process data as graphs, i.e., process graphs, and present abstractions to summarize the process graph and to discover concept hierarchies for entities based on both data objects and their interactions in process graphs. We present a language, namely BP-SPARQL, for the explorative querying and understanding of process graphs from various user perspectives. We have implemented a scalable architecture for querying, exploration, and analysis of process graphs. We report on experiments performed on both synthetic and real-world datasets that show the viability and efficiency of the approach.
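As a rough illustration of the idea of modeling process data as a graph and then summarizing it, the following minimal Python sketch builds a small process graph and collapses it by entity type. It is not BP-SPARQL syntax and not the authors' implementation; the event tuples, entity names, type labels, and the functions `build_process_graph` and `summarize_by_type` are all hypothetical and chosen only for this example.

```python
from collections import defaultdict

# Hypothetical process events: (source entity, relationship, target entity).
EVENTS = [
    ("alice", "submitted", "order-17"),
    ("order-17", "triggered", "invoice-9"),
    ("bob", "approved", "invoice-9"),
    ("alice", "submitted", "order-18"),
]

# Hypothetical entity types, used to group nodes when summarizing.
ENTITY_TYPE = {
    "alice": "person",
    "bob": "person",
    "order-17": "order",
    "order-18": "order",
    "invoice-9": "invoice",
}


def build_process_graph(events):
    """Return an adjacency map: node -> list of (relationship, neighbor)."""
    graph = defaultdict(list)
    for source, relationship, target in events:
        graph[source].append((relationship, target))
    return dict(graph)


def summarize_by_type(events, entity_type):
    """Count edges between entity types, collapsing individual nodes."""
    summary = defaultdict(int)
    for source, relationship, target in events:
        key = (entity_type[source], relationship, entity_type[target])
        summary[key] += 1
    return dict(summary)


graph = build_process_graph(EVENTS)
summary = summarize_by_type(EVENTS, ENTITY_TYPE)
```

Here `graph` keeps every entity and interaction, while `summary` is a coarser view in which all persons, orders, and invoices are merged into one node per type. This is only one simple way to picture a summarization step over a process graph.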
Copyright information
© 2022 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Beheshti, A., Benatallah, B., Motahari-Nezhad, H.R., Ghodratnama, S., Amouzgar, F. (2022). BP-SPARQL: A Query Language for Summarizing and Analyzing Big Process Data. In: Polyvyanyy, A. (eds) Process Querying Methods. Springer, Cham. https://doi.org/10.1007/978-3-030-92875-9_2
DOI: https://doi.org/10.1007/978-3-030-92875-9_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-92874-2
Online ISBN: 978-3-030-92875-9
eBook Packages: Computer Science, Computer Science (R0)