Developing provenance-aware query systems: an occurrence-centric approach

  • Regular Paper
  • Knowledge and Information Systems

Abstract

In recent years, research on provenance has increased exponentially, and such studies in the field of business process monitoring have been especially remarkable. Business process monitoring deals with recording information about the actual execution of processes to then extract valuable knowledge that can be utilized for business process quality improvement. In prior research, we developed an occurrence-centric approach built on our notion of occurrence that provides a holistic perspective of system dynamics. Based on this concept, more complex structures are defined herein, namely Occurrence Base (OcBase) and Occurrence Management System (OcSystem), which serve as scaffolding to develop business process monitoring systems. This paper focuses primarily on the critical provenance task of extracting valuable knowledge from such systems by proposing an Occurrence Query Framework that includes the definition of an Occurrence Base Metamodel and an Occurrence Query Language based on this metamodel. Our framework provides a way of working for the construction of business process monitoring systems that are provenance aware. As a proof of concept, a tool implementing the various components of the framework is presented. This tool has been tested against a real system in the context of biobanks.

Notes

  1. A state represents a condition that an object may be in, whereas an active state represents the actual period during which an object remains in a particular state.

Acknowledgments

This work has been partially supported by the Spanish Ministry of Science and Innovation, project SMOTY (IPT-2011-1328-390000), the Univ. of La Rioja, project APPI15/02, and the Univ. of Zaragoza, project UZ2015-TEC-05.

Author information

Corresponding author

Correspondence to Beatriz Pérez.

Appendices

Appendix 1: OcQuery Language syntax

Next, we present the syntax of our OcQuery Language.

figure r
figure s

Appendix 2: Generalized OcQuery Language syntax

In this section, we present the syntax of the Generalized OcQuery Language. We show in bold text the modified or added elements with respect to the OcQuery Language syntax.

figure t
figure u

About this article

Cite this article

Domínguez, E., Pérez, B., Rubio, Á.L. et al. Developing provenance-aware query systems: an occurrence-centric approach. Knowl Inf Syst 50, 661–688 (2017). https://doi.org/10.1007/s10115-016-0950-z

  • DOI: https://doi.org/10.1007/s10115-016-0950-z
