Improving structural medical process comparison by exploiting domain knowledge and mined information

https://doi.org/10.1016/j.artmed.2014.07.001

Highlights

  • Our framework supports process mining and process comparison in medicine.

  • Process comparison exploits domain knowledge and all available mined information.

  • Tests in stroke management show that, compared with previously published metrics, our approach generates outputs closer to those of a stroke management expert; the framework can also support experts in answering key research questions.

Abstract

Objectives

Process model comparison and similar process retrieval is a key issue to be addressed in many real-world situations, and a particularly relevant one in medical applications, where similarity quantification can be exploited to accomplish goals such as conformance checking, local process adaptation analysis, and hospital ranking. In this paper, we present a framework that allows the user to: (i) mine the actual process model from a database of process execution traces available at a given hospital; and (ii) compare (mined) process models. The tool is currently being applied in stroke management.

Methods

Our framework relies on process mining to extract process-related information (i.e., process models) from data. As for process comparison, we have modified a state-of-the-art structural similarity metric by exploiting: (i) domain knowledge; (ii) process mining outputs and statistical temporal information. These changes were meant to make the metric more suited to the medical domain.

Results

Experimental results showed that our metric outperforms the original one, and generates output closer to that provided by a stroke management expert. In particular, our metric correctly rated 11 out of 15 mined hospital models with respect to a given query, whereas the original metric correctly rated only 7 out of 15. The experiments also showed that the framework can support stroke management experts in answering key research questions: in particular, average patient improvement decreased as the distance (according to our metric) from the top level hospital process model increased.

Conclusions

The paper shows that process mining and process comparison, through a similarity metric tailored to medical applications, can be applied successfully to clinical data to gain a better understanding of different medical processes adopted by different hospitals, and of their impact on clinical outcomes. In the future, we plan to make our metric even more general and efficient, by explicitly considering various methodological and technological extensions. We will also test the framework in different domains.

Introduction

Process model comparison and similar process retrieval is a key issue to be addressed in many real-world situations. For example, when two companies are merged, process engineers need to compare processes originating from the two companies, in order to analyze their possible overlaps and to identify areas for consolidation. Moreover, large companies build huge process model repositories over time, which serve as a knowledge base for their ongoing process management/enhancement efforts. Before adding a new process model to the repository, process engineers have to check that a similar model does not already exist, in order to prevent duplication.

Particularly interesting is the case of medical process model comparison, where similarity quantification can also be exploited from a conformance checking perspective. Indeed, the process model actually implemented at a given healthcare organization can be compared to the existing reference clinical guideline, to check conformance and/or to understand the level of adaptation to local constraints that may have been required. As a matter of fact, local resource constraints may lead to differences between the models implemented at different hospitals, even when they refer to the treatment of the same disease (and to the same guideline). A quantification of these differences (and possibly a ranking of the hospitals derived from it) can be exploited for several purposes, such as administration, performance evaluation and public funding distribution.

The actual medical process models are not always explicitly available at the healthcare organization. However, a database of process execution traces (also called the “event log”) can often be reconstructed starting from data that hospitals collect through their information systems (in the best case by means of workflow technology).

In this case, process mining techniques [1] can be exploited, to extract process models from event log data. Stemming from these considerations, in this work we present a framework, which allows the user to:

  1. extract the actual process model from the available medical process execution traces, through process mining techniques;

  2. perform medical process model comparison, to fulfill the objectives described above.
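As a rough illustration of item 1, the sketch below derives a simple directly-follows graph from a handful of execution traces. It is a deliberately simplified stand-in and not the mining algorithm used in the framework; the traces, the activity names, the function names and the `min_count` threshold are all hypothetical.

```python
from collections import Counter


def mine_directly_follows(traces):
    """Count how often activity a is immediately followed by activity b."""
    edges = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            edges[(a, b)] += 1
    return edges


def filter_edges(edges, min_count=2):
    """Keep only the edges observed at least min_count times (assumed threshold)."""
    return {edge: count for edge, count in edges.items() if count >= min_count}


# Hypothetical event log: one list of activities per patient.
traces = [
    ["admission", "ct_scan", "thrombolysis", "discharge"],
    ["admission", "ct_scan", "antiplatelet", "discharge"],
    ["admission", "ct_scan", "thrombolysis", "discharge"],
]

dfg = mine_directly_follows(traces)
model = filter_edges(dfg, min_count=2)

for (a, b), count in sorted(model.items()):
    print(f"{a} -> {b} (seen {count} times)")
```

The edge counts retained here are one example of mined information that a comparison metric could later take into account, alongside the graph structure itself.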

Item 2 has required the introduction of proper metrics, in order to quantify process model similarity. We could rely on an extensive literature when studying this topic (see Section 4). In particular, since process mining extracts the process model in the form of a graph, our work is located in the research stream on graph structural similarity, and on graph edit distance-based approaches [2], [3]. The state of the art in structural similarity of process models is represented by the work by Dijkman et al. [2]. Specifically, we have extended the work in [2], by:

  • exploiting domain knowledge;

  • exploiting process mining outputs and statistical temporal information.
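To make the first extension point concrete, the following sketch contrasts a "blind" edit-style distance with one that uses a small activity taxonomy to price node substitutions. It is only an illustration under stated assumptions: the formula is neither the metric of [2] nor our extension of it, the three components are weighted equally for simplicity, the node mapping is supplied by hand rather than searched for, and the taxonomy, activity names and function names are hypothetical.

```python
# Hypothetical activity taxonomy (child -> parent); not taken from the paper.
TAXONOMY = {
    "ct_scan": "imaging",
    "mri": "imaging",
    "imaging": "diagnostics",
    "blood_test": "diagnostics",
    "diagnostics": "action",
    "thrombolysis": "therapy",
    "therapy": "action",
}


def ancestors(node):
    """Return the path [node, parent, ..., root] in the taxonomy."""
    path = [node]
    while path[-1] in TAXONOMY:
        path.append(TAXONOMY[path[-1]])
    return path


def depth(node):
    """Depth counted in nodes, so a root has depth 1."""
    return len(ancestors(node))


def taxonomy_distance(a, b):
    """1 minus a Wu-Palmer-style similarity: 0 if identical, 1 if unrelated."""
    path_b = set(ancestors(b))
    common = next((x for x in ancestors(a) if x in path_b), None)
    if common is None:
        return 1.0
    return 1.0 - 2.0 * depth(common) / (depth(a) + depth(b))


def label_distance(a, b):
    """Domain-independent baseline: any label mismatch costs 1."""
    return 0.0 if a == b else 1.0


def graph_distance(g1, g2, mapping, node_dist):
    """Equal-weight average of skipped nodes, skipped edges and substitution cost.

    Each graph is (set of nodes, set of directed edges); mapping is an
    injective dict from nodes of g1 to nodes of g2 (assumed given).
    """
    nodes1, edges1 = g1
    nodes2, edges2 = g2
    skipped_nodes = len(nodes1 - set(mapping)) + len(nodes2 - set(mapping.values()))
    mapped = {(mapping[a], mapping[b]) for a, b in edges1
              if a in mapping and b in mapping}
    kept = len(mapped & edges2)
    skipped_edges = len(edges1) + len(edges2) - 2 * kept
    f_nodes = skipped_nodes / max(1, len(nodes1) + len(nodes2))
    f_edges = skipped_edges / max(1, len(edges1) + len(edges2))
    f_subst = (sum(node_dist(a, b) for a, b in mapping.items()) / len(mapping)
               if mapping else 0.0)
    return (f_nodes + f_edges + f_subst) / 3.0


g1 = ({"admission", "ct_scan", "thrombolysis"},
      {("admission", "ct_scan"), ("ct_scan", "thrombolysis")})
g2 = ({"admission", "mri", "thrombolysis"},
      {("admission", "mri"), ("mri", "thrombolysis")})
mapping = {"admission": "admission", "ct_scan": "mri", "thrombolysis": "thrombolysis"}

blind = graph_distance(g1, g2, mapping, label_distance)
informed = graph_distance(g1, g2, mapping, taxonomy_distance)
print(blind, informed)
```

In this toy case the two models differ only in the imaging technique used. The blind distance treats the mismatch as a full substitution, while the taxonomy-aware distance is smaller because both activities share the parent concept `imaging`.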

We believe that the use of domain knowledge represents a significant enhancement in the metric definition, which, otherwise, would operate in a “blind” and context-independent fashion. Indeed, the original metric in [2] is completely independent of the domain of application. On the other hand, when domain knowledge is available, rich and well consolidated, as is often the case in medicine, its exploitation can surely improve the quality of any automated support to the expert's work, including process comparison (see e.g., [4]). Moreover, the use of additional information extracted from data, and of temporal information in particular, can be a relevant advancement as well, in fields in which the role of time can be very critical, such as emergency medicine.

We are currently applying our framework to stroke management. In this domain, the positive experimental results we have obtained support the statements above. Indeed, our metric has proved to outperform the original metric in [2], and to generate outputs that are closer to those provided by a stroke management expert (see Section 3.1). Having verified the reliability of our tool through the experimental study described in Section 3.1, we have then applied it to address a key, open research question in stroke management, namely: do similar process models (implemented in different stroke units) lead to similar clinical outcomes (e.g., patient survival rate and/or patient improvement rate at discharge)? Some interesting conclusions on this issue were obtained (see Section 3.2), testifying to the potential clinical usefulness of our contribution.

The paper is organized as follows. Section 2 provides the details of our methodological approach. Section 3 showcases experimental results. Section 4 compares our contribution to related works. Section 5 discusses the limitations of our work and our future research directions, meant to overcome the open issues. Section 6 presents our concluding remarks.

Section snippets

Methods

In this section, we will first introduce process mining and the ProM tool; then we will provide the technical details of our metric.

Results

We have applied our framework, which is implemented in Java, to stroke management processes. A stroke is the rapidly developing loss of brain function(s) due to disturbance in the blood supply to the brain. This can be due to ischemia (lack of glucose and oxygen

Comparison to related works

Graph representation and retrieval is a very active research area, which is giving birth to different methodological approaches and software tools. Graph databases, such as HyperGraphDB [16] and DEX [17], are gaining popularity for working with emerging linked data such as social network data and biological data. However, in this section we will focus on contributions that are more closely related to comparison and retrieval in process/workflow management research. As clearly stated in [9],

Discussion on limitations and future research directions

Despite the novelty of our approach, discussed in Section 4, and despite the encouraging experimental results presented in Section 3, we wish to point out some limitations of the current version of our work, that will guide us in the choice of future research directions. Namely, the following issues still need to be managed:

  • in distance calculation, we currently simplify the control flow information of the mined models, by simply considering sequence, and ignoring AND/OR splits and joins. In the

Conclusions

This work showed that process mining and process comparison can be applied successfully to clinical data to gain a better understanding of medical processes. It is interesting to analyze the differences, to establish whether they concern only the scheduling of the various tasks or also the tasks themselves. In this way, not only may different practices that are used to treat similar patients be discovered, but also unexpected behavior may be highlighted. Experimental results have shown the

Acknowledgments

This research is partially supported by the GINSENG Project, Compagnia di San Paolo. We would like to thank Dr. I. Canavero for the independent evaluation of process distance.

References (49)

  • R. Dijkman et al. Graph matching algorithms for business process model similarity search

  • R. Basu et al. Intelligent decision support in healthcare. Analytics (2012)

  • IEEE taskforce on process mining: process mining manifesto, http://www.win.tue.nl/ieeetfpm [last accessed on...

  • B. van Dongen et al. The ProM framework: a new era in process mining tool support

  • A. Weijters et al. Process mining with the heuristic miner algorithm, BETA working paper series, WP 166 (2006)

  • M. Palmer et al. Verb semantics for English–Chinese translation. Mach Transl (1995)

  • E. Chiabrando et al. Semantic similarity in heterogeneous ontologies

  • A. Marzal et al. Computation of normalized edit distance and applications. IEEE Trans Pattern Anal Mach Intell (1993)

  • L. Yujian et al. A normalized Levenshtein distance metric. IEEE Trans Pattern Anal Mach Intell (2007)

  • D. Inzitari et al. Italian stroke guidelines (SPREAD): evidence and clinical practice. Neurol Sci (2006)

  • S. Panzarasa et al. Data mining techniques for analyzing stroke care processes. Stud Health Technol Inform (2010)

  • B. Iordanov. HyperGraphDB: a generalized graph database

  • N. Martínez-Bazan et al. DEX: a high-performance graph database management system

  • S. Melnik et al. Similarity flooding: a versatile graph matching algorithm and its application to schema matching (2002)

1 On behalf of the Stroke Unit Network (SUN) collaborating centers.
