DOI: 10.1145/2678025.2701407

Cohort Comparison of Event Sequences with Balanced Integration of Visual Analytics and Statistics

Published: 18 March 2015

Abstract

Finding the differences and similarities between two datasets is a common analytics task. With temporal event sequence data, this task is complex because of the many ways single events and event sequences can differ between the two datasets (or cohorts) of records: the structure of the event sequences (e.g., event order, co-occurring events, or event frequencies), the attributes of events and records (e.g., patient gender), or metrics about the timestamps themselves (e.g., event duration). In exploratory analyses, running statistical tests to cover all cases is time-consuming, and determining which results are significant becomes cumbersome. Current analytics tools for comparing groups of event sequences emphasize either a purely statistical or a purely visual approach to comparison. This paper presents a taxonomy of metrics for comparing cohorts of temporal event sequences, showing that the problem space is bounded. We also present a visual analytics tool, CoCo (for "Cohort Comparison"), which implements balanced integration of automated statistics with an intelligent user interface to guide users to significant, distinguishing features between the cohorts. Lastly, we describe two early case studies: the first with a research team studying medical team performance in the emergency department and the second with pharmacy researchers.
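To make one of the metric families named in the abstract concrete, the following is a minimal sketch of comparing event prevalence (the fraction of records containing a given event) between two cohorts. It is not CoCo's implementation: the abstract does not say which tests the tool runs, so the use of Fisher's exact test and a Bonferroni correction here is an assumption, as are the function names and the record format (each record is a list of event names in order). Both cohorts are assumed to be non-empty.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability that x of the col1 "has event" records
        # fall in the first cohort.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of every table no more likely than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= observed * (1 + 1e-9))
    return min(1.0, p)


def compare_event_prevalence(cohort_a, cohort_b, alpha=0.05):
    """Test, for every event type, whether its prevalence differs between cohorts.

    Hypothetical helper: each cohort is a non-empty list of records, and each
    record is a list of event names.
    """
    event_types = sorted({e for rec in cohort_a + cohort_b for e in rec})
    n_tests = len(event_types)
    results = []
    for event in event_types:
        a = sum(event in rec for rec in cohort_a)
        c = sum(event in rec for rec in cohort_b)
        p = fisher_exact_two_sided(a, len(cohort_a) - a, c, len(cohort_b) - c)
        results.append({
            "event": event,
            "prevalence_a": a / len(cohort_a),
            "prevalence_b": c / len(cohort_b),
            "p_value": p,
            # Bonferroni correction: one test is run per event type.
            "significant": p < alpha / n_tests,
        })
    # Most distinguishing event types first.
    return sorted(results, key=lambda r: r["p_value"])


if __name__ == "__main__":
    # Made-up records for illustration only.
    cohort_a = [["Arrival", "Intubation"]] * 10
    cohort_b = [["Arrival"]] * 10
    for row in compare_event_prevalence(cohort_a, cohort_b):
        print(row)
```

Sorting by p-value mirrors the idea of guiding users toward the most distinguishing features first. The same loop could be repeated for other metrics, such as event order or durations, with a test suited to each.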



    Published In

    IUI '15: Proceedings of the 20th International Conference on Intelligent User Interfaces
    March 2015
    480 pages
    ISBN:9781450333061
    DOI:10.1145/2678025
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.


    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. cohort comparison
    2. temporal data
    3. visual analytics

    Qualifiers

    • Research-article

    Funding Sources

    • Oracle
    • The University of Maryland/Mpowering the State

    Conference

    IUI'15

    Acceptance Rates

    IUI '15 Paper Acceptance Rate 47 of 205 submissions, 23%;
    Overall Acceptance Rate 746 of 2,811 submissions, 27%
