Uncertainty-aware visual analytics for exploring human behaviors from heterogeneous spatial temporal data

https://doi.org/10.1016/j.jvlc.2018.06.007

Abstract

Analyzing human behaviors requires reconstructing them from multiple data sources, e.g. trajectory data, transaction data, and identity data. These sources differ in resolution and contain missing and conflicting records, which together introduce uncertainty into the spatial temporal data. Such uncertainty causes difficulties, and even failure, in visual analytics tasks that analyze people’s behaviors, patterns, and outliers. Traditional automatic methods cannot solve these problems in such complex scenarios, where the uncertain and conflicting patterns are not well defined. We therefore propose a semi-automatic approach that lets users resolve the conflicts and identify the uncertainties. For generality, we summarize five types of uncertainty and the corresponding solutions for behavior analysis tasks. Combining these uncertainty-aware methods, we propose a visual analytics system to analyze human behaviors, detect patterns, and find outliers. Case studies on the IEEE VAST Challenge 2014 dataset confirm the effectiveness of our approach.

Introduction

Data recording human behavior are growing in both volume and diversity. As technology develops, GPS devices record people’s positions and movements, bank transaction systems record their purchase and billing behavior, and social media data reflect their attitudes towards public affairs, or even their eating preferences. Faced with such heterogeneous data, we can adopt visual analytics to understand people’s behavior, find patterns, and detect outlier events.

Using heterogeneous data directly in the analysis process can lead to difficulties, and even failure, in visual analytics tasks, because the data are often imperfect. They can contain various uncertainties, including errors, missing data, and conflicts, and they can come in different resolutions. Traditional automatic methods cannot solve these problems in such complex scenarios, where the uncertain and conflicting patterns are not well defined. Our approach combines algorithmic methods with interactions in visualization to enable users to identify, mark, and refine such uncertainty issues. Together with these uncertainty-aware methods, we propose a visual analytics system that supports the analysis of human spatial temporal behavior from heterogeneous data.

In this paper, we report the different kinds of uncertainty that we identified in visual spatial data analysis and demonstrate how we refine them with semi-automatic methods. In general, our methods are data-driven reliability improvement methods. We propose different solutions for different types of data and cross-reference multiple data sources. Because heterogeneous data can share the same attributes at different granularities, we can use one type of data to obtain finer-resolution values, with associated uncertainty, for another. With these approaches, we can better understand people’s behavior in different dimensions and mark the reliability of the data for further analysis. Uncertainty identification and analysis vary widely and are challenging to solve through purely computational methods. Our work therefore combines visual identification with automatic preprocessing and keeps users in the visual analytics loop. Users can thus better explore the varying reliability of the data and further analyze outlier events.

Throughout this work, we use the fictitious datasets from IEEE VAST Challenge 2014 Mini Challenge 2 [1]. Combined with the uncertainty-aware approach, our visual analytics system can summarize the general movement patterns of a group of people and help analysts detect abnormal events through various visualization views and multiple filters. In summary, our contributions are as follows.

  • Semi-automatic Uncertainty Refinement Methods. We summarize five general types of uncertainty and propose novel solutions for each. To solve the ill-defined uncertainty problems, we combine users’ capabilities with algorithmic methods and keep humans in the analysis loop.

  • Uncertainty-aware Visual Analytics System. We have developed a comprehensive visual analytics system that incorporates the uncertainty-aware approaches and multiple coordinated visualization views, thus providing a full solution for understanding human behaviors and detecting interesting patterns and outliers.

This paper is structured as follows. Section 2 reviews related work. After introducing the data in Section 3, we present the uncertainty summary and a general description of the solutions in Section 4. We present the details of the uncertainty-aware approach in Section 5, and the visual analytics procedure and technical details in Sections 6 and 7. We then demonstrate the use of our tools in four case studies. Finally, we discuss the limitations and future work, and conclude.

Section snippets

Related work

Behavior analysis usually focuses on pattern extraction [2], relationship identification [3] and people clustering [4]. In People Garden [5], Xiong et al. summarized the temporal behavior of each person with a flower metaphor, and put them into different categories. Kanda et al. [3] analyzed the movement of museum visitors, focusing on hotspots and different visiting strategies. Orellana et al. [6] studied the interactions of people with a mobile game dataset. User behavior data usually involve

Uncertainty taxonomy

Within the scope of this paper, we mainly discuss uncertainty in spatial temporal data for exploring human behaviors. The data unit representing human behavior is defined as an event, which has the following attributes: time, location, and people. These attributes, and the event itself, are thus the target objects for analyzing uncertainty. For each attribute, we summarize the following uncertainty types: missing information, conflict, granularity issues, multiple values, and errors. We
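The event definition and the five uncertainty types above can be sketched as a small data structure. This is only an illustrative sketch, not the authors' implementation: the names `Event`, `UncertaintyType`, `flags`, `mark`, and `mark_missing`, and the choice of string-valued locations and people, are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from typing import Optional, Set, Tuple

# The three event attributes named in the text.
ATTRIBUTES = ("time", "location", "people")


class UncertaintyType(Enum):
    """The five uncertainty types listed in the text."""
    MISSING = "missing information"
    CONFLICT = "conflict"
    GRANULARITY = "granularity issues"
    MULTIPLE_VALUES = "multiple values"
    ERROR = "errors"


@dataclass
class Event:
    """Hypothetical event record; None means the attribute is unknown."""
    time: Optional[datetime]
    location: Optional[str]
    people: Optional[str]
    source: str = "unknown"  # assumed field: which data source produced the event
    flags: Set[Tuple[str, UncertaintyType]] = field(default_factory=set)

    def mark(self, attribute: str, kind: UncertaintyType) -> None:
        """Attach an uncertainty flag to one of the three attributes."""
        if attribute not in ATTRIBUTES:
            raise ValueError(f"unknown attribute: {attribute}")
        self.flags.add((attribute, kind))


def mark_missing(event: Event) -> Event:
    """Flag every attribute whose value is absent as MISSING."""
    for attribute in ATTRIBUTES:
        if getattr(event, attribute) is None:
            event.mark(attribute, UncertaintyType.MISSING)
    return event
```

Keeping the flags per attribute, rather than per event, mirrors the statement that uncertainty is analyzed for each attribute as well as for the event as a whole.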

Uncertainty illustration

In this section, we first describe the data we used. After that, we introduce the data fusion method and visual analysis system, which is the basis for the uncertainty processing and classification.

Semi-automatic uncertainty processing

In general, we have three types of operation guidelines for dealing with different types of data uncertainty. First, we mark and differentiate the missing data for each data source, including missing GPS logs and missing transaction data. The marking process is based on an understanding of the data distribution and on several reasonable assumptions about human behavior. Second, for data expressing the same events from different sources, we refine the lower-resolution data with the higher-resolution data from the other sources.
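The second guideline, refining a coarse record with a finer one from another source, can be illustrated as follows. This is a minimal sketch under stated assumptions, not the method described in the paper: it assumes a coarse record that carries only a date and a fine-grained stop that carries start and end timestamps, and every name here (`Stop`, `CoarseRecord`, `RefinedRecord`, `refine`, and the status strings) is hypothetical.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import List, Optional


@dataclass
class Stop:
    """Assumed fine-grained record, e.g. a stop derived from a GPS log."""
    person: str
    location: str
    start: datetime
    end: datetime


@dataclass
class CoarseRecord:
    """Assumed coarse record that only knows the day of the event."""
    person: str
    location: str
    day: date


@dataclass
class RefinedRecord:
    person: str
    location: str
    day: date
    start: Optional[datetime]
    end: Optional[datetime]
    status: str  # "refined", "ambiguous", or "unmatched"


def refine(record: CoarseRecord, stops: List[Stop]) -> RefinedRecord:
    """Borrow a time window from the single stop that matches the record.

    A record with several matching stops is marked "ambiguous" and one
    with none is marked "unmatched"; both are left for a user to resolve.
    """
    candidates = [
        s for s in stops
        if s.person == record.person
        and s.location == record.location
        and s.start.date() == record.day
    ]
    if len(candidates) == 1:
        match = candidates[0]
        return RefinedRecord(record.person, record.location, record.day,
                             match.start, match.end, "refined")
    status = "ambiguous" if candidates else "unmatched"
    return RefinedRecord(record.person, record.location, record.day,
                         None, None, status)
```

The sketch deliberately does not guess when the match is not unique. Returning an explicit status is one way to hand such cases to the user, in the spirit of the semi-automatic approach.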

Visual analytics system

Our visual analytics system combines the uncertainty-aware approaches with fully interactive exploration functions. It enables users to find reliable information, detect patterns, and find outliers in the heterogeneous spatial temporal data sources (Fig. 9).

System implementation

Our system is developed with a client-server architecture. The client is built with HTML5/JavaScript, and the server-side services are implemented with Python and MongoDB. The client uses multiple web programming techniques and toolkits, such as Google Maps, the D3 library, WebGL, and Canvas. On the server side, we chose MongoDB to manage the data sets because of its flexibility and scalability in handling multiple sources of data. The uncertainty processing part relies on both the visual

Evaluation

We evaluated our proposed uncertainty-aware visual analytics methods in two ways. First, we compare our methods with purely computational methods and illustrate their advantages. Second, we use a case study to illustrate how users can successfully find events after dealing with the uncertainties.

Discussion

We propose an uncertainty-aware visual analytics approach to deal with multiple sources of spatial temporal data. With both interactive and algorithmic methods, users can identify and refine data uncertainty, a task that is challenging because the uncertain patterns are ill-defined. Such a process requires semantic understanding. For example, detecting abnormal visiting patterns can produce a large number of false alarms. A person might go to the supermarket irregularly, which can be

Conclusion

In this paper, we present an uncertainty-aware visual analytics system to investigate human behaviors from heterogeneous spatial temporal data. We summarize five representative types of uncertainty and their refinement methodology. We propose a data-driven approach and make full use of human judgment through a visual interface. With cross verification from multiple sources, we can further improve the reliability of the refinement results. Based on the refinement results, we are able to

Acknowledgements

We thank Zhenhuang Wang, Chenglong Wang, Zipeng Liu and Zhengjie Miao for their contributions. This project is supported by the National Key Research and Development Program of China (2016QY02D0304), the National Basic Research Program of China (973) (2015CB352503), and the National Natural Science Foundation of China (61672055).

References (28)

  • K.A. Cook et al. The VAST challenge: history, scope, and outcomes: an introduction to the special issue. Inf. Vis. (2014)
  • R. Krüger et al. Using social media content in the visual analysis of movement data. Proc. Workshop on Interactive Visual Text Analytics (2012)
  • T. Kanda et al. Analysis of people trajectories with ubiquitous sensors in a science museum. Proc. of IEEE ICRA (2007)
  • Q. Li et al. Mining user similarity based on location history. Proc. ACM SIGSPATIAL GIS (2008)
  • R. Xiong et al. Peoplegarden: creating data portraits for users. Proc. of the ACM UIST (1999)
  • D. Orellana et al. Uncovering interaction patterns in mobile outdoor gaming. Proc. GEOProcessing (2009)
  • T. Hägerstraand. What about people in regional science? Pap. Reg. Sci. (1970)
  • T. Kapler et al. Geotime information visualization. Inf. Vis. (2005)
  • C. Tominski et al. Stacking-based visualization of trajectory attribute data. IEEE Trans. Vis. Comput. Graph. (2012)
  • R. Krüger et al. Trajectorylenses - a set-based filtering and exploration technique for long-term trajectory data. Comput. Graph. Forum (2013)
  • G.L. Andrienko et al. Interactive visual clustering of large collections of trajectories. Proc. of the IEEE VAST (2009)
  • G.L. Andrienko et al. Spatio-temporal aggregation for visual analysis of movements. Proc. of the IEEE VAST (2008)
  • D. Guo et al. A visualization system for space-time and multivariate patterns (VIS-STAMP). IEEE Trans. Vis. Comput. Graph. (2006)
  • C. Gorg et al. Visual analytics support for intelligence analysis. Computer (2013)