1 Introduction

In traffic modeling, researchers have been collecting, analyzing, and visualizing large datasets over the past two decades [19, 22, 25]. The volume, velocity, and variety of traffic data in urban areas are growing at an exponential rate, and thus understanding and capturing the semantics of traffic data is an increasingly complex challenge. Given the critical importance of the problem, many efforts in both industry and academia have explored systematic approaches to addressing the challenges of data visualization, which refers to the process of graphically representing data in order to illustrate the relationships within data and to uncover hidden patterns [24].

It is well recognized that road safety in the Kingdom of Saudi Arabia is in an extremely dire state, with more than half a million accidents occurring each year [8]. Moreover, a significant number of these accidents are severe, which makes road traffic injuries the leading cause of death for young males between the ages of 16 and 30 [20]. We believe that enhancing road safety is paramount to traffic regulators, and that understanding the temporal and spatial patterns of traffic accidents can help in achieving this goal. Insights gained from visualizations of accident data demonstrate how valuable visual analytics can be in helping authorities and policy makers better understand traffic, enhance future planning, and determine corrective actions.

Although the prevalence and accessibility of traffic data are changing the way people view mobility in their cities and roads, the task of retrieving insights from huge and heterogeneous traffic datasets and presenting them to people is very challenging. In this paper, we aim to help people understand the mobility patterns of their cities by exploring different visualization techniques for traffic data and investigating what insights each technique yields. In particular, we aim to contribute to improving road safety in the Kingdom of Saudi Arabia by capturing insights from an accidents dataset collected from the General Directorate of Traffic (GDT) in Riyadh [5] and providing those insights to the policy makers in the GDT through an interactive visualization platform. Due to the characteristics of traffic data, its multivariate nature, and the importance of its spatial and temporal properties, the visualization techniques we explore are spatial visualizations, temporal visualizations, spatio-temporal visualizations, and multivariate visualizations [14].

The remainder of this paper is organized as follows. Section 2 provides a brief background about data visualization in general, with an emphasis on traffic data visualization. A description of our visualization platform is presented in Sect. 3. Section 4 presents the visualizations of traffic accidents in the city of Riyadh as a case study. Finally, conclusions are drawn in Sect. 5.

2 Related Work

We start this section by discussing data visualization in Sect. 2.1. Then, we present the process of visual analytics in Sect. 2.2. Finally, we cover traffic data visualization in Sect. 2.3.

2.1 Data Visualization

Data visualization refers to the process of graphically representing data in order to illustrate the relationships within data and to reveal hidden patterns and structures [24]. Different types of data visualization, including information visualization and visual analytics, are used to help people understand and explore their data [18]. On the one hand, information visualization relies on visual computing to help humans acquire abstract information [13]. On the other hand, visual analytics is not only a graphical representation of the data; it is an integrated approach that combines data analysis, data visualization, and human intervention. Note that human intervention (e.g., interaction, collaboration, cognition, and perception) plays an important role in the decision-making process in visual analytics [18]. The main objective of visual analytics is not only to allow users to detect expected patterns, but also to enable them to identify unexpected patterns and relationships and thereby uncover hidden insights.

2.2 Visual Analytics Process

The process of visual analytics consists of: information gathering, data preprocessing, data analysis, data visualization, interaction and decision making. Figure 1 shows an abstract overview of the visual analytics process, where ovals represent different stages and arrows represent transitions.

In general, heterogeneous data sources need to be integrated before the data itself can be analyzed or visualized. The first step after integrating the raw data is therefore often to preprocess and transform it in order to derive different representations for further exploration. Typical preprocessing tasks include data transformation, data cleaning, and integration of heterogeneous data sources [17].

After data preprocessing, analysts can choose between visualizing the data directly and applying analysis methods first. If they choose to analyze the data first, data query and machine learning methods are applied. Visualizing the data, whether at the beginning or after the analysis, allows users to interact with the resulting visuals by modifying different parameters. One of the characteristics of visual analytics is the ability to alternate between visualizations and analysis methods, which leads to continuous refinement and validation of initial results. To conclude, knowledge can be gained in the visual analytics process from analysis methods, from visualization, and from human-computer interaction [15].

Fig. 1. Visual analytics process

2.3 Traffic Data Visualization

Information visualization and visual analytics are becoming an integral part of many recent traffic systems. They help in understanding traffic behavior and uncovering spatial and temporal patterns in traffic information [4, 21]. In traffic data visualization, the different kinds of visualization usually incorporate the data’s spatial and temporal properties, due to the significance of these properties in the context of traffic.

A number of tools have been developed for traffic incident visualization. For example, Incident Cluster Explorer (ICE) is a web-based visual analytics tool for traffic accident datasets [21]. This tool provides interactive spatial visualizations, which include (i) an icon mode that shows every incident on the map and (ii) a heat-map mode that aggregates the incidents into grids. The ICE tool also provides other types of visualization, such as histograms and scatter plots.

A second example is the CrashMap web-based tool, which visualizes traffic crashes that occurred on British roads [4]. This tool compiles the accident data into an easy-to-use format showing each incident on a map. In addition, it provides detailed information about each accident, e.g., when the accident happened, at what time of day, and how serious it was.

Traffic Origins [11] is another visualization tool, which highlights the effect of traffic incidents on congestion. This tool uses historical accident and traffic flow data, and allows the user to investigate the effect of these accidents on traffic flow. This is facilitated by visualizing traffic conditions within an expanding circle that surrounds the accident’s location, from 15 min before the accident occurs until it clears.

3 System Framework

Riyadh is among the fastest growing cities in the Kingdom of Saudi Arabia and the world: the city is seeing an increase in vehicular trips at a rate of 9% per year, and population growth of 3.9% annually [12]. Using data from the crash records, call detail records (CDRs), and other datasets (e.g., the Riyadh road network), an interactive visualization platform called SaudiTraffic [7] was developed.

In addition to presenting complex information with innovative statistical analyses and computational algorithms, SaudiTraffic presents that information in an interactive and intuitive way, while making it accessible to both policy makers and users to aid them in exploring, analyzing, and understanding mobility dynamics. To facilitate the process of capturing the data’s semantics, the platform provides different types of visualizations including: basic data visualization and analysis-based visualization.

The SaudiTraffic platform presents interactive visualizations of real traffic accidents that occurred in Riyadh. Different visualizations are created for the four categories of visualization techniques (i.e., spatial visualizations, temporal visualizations, spatio-temporal visualizations, and multivariate visualizations) with an emphasis on analytics. By using these techniques, the SaudiTraffic platform invites users to explore, interact with, and discover patterns in traffic incidents on the roads of Riyadh. The platform also aims to bring together machine intelligence and human intelligence through visual analytics. Interaction with the SaudiTraffic platform can act as a decision support system, allowing stakeholders to explore the impact of traffic accidents on social and economic progress. This provides a valuable lens for exploring how to enhance road safety in the Kingdom.

In this web-based platform, different libraries are used to produce the interactive visualizations. Leaflet [6], a commonly used open-source JavaScript library, is utilized to visualize geospatial data. Its interactive maps and its support for mobile and desktop platforms make it one of the most popular libraries for geospatial mapping. Chart.js [3] and ZingChart [10] are used in SaudiTraffic as well. They are JavaScript libraries that create different types of responsive and interactive charts, including line plots, bar charts, radar plots, and pie charts. The goal of these libraries is to overcome the scalability and flexibility issues in JavaScript and to provide a flexible, fast, and modern way to create charts.

4 Case Study: Visualization of Traffic Accidents in Riyadh

Given the characteristics of traffic data, visualization techniques for traffic analysis are often aligned with four aspects of design guidelines; namely, spatial, temporal, spatio-temporal, and multivariate. In this section, we present examples from a case study of traffic incidents in Riyadh for these four aspects with an emphasis on analytics to shed light on the insights gained from these visualization techniques.

4.1 Datasets

The data used in this paper consist of records of crashes that occurred in the period from January 2013 to October 2015, provided by the GDT [5]. This dataset consists of three tables: accidents, parties, and vehicles. The accidents table contains almost 250,000 accident records, each with 23 attributes such as the exact location, time, severity, and type of the accident. The parties table contains information about all parties involved in the accident, such as their role in the accident, health status, gender, and nationality. The vehicles table provides details about all vehicles involved in the accident, including the car make and model, its color, and its registration type.

Other datasets are used in this paper as well. The Riyadh road network dataset, obtained from the Arriyadh Development Authority (ADA) [2], is used to map accidents to the roads where they occurred. In addition, CDRs were used to identify how people move across the urban landscape by extracting the origin-destination matrix, which was then used to estimate the congestion on each road segment in the city.
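The origin-destination extraction mentioned above can be illustrated with a minimal sketch. It assumes, hypothetically, that each CDR has already been reduced to a (user_id, timestamp, zone) tuple, where the zone is the area served by the cell tower that handled the call. The function name od_matrix and this record layout are illustrative assumptions and are not taken from the SaudiTraffic implementation. Each change of zone between two consecutive records of the same user is counted as one trip.

```python
from collections import Counter, defaultdict


def od_matrix(records):
    """Count trips between zones from simplified call detail records.

    records: iterable of (user_id, timestamp, zone) tuples (assumed layout).
    Returns a Counter keyed by (origin_zone, destination_zone).
    """
    by_user = defaultdict(list)
    for user_id, timestamp, zone in records:
        by_user[user_id].append((timestamp, zone))

    trips = Counter()
    for events in by_user.values():
        events.sort()  # chronological order for this user
        # Compare each record with the next one of the same user.
        for (_, origin), (_, destination) in zip(events, events[1:]):
            if origin != destination:
                trips[(origin, destination)] += 1
    return trips
```

The resulting counts could then be assigned to routes in the road network to estimate the load on each segment; that assignment step is not shown here.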

4.2 Visualization of Spatial Properties

Understanding mobility dynamics and identifying accident hotspots are some of the many insights that can be obtained from the spatial properties of traffic data. The level of aggregation plays an important role in capturing these insights. Therefore, spatial visualizations can be grouped by aggregation level into three categories: dot-based visualizations, heat-map visualizations, and region-based visualizations.

Fig. 2. Dot-based visualization (Color figure online)

Dot-Based Visualizations. In dot-based visualizations, the location of each record in the dataset is represented by a dot without any aggregation. Since no aggregation is done, this approach allows for providing details for each record, which are often indicated using the color and the size of the dot.

We utilized this approach to visualize each accident from the accidents dataset, where each accident is represented by a dot, whose color corresponds to the severity of the accident, as shown in Fig. 2.

The dot-based approach is not only an intuitive way to visualize traffic incidents spatially; it can also provide users and stakeholders with many insights regarding the location of accident hotspots in the city and which road segments are more dangerous to commuters.

The visualization in Fig. 2 allows the user to interact with each incident by hovering over it to obtain more information about the accident, such as the reason behind it, when it happened, and its severity. It also allows for zooming in and out, which helps when hundreds of accidents are clustered in a certain region.

Fig. 3. Heat-map visualization (Color figure online)

Heat-Map Based Visualizations. Heat-map visualizations are suitable for large-scale data, such as the dataset we are dealing with, where large volumes of complex information can be mapped in a clear and intuitive way [14]. Research has shown that heat-maps are suitable for revealing patterns in sets of dots. In addition, heat-maps are useful when dealing with thousands of dots that may overlap.

The heat-map visualization in Fig. 3 helps in distinguishing areas with many accidents from areas with few. In particular, this visualization is used to represent the number of run-over accidents in Riyadh, where warmer colors (i.e., orange and red) indicate a larger number of run-over accidents, and colder colors (i.e., yellow and green) indicate a smaller number of accidents or no accidents at all.

This heat-map can help in identifying where run-over accidents cluster. This information should direct the authorities’ attention to these locations, with the hope that it will eventually lead to safer roads for pedestrians as well as commuters.

In heat-maps, unlike in the dot-based approach, individual records cannot be inspected. However, heat-maps make it much easier to perceive the density of points, e.g., the number of run-over accidents.
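The aggregation behind a grid-style heat-map can be sketched as follows. This is a minimal illustration under stated assumptions, not the rendering method used in SaudiTraffic: points are (latitude, longitude) pairs, cells are squares of a hypothetical size cell_size given in degrees, and the names grid_counts and normalize are introduced here for illustration only.

```python
import math
from collections import Counter


def grid_counts(points, cell_size):
    """Count (lat, lon) points per square grid cell of cell_size degrees."""
    counts = Counter()
    for lat, lon in points:
        cell = (math.floor(lat / cell_size), math.floor(lon / cell_size))
        counts[cell] += 1
    return counts


def normalize(counts):
    """Scale cell counts to the range (0, 1] so they can drive a color ramp."""
    peak = max(counts.values(), default=0)
    if peak == 0:
        return {}
    return {cell: n / peak for cell, n in counts.items()}
```

A cell with intensity close to 1 would be drawn in a warm color and a cell close to 0 in a cold color. Individual accidents are no longer recoverable after this step, which is the trade-off noted above.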

Fig. 4. Region-based visualization

Region-Based Visualizations. In region-based visualizations, data points are aggregated into predetermined regions. In contrast to dot-based visualizations, this approach is more suitable for capturing macro patterns in large datasets.

In SaudiTraffic, the aggregated number of accidents in each district in Riyadh is visualized, demonstrating the contrast in the number of accidents between different districts. The size and boundaries of each district were determined using the districts’ shapefiles obtained from the ADA. Afterwards, accidents were mapped to districts according to these boundaries. As a result, accidents located outside the city are discarded.

Aggregating the number of accidents in each district can help in many ways. For example, the map in Fig. 4 shows which districts are more vulnerable to accidents, in the hope that those districts can be made safer for commuters. It can also help insurance companies in determining the impact a driver’s home district should have on the price of the driver’s vehicle insurance. Note that this map allows hovering over any district to get its name and the number of accidents that occurred there in the aforementioned period.
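The mapping of accidents to districts can be illustrated with a standard ray-casting point-in-polygon test. The sketch below is an assumption about how such a mapping could be done, not a description of the platform's code: districts are given as a hypothetical dictionary from district name to a list of (x, y) boundary vertices, and the function names are illustrative. Points that fall in no district are dropped, which mirrors the discarding of accidents outside the city.

```python
from collections import Counter


def point_in_polygon(x, y, polygon):
    """Ray casting: True if (x, y) lies inside the polygon's vertex list."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_by_district(points, districts):
    """Count points per district; points outside every district are skipped."""
    counts = Counter({name: 0 for name in districts})
    for x, y in points:
        for name, polygon in districts.items():
            if point_in_polygon(x, y, polygon):
                counts[name] += 1
                break  # assign each point to at most one district
    return counts
```

In practice, district boundaries read from shapefiles may contain holes or multiple parts, which this simplified test does not handle.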

4.3 Visualization of Temporal Properties

Time-oriented visualizations facilitate the exploration and discovery of trends, periodicity, and abnormality of traffic data along the time dimension. As noted by Chen et al. [14], visualizations designed solely on the time axis are often categorized into: linear-time information visualizations and periodic-time information visualizations.

Linear Temporal Visualizations. In linear temporal visualizations, time is represented as a linear field with a start point and an end point. This visualization technique is often used to detect trends and patterns in the temporal behavior of another variable, and is very easy to comprehend. It is, however, less capable of detecting periodic patterns in discrete timeframes and of showing multiple variables, because of the clutter problem [14].

Fig. 5. Linear temporal visualization

In this case study, linear temporal visualization was utilized to investigate the correlation between the number of accidents and both the school calendar and significant events. The daily average number of accidents in each week is shown together with the school vacations, the Islamic holidays (i.e., Eid Al-Fitr and Eid Al-Adha), and the holy month of Ramadan. The daily average in each week was used to eliminate the significant variation between working days and weekends. The school vacation dates were taken from the Ministry of Education in Saudi Arabia [1], whereas the dates of Ramadan and the Islamic holidays are based on the Umm Al-Qura calendar (i.e., the official Hijri calendar used in Saudi Arabia) [9].
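The weekly smoothing can be sketched as follows. This is a minimal illustration, assuming each accident is represented by its calendar date and that weeks follow the ISO convention (Monday to Sunday), which is an assumption made here for convenience and may differ from the week boundaries used in the case study. The function name weekly_daily_average is hypothetical.

```python
from collections import Counter
from datetime import date

DAYS_PER_WEEK = 7


def weekly_daily_average(accident_dates):
    """Return {(iso_year, iso_week): average accidents per day in that week}."""
    per_week = Counter()
    for d in accident_dates:
        iso_year, iso_week, _ = d.isocalendar()
        per_week[(iso_year, iso_week)] += 1
    # Divide the weekly total by seven so that weekday/weekend swings cancel out.
    return {week: total / DAYS_PER_WEEK for week, total in sorted(per_week.items())}


# Illustrative call with made-up dates, one accident on each of two days.
example = weekly_daily_average([date(2014, 1, 6), date(2014, 1, 7)])
```

Weeks that are only partly covered by the data, such as the first and last week of the collection period, are understated by this division and may need to be excluded.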

The visualizations in Fig. 5 show that the number of accidents drops during both the school vacations and the Islamic holidays. This drop is quite significant during the summer vacation, and the number of accidents decreases further during the holy month of Ramadan. The drop during school vacations might be attributed to the decreased traffic flow in that period. These patterns are repeated throughout 2014 and 2015, as shown in Figs. 5a and b, respectively. This indicates that the number of accidents follows a consistent pattern around specific events.

The interactivity in this visualization allows for the exploration of a longer timeframe without losing any details, where the user can scroll through the years to discover patterns in different years.

These findings about the correlation between these events and the number of accidents will help traffic departments with the management and allocation of their resources.

Fig. 6. Periodic temporal visualization

Periodic Temporal Visualizations. Periodic temporal visualizations help in emphasizing the contrast between discrete timeframes such as weekends and weekdays, or months and seasons. Although they are known for their low spatial efficiency, they excel in communicating patterns visually to decision makers to elicit an understanding of contrasts related to time.

To understand and identify these patterns, an interactive radar visualization of the number of accidents for each day of the week was used. This visualization shows a consistent pattern for the working days, where peaks coincide with the traffic rush hours of a typical working day, as Fig. 6a shows. On weekends, however, these peaks disappear, and a slight increase around midnight is apparent, as Fig. 6b illustrates. Of the seven days, Friday has the most distinct pattern: the number of accidents decreases dramatically in the morning hours, as shown in Fig. 6b. This dramatic decrease can be attributed to the Friday prayer, since most stores do not open on Friday before the prayer, and the residents of Riyadh tend to postpone their commutes until after the prayer. The interactivity of this visualization helped in obtaining these findings, since it allows the user to select which weekday(s) to show.
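The data behind such a radar chart can be sketched as a table of counts per weekday and hour. The sketch assumes each accident carries a timestamp; the function name weekday_hour_profile is hypothetical, and the weekday index follows Python's convention (Monday is 0, so Friday is 4).

```python
from datetime import datetime

HOURS_PER_DAY = 24
DAYS_PER_WEEK = 7


def weekday_hour_profile(timestamps):
    """Return a 7 x 24 table: profile[weekday][hour] = number of accidents."""
    profile = [[0] * HOURS_PER_DAY for _ in range(DAYS_PER_WEEK)]
    for ts in timestamps:
        profile[ts.weekday()][ts.hour] += 1
    return profile


# Illustrative call with a made-up timestamp (a Friday morning).
example = weekday_hour_profile([datetime(2014, 1, 10, 8, 30)])
```

Each row of the table corresponds to one radar series with 24 spokes, so selecting which weekdays to show amounts to choosing which rows to plot.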

Fig. 7. Spatio-temporal visualization: number of accidents on King Fahad road

4.4 Spatio-Temporal Visualizations

The two main attributes in the crash records dataset are accident location and time. Spatio-temporal visualizations utilize both attributes to discover hidden temporal patterns that might appear for a given location [23]. In other words, they allow the users to compare data in different timeframes and observe the changes, if any.

To investigate the safety of each road segment in the city, each accident was mapped to the nearest segment in the road network dataset obtained from the ADA. Moreover, the temporal dimension is added, where each month can be investigated separately using the scroll bar. The road segments are then visualized, where the number of accidents that occurred on each segment in the selected month is indicated by its color.

Adding both the spatial and the temporal dimensions enables the user to identify the temporal variations in the number of accidents on each segment. For example, Figs. 7a and b show a spot on King Fahad road where accidents have occurred sporadically, with more than 15 accidents in November 2014 and a notable decrease in the following month. On the other hand, Figs. 8a and b show that the number of accidents on Khurais road was consistent between August and October 2014.
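The mapping of accidents to their nearest road segment, followed by a per-month count, can be sketched as below. This is a simplified illustration and not the platform's implementation: coordinates are assumed to be planar (already projected), every road segment is a straight line between two end points, and the names used are hypothetical. A brute-force search over all segments is shown for clarity; a spatial index would be needed at city scale.

```python
from collections import Counter


def point_segment_distance_sq(p, a, b):
    """Squared distance from point p to the line segment from a to b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        t = 0.0  # degenerate segment: a and b coincide
    else:
        # Projection of p onto the line, clamped to the segment.
        t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    cx, cy = ax + t * dx, ay + t * dy
    return (px - cx) ** 2 + (py - cy) ** 2


def monthly_segment_counts(accidents, segments):
    """Count accidents per (segment_id, (year, month)).

    accidents: iterable of ((x, y), (year, month)) pairs.
    segments: dict mapping segment_id to a pair of (x, y) end points.
    """
    counts = Counter()
    for point, month in accidents:
        nearest = min(
            segments,
            key=lambda sid: point_segment_distance_sq(point, *segments[sid]),
        )
        counts[(nearest, month)] += 1
    return counts
```

Selecting a month with the scroll bar then corresponds to filtering these counts by their (year, month) key and coloring each segment by its count.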

Fig. 8. Spatio-temporal visualization: number of accidents on Khurais road

4.5 Multivariate Visualizations

Traffic datasets contain different types of attributes besides location and time. Multivariate data visualization is used when the purpose of the visualization is to find correlations between several attributes of interest. In this visualization type, attributes are not limited to temporal and spatial ones, but can also include numerical, categorical, and textual attributes [26].

Fig. 9. Multivariate visualization

In this case study, the traffic data is layered over the crash data to investigate the interrelationships between these layers. The grid heat-map shown in Fig. 9 represents traffic accidents in the city between 7 and 10 in the morning in the month of December, whereas the road heat-map represents the volume of traffic in the same period. Figure 9 shows that traffic accidents are concentrated in regions where traffic volume is higher. This correlation would help the GDT in the placement of traffic monitoring sites.
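One way to check such a visual impression numerically, which is not part of the analysis reported here, is to compute a correlation coefficient between the two layers. The sketch below assumes, hypothetically, that both layers have been aggregated onto the same set of grid cells, giving two equally long lists of accident counts and traffic volumes; it then computes the Pearson correlation coefficient.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

A value close to 1 would support the observation that accidents concentrate where traffic volume is higher, although it would not by itself establish a causal link.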

Despite the advantages of multivariate visualizations, stacking several layers in one representation may confuse users [26]. Moreover, the ordering of layers has a major impact on the expressiveness of the visualization [16]. Different orderings lead to different conclusions, but no principle for arranging layers has been established so far [26].

5 Conclusions

In this paper, different visualization techniques for traffic data are explored and utilized to gain insights about traffic accidents in Riyadh. These insights include how the number of accidents changes throughout the day and how it varies across the days of the week, how the number of pedestrian fatalities is distributed spatially, and how the number of accidents correlates with significant events throughout the year.

As the case study demonstrates, the interactive visual analytics in the SaudiTraffic platform can help users answer questions related to integrated transit systems in general, and traffic accidents specifically. It also helps in understanding the semantics of complex and multivariate data, giving the user the ability to control different variables to uncover hidden patterns and subtle insights.