1 Introduction and Literature Review

More than 1.2 million fatalities and 50 million injuries were recorded globally in 2015, making road traffic accidents a leading cause of death. Fatalities and injuries from traffic accidents cause, on average, an estimated 3% loss in GDP on a global scale [1]. In the U.S., in 2015, there were 35,092 fatalities and 2.44 million injuries resulting from 6.3 million reported traffic accidents causing an estimated 1.9% loss in GDP [2]. Accident severity has a significant impact on the economic costs associated with an accident. Fatalities, injuries, and property damage accidents cost, on average, $1.5 million, $80,700, and $9,300, respectively per accident [3]. These economic costs account for wage and productivity losses, medical expenses, administrative expenses, and motor vehicle damage including the value of damage to property.

A popular area in traffic safety analysis is the identification of factors that increase the likelihood of an accident occurring. These factors typically include non-behavioral factors affecting accident frequency such as geometric variables (e.g., horizontal and vertical road alignments, or the immediate physical environment), traffic characteristics (e.g., hourly volume, annual average daily traffic, composition of vehicle types), and environmental conditions (e.g., road surface conditions, light conditions, or weather conditions) for a specific road type (e.g., highway, intersection, rural road) [4,5,6,7,8]. However, accident severity presents a less studied and understood aspect of safety in transportation systems [6]. Studying the factors of the conditional distribution of accident severity (i.e., probability of accident severity, given that an accident has occurred) allows for gaining additional insights into true motorist behavior (e.g., speeding, driving under the influence, sleep deprivation, cell phone use) as well as the interactions of the behavior with environmental and roadway characteristics. The factors influencing accident severity may therefore have different interdependencies and characteristics than those influencing the likelihood of accident itself. The main focus of this paper is therefore on the interaction between motorist behavior, environmental and roadway conditions and their impact on accident severity.

Studies of the factors that contribute to and cause motor vehicle accidents with different levels of severity rely on either a univariate or a multivariate approach. The former approach aims to investigate the effect of a single factor on accident severity. In contrast, the latter approach considers the effect of a multitude of factors and their interactions on accident severity. For instance, certain studies investigated the effect of gender on accident severity and found that accidents involving men are more severe [9, 10]. Other studies investigated the relationship between alcohol use and accident severity and found that fatality rates increase dramatically with drinking and driving [11, 12]. Other factors studied in isolation include driver distraction, seat-belt usage, driver age, and lighting and weather conditions to name a few [13,14,15,16]. The univariate approach introduces potential ambiguity and bias in severity analyses, prompting the majority of recent studies to employ multivariate analyses to incorporate the influence of a multitude of factors on accident severity. Many factors are typically included in these multivariate studies to develop injury severity models. However, many of these studies choose to concentrate on a subset of traffic data limited to a particular accident type, certain road segments, or vehicle types [6, 17,18,19]. Typically, this approach is followed in order to obtain accurate prediction models from a somewhat homogenous dataset [20].

In recent years, various methodologies and statistical techniques have been developed and applied to model accident injury severity. In general, methodologies and techniques used to model accident severity can be classified into three groups: (1) discrete choice models, (2) soft computing techniques, and (3) data mining techniques. Discrete choice models include logit and probit models. The logit model aims to describe the relationship of one or more independent variables to an outcome variable. Ouyang et al. [21] implemented a binary logit model (BLM) to investigate the simultaneity of injury severity outcomes in multi-vehicle crashes and found that high-speed curve designs decrease the injury severity of car-truck collisions. Malyshkina and Mannering [22] analyzed the accident injury severity for two vehicles or fewer using a multinomial logit model (MNL) and found that adverse weather conditions correlate with severe injuries. Chen et al. [17] developed a hybrid approach to combine multinomial logit models and Bayesian network methods to analyze driver injury severities in rear-end crashes and found that factors such as windy weather conditions and truck involvement increase accident severity. Probit models address certain limitations of logit models. They can incorporate random variation and any pattern of substitution, and do not suffer from the multinomial logit’s assumption of independence of irrelevant alternatives (IIA) [23]. Xie et al. [24] analyzed accident injury severity using a Bayesian ordered probit model and found it an effective method to combine information contained in the data with the prior knowledge of the parameters. Mujalli and de Oña [25] provided a comprehensive summary of other types of discrete choice models used to model accident injury severity and address different methodological issues for certain datasets. These models include hierarchical logit, heteroskedastic logit, ordered logit, mixed logit, and ordered probit models [19, 26,27,28,29]. A major drawback of discrete choice models is the predefined underlying relationships between variables (e.g., linear relation), which may lead to errors in injury severity estimation if these assumptions are violated in the dataset.

To add flexibility to their models, some researchers have exploited soft computing techniques in accident injury severity applications. These techniques include the artificial neural network (ANN), genetic algorithm (GA) and recurrent neural network (RNN). Delen et al. [20] used a series of ANNs to model the potentially non-linear relationships between the injury severity levels and causal factors and found that no single factor by itself appears to significantly influence accident injury severity. Kunt et al. [30] compared the performance of ANN and GA in predicting accident severity outcome and found that ANN outperformed GA. Sameen and Pradhan [31] developed a RNN model to predict the injury severity of traffic accidents and found its performance superior to that of ANNs. The performance of soft computing techniques, in general, is highly dependent on complete data and typically cannot incorporate prior knowledge or expert opinion.

Recently, increased attention has been directed at data mining techniques such as Bayesian networks to model accident severity as a result of increasing data availability and computational resources. Bayesian networks, which are graphical models of interactions between a set of variables, have been used in a number of traffic crash and modeling studies [17, 32,33,34,35]. For instance, Simoncic [32] developed a Bayesian network to model road accidents involving two cars incorporating several factors for both vehicles such as seatbelt use, alcohol use, and driver experience. De Oña et al. [33] used Bayesian networks to identify significant factors and analyze the severity of traffic accidents on rural highways by classifying accidents as slightly injured or killed/severely injured. De Oña et al. [34] employed Bayesian networks to model traffic accident injury severity on Spanish rural highways. Mujalli and De Oña [35] analyzed traffic accident injury severity on two-lane highways using Bayesian networks. Zong et al. [36] compared Bayesian networks and regression models and concluded that Bayesian networks outperformed regression models. However, variables related to driver characteristics, which have an impact on injury severity, were not included due to limited data. Other traffic modeling applications for Bayesian networks include the identification of traffic conditions by estimating traffic accident risks and the analysis of highway safety performance [37, 38]. The review of the literature suggests that most studies addressing accident injury severity prediction are narrow in scope and rely on a highly homogeneous dataset. In addition, the literature review indicates that no study has modeled, or laid out a plan to model, accident severity for autonomous or semi-autonomous vehicles.

The studies involving Bayesian networks provide insights into the typical applications in traffic accident modeling and analysis. Various features of Bayesian networks enable them to model traffic accident situations. These networks can capture interdependencies and statistical associations between dependent and independent variables that ultimately affect the predicted outcome. The directed acyclic graph (DAG) defines the network structure for a Bayesian network and the conditional probability distributions (CPD) define the quantitative relationship between variables. The network structure and CPD do not require any specified assumptions about variables. In addition, complete data, i.e., where all variable values are specified for a given observation, are not necessarily required for these models. The network predicts and infers probabilistically, conditional on the evidence provided for variables. Bayesian networks are also capable of incorporating prior knowledge and can predict more than a single output node. Furthermore, Bayesian networks are useful when uncertainty is present regarding the correlation between variables and their combined influence on the predicted outcome.

This study aims to develop a Bayesian network to discover patterns from a non-homogeneous accident dataset in order to estimate the severity of an accident, should it occur. The severity of an accident injury is classified into two categories, namely, property damage and injury/fatality. The framework consists of a Bayesian network that integrates pre-crash information including driver behavior, geometric features, and environmental features. Contrary to previous studies, in this work, the domain is not narrowed to a particular accident type, road segment, or vehicle type. State-wide accident data from Michigan for 2016 are used to train, validate, and test the Bayesian network [39]. In addition, the impact of balancing the data is investigated because accident datasets are typically heavily unbalanced with respect to the number of fatal accidents compared to those causing property damage or injury. As an extension to the tested model, a general framework based on Bayesian networks for modeling accident injury severity for various levels of autonomy in vehicles is presented for future calibration and testing as autonomous vehicle technology data become more readily available.

The rest of the paper is organized as follows. In Sect. 2, we provide an overview of Bayesian networks and the performance metrics used to evaluate these networks in classification applications. In Sect. 3, we discuss the data set used in the numerical study. In Sect. 4, a Bayesian network for the driver and autonomous networks is developed. The driver network is trained with and without data balancing, and the corresponding results and insights are discussed. In addition, a detailed discussion is provided on incorporating the factors influencing accident severity in the autonomous network and ways to calibrate these factors are specified. Lastly, we conclude in Sect. 5.

2 Methodological Approach

In this section, we discuss Bayesian networks and their applicability in modeling traffic accident injury severity. In addition, we present various indicators used to measure the performance of Bayesian networks in classification applications.

2.1 Bayesian Networks

Bayesian networks have gained increasing popularity in recent years and are employed in the modeling process for various applications where expert knowledge is important, including traffic analyses, medicine, bio-informatics, and image processing [40]. The Bayesian network in this study acts as a classifier to analyze the worst injury severity in a traffic accident based on factors identified from the literature and from expert knowledge.

Let \( S = \{ X_{1}, \ldots, X_{n} \}, n \ge 1 \) denote a set of variables representing the nodes in the Bayesian network. Let the network structure \( B_{S} \) denote a DAG over the set \( S \), and let \( B_{p} = \{ p(X_{i} | pa(X_{i})), X_{i} \in S \} \) for \( i = 1, 2, \ldots, n \) denote a set of conditional probability distributions, where \( pa(X_{i}) \) is the set of parents of \( X_{i} \) in the Bayesian network structure \( B_{S} \). Edges in the DAG represent the relationship between parent and child nodes. These edges indicate causality, dependence and independence, based on graph theory. Therefore, a Bayesian network represents the joint probability distribution \( P(S) = \prod_{X_{i} \in S} p(X_{i} | pa(X_{i})) \). Using the network to classify injury severity amounts to predicting an outcome \( y \) with a model trained on a dataset \( T \) that contains multiple instances of \( (X, y) \). To use the network as a classifier, the value of \( y \) that maximizes \( P(y|X) \) is selected, i.e., \( \text{argmax}_{y} P(y|X) \).
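As an illustration of the factorization and the argmax rule, the following minimal Python sketch evaluates the joint probability as a product of conditional probabilities and selects the most probable severity class. The three-node structure, the variable names, and all probability values are hypothetical and are not taken from the network developed in this study; the sketch also assumes that every non-target node is observed.

```python
from math import prod

# Hypothetical toy DAG (not the study's network): severity -> surface, severity -> alcohol.
PARENTS = {"severity": (), "surface": ("severity",), "alcohol": ("severity",)}
CLASSES = ("pd", "inj")  # property damage, injury/fatality

# CPT[node][parent_values][value] = p(value | parent_values); all numbers are made up.
CPT = {
    "severity": {(): {"pd": 0.8, "inj": 0.2}},
    "surface": {
        ("pd",): {"dry": 0.6, "wet": 0.4},
        ("inj",): {"dry": 0.75, "wet": 0.25},
    },
    "alcohol": {
        ("pd",): {"no": 0.95, "yes": 0.05},
        ("inj",): {"no": 0.8, "yes": 0.2},
    },
}


def joint(assignment):
    """P(S): product of p(X_i | pa(X_i)) over all nodes for a full assignment."""
    return prod(
        CPT[node][tuple(assignment[p] for p in PARENTS[node])][assignment[node]]
        for node in PARENTS
    )


def posterior(evidence, target="severity"):
    """P(target | evidence), assuming every non-target node is observed."""
    scores = {y: joint({**evidence, target: y}) for y in CLASSES}
    total = sum(scores.values())
    return {y: s / total for y, s in scores.items()}


def classify(evidence):
    """argmax_y P(y | X)."""
    post = posterior(evidence)
    return max(post, key=post.get)


print(classify({"surface": "dry", "alcohol": "yes"}))
```

With partially observed evidence, the unobserved nodes would have to be summed out before normalizing, which this sketch omits for brevity.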

The structure and parameters, i.e., conditional probability distributions of a Bayesian network, can be determined in a number of ways depending on the application [41, 42]. One popular approach is to learn the parameters of a Bayesian network, given the structure, usually using maximum likelihood (ML) estimation. The other popular approach is to employ a score-based approach to learn the structure of the network. Employing a scoring metric requires complete data for all variables in the network. Typical scoring methods include the Bayesian information criterion (BIC), structure entropy, Akaike information criterion (AIC), and Bayes metric. In this study, the structure is specified using expert knowledge and the parameters are learned using the ML method. In addition, the BIC metric is used to justify the network structure from a finite set of alternative network configurations.
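The two steps described above, ML parameter estimation for a fixed structure and BIC-based comparison of candidate structures, can be sketched as follows for discrete variables with complete data. This is a minimal sketch under stated assumptions, not the implementation used in this study: the function names, the two-variable example data, and the sign convention (BIC written as log-likelihood minus a penalty, so that higher is better) are choices made here for illustration.

```python
from collections import Counter
from math import log


def fit_cpt(records, node, parents):
    """ML estimate for complete discrete data: p(x | pa) = count(x, pa) / count(pa)."""
    joint_counts = Counter((tuple(r[p] for p in parents), r[node]) for r in records)
    parent_counts = Counter(tuple(r[p] for p in parents) for r in records)
    return {(pa, x): n / parent_counts[pa] for (pa, x), n in joint_counts.items()}


def bic(records, structure, cardinality):
    """BIC = log-likelihood - 0.5 * log(N) * (free parameters); higher is better here."""
    n = len(records)
    loglik, n_params = 0.0, 0
    for node, parents in structure.items():
        cpt = fit_cpt(records, node, parents)
        loglik += sum(log(cpt[(tuple(r[p] for p in parents), r[node])]) for r in records)
        parent_configs = 1
        for p in parents:
            parent_configs *= cardinality[p]
        n_params += (cardinality[node] - 1) * parent_configs
    return loglik - 0.5 * log(n) * n_params


# Hypothetical data in which b always equals a.
records = [{"a": 0, "b": 0}] * 2 + [{"a": 1, "b": 1}] * 2
cardinality = {"a": 2, "b": 2}
with_edge = {"a": (), "b": ("a",)}   # structure a -> b
without_edge = {"a": (), "b": ()}    # structure with no edge

print(bic(records, with_edge, cardinality), bic(records, without_edge, cardinality))
```

In this example the structure with the edge scores higher, because the gain in likelihood outweighs the penalty for the extra parameter.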

2.2 Performance Evaluation Indicators

Several indicators are typically used to evaluate the performance of a Bayesian network used in classification applications [33]. In this study, accuracy, recall, precision, balanced F-score, receiver operating characteristic (ROC) score, and the geometric mean of recall and specificity are presented as classifier performance indicators [43]. The accuracy, which calculates the proportion of correctly classified cases, provides an evaluation of the classifier’s overall performance. However, for heavily unbalanced datasets, this metric may often be very high without necessarily indicating good performance. Recall and precision evaluate the classifier’s ability to correctly distinguish between the property damage and injury/fatality cases. The F-score incorporates the trade-off between precision and recall and is used to evaluate the overall accident injury severity classification performance of the Bayesian network. Specifically, the balanced F-score is the harmonic mean of precision and recall. Furthermore, the ROC score, unlike the F-score, does not give equal weight to precision and recall. Classifiers with ROC curves with an area under curve (AUC) of 0.5 are considered to be no better than guessing, whereas an AUC of 1.00 describes a perfect classifier. Lastly, the geometric mean of recall and specificity provides an overall performance evaluation metric, which is particularly useful for unbalanced datasets.
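The indicators above, with the exception of the ROC score, follow directly from the four cells of a binary confusion matrix. The sketch below uses the standard definitions; treating property damage as the positive class and the example counts are assumptions made here for illustration. The example is a hypothetical classifier that labels every case in an 815/185 sample as property damage, which shows why accuracy and F-score alone can be misleading on unbalanced data.

```python
from math import sqrt


def classification_metrics(tp, fp, fn, tn):
    """Standard binary metrics; assumes every denominator below is non-zero."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)     # true negative rate
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        # Balanced F-score: harmonic mean of precision and recall.
        "f_score": 2 * precision * recall / (precision + recall),
        # Geometric mean of recall and specificity.
        "g_mean": sqrt(recall * specificity),
    }


# Hypothetical case: everything is predicted as property damage (positive class).
always_pd = classification_metrics(tp=815, fp=185, fn=0, tn=0)
for name, value in always_pd.items():
    print(f"{name}: {value:.3f}")
```

Here accuracy and F-score are both high, while the geometric mean is zero because no injury/fatality case is recognized.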

3 Data

Accident data were obtained from the Michigan Traffic Crash Facts (MTCF) website for the year 2016 [39]. The dataset consists of state-wide data for all types of accidents and vehicles on a public traffic way in Michigan resulting in injury, death, or at least $1,000 in property damage. The dataset consists of 81.5% property damage cases and 18.5% injury/fatality cases. In this study, 14,000 accident records were used, where accidents causing injury and death are grouped together due to the low frequency of fatal accidents.

4 Results and Discussion

The following section presents the results obtained from a numerical study. First, the Bayesian network for the driver and autonomous networks is presented. Next, the driver network is trained with and without balancing, and the corresponding results are discussed. Lastly, a detailed discussion is provided on incorporating the factors affecting accident severity in the autonomous network.

4.1 Driver and Autonomous Networks

As discussed, the objectives of this study include predicting accidents as property damage or injury/fatality accidents and identifying factors that influence accident injury severity. In addition, an extension of the driver network to the autonomous case, referred to as the ‘autonomous’ network, is presented. Factors included in the driver network are based on previous studies, expert knowledge and the availability of new data such as the distractedness of the driver [17, 33, 34]. The autonomous factors are identified with the aid of recent crashes involving semi-autonomous vehicles and industry experts. The factors included in the driver network are: characteristics of the accident (time, day of week, accident type, total number of involved vehicles, speed limit, worst injury in accident); weather information (weather and lighting conditions); driver behavior (alcohol and drug use, distracted); road characteristics (surface conditions, geometry, class, and zone); and the immediate physical environment as the accident occurred (pedestrian or deer involved). The factors included in the autonomous network are GPS accuracy, quality of sensor readings, quality of traffic signs, as well as V2I and V2V communication. Table 1 provides information on the factors included for the driver and autonomous networks developed in this study.

Table 1. The definition of variables for the driver and autonomous networks

As discussed, the structure of the network is specified using expert knowledge and parameters are estimated using ML. In addition, the Bayesian information criterion (BIC), or Schwarz criterion, is employed for the model structure to justify selection among a finite set of possible model structures. Figure 1 presents the Bayesian network that includes the driver and autonomous networks. The autonomous network naturally also includes the factors of the driver network.

Fig. 1. Bayesian network indicating the driver and autonomous networks. The driver network is indicated with solid lines and the autonomous factors are indicated with dashed lines. Refer to Table 1 for node terminology.

4.2 Results

The 14,000 accident records were split into the following sets: 10,000 records for training, 2,000 records for validation, and 2,000 records for testing. To reduce training time, the training set of 10,000 records was split into ten equally sized, balanced sets with 500 property damage and 500 injury/fatality cases to train ten Bayesian networks. Two methods, namely, ensembling and majority vote, were then evaluated as ways to combine the results of the ten trained models. Specifically, in the ensembling method, the ten conditional probability distributions obtained from the trained models were averaged to obtain a single model that was used to predict and classify accident injury severity for the validation set. In the majority vote method, the ten models were each used to predict and classify accident injury severity, and the majority vote from the ten predictions was used as the final prediction. In both methods, in the case of a tie, a random number is generated; the class is assigned to property damage if this number is greater than 0.185, and to the injury/fatality class otherwise. The threshold of 0.185 is selected based on the fraction of injury/fatality accidents in the overall dataset. Results for the two methods on the validation set indicate that the majority vote method outperformed the ensembling method on both the F-score and accuracy. The majority vote method had an F-score and accuracy of 0.85 and 0.76, respectively, as opposed to an F-score of 0.76 and accuracy of 0.66 for the ensembling method. Hence, the majority vote was used for the remaining analyses.
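The two combination methods and the tie rule can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the code used in the study: the function names and the dictionary representation of a conditional probability table are hypothetical, and the tie rule reflects one reading of the description above, namely that a uniform random draw greater than 0.185 assigns the property damage class.

```python
import random
from collections import Counter

TIE_THRESHOLD = 0.185  # fraction of injury/fatality cases in the overall dataset


def average_cpts(cpts):
    """Ensembling: element-wise mean of CPTs that share the same keys."""
    return {key: sum(c[key] for c in cpts) / len(cpts) for key in cpts[0]}


def break_tie(rng=random):
    """Assumed tie rule: a uniform draw above the threshold gives property damage."""
    return "pd" if rng.random() > TIE_THRESHOLD else "inj"


def majority_vote(votes, rng=random):
    """Majority vote over per-model class labels ('pd' or 'inj')."""
    counts = Counter(votes)
    if counts["pd"] != counts["inj"]:
        return "pd" if counts["pd"] > counts["inj"] else "inj"
    return break_tie(rng)


# Hypothetical predictions from ten trained models for a single accident record.
votes = ["pd"] * 6 + ["inj"] * 4
print(majority_vote(votes))
```

Passing the random source as an argument keeps the tie rule reproducible, for example by supplying a seeded `random.Random` instance.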

Specifically, the majority vote method was used on the test set for both balanced and unbalanced training sets. The balanced case provides promising results as illustrated in Table 2. The F-score of 0.79 indicates that the network classified accidents as property damage and injury/fatality with high precision and recall. The AUC for the ROC curve for the testing dataset was 0.62, illustrating that the trained model discovers correlation between variables in an effective manner in order to classify the accident severity [44]. The geometric mean of recall and specificity of 0.6 illustrates that the model was able to predict both classes of severity. In addition, the high precision indicates that the Bayesian network was able to classify the majority of property damage cases correctly. Table 2 also illustrates the test results for an unbalanced training set with 815 property damage and 185 fatal/injury cases per training set. As expected, the unbalanced case overfits the data for the property damage class resulting in a high F-score, but lacks the ability to correctly predict fatal/injury cases as evident from the ROC score and geometric mean of 0.58 and 0.46, respectively.

Table 2. Test results for balanced and unbalanced training sets

The relationship between variables in a Bayesian network enables probability inference analyses based on conditional probability distributions for all factors included in the network. By setting evidence for specific variables, the contribution of factors to accident injury severity is quantified. More specifically, the difference between the probabilities of accident injury severity outcomes with and without evidence for a particular level of a given factor, in the absence of any further evidence for the other factors, determines the impact of that particular factor level on the outcomes. Table 3 illustrates the inference results for variables significantly influencing accident injury severity. Specifically, Table 3 presents the ten most influential factor levels where the corresponding percentages are the percentage difference under the factor level, averaged over the ten trained Bayesian network models using balanced data.
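The impact measure described above can be illustrated on a small joint distribution by comparing the probability of an injury/fatality outcome with and without evidence on one factor level. In the sketch below, the two-variable joint table and its values are hypothetical, and the impact is taken to be the absolute difference between the two probabilities, which is one possible reading of the percentage difference reported in Table 3.

```python
VARIABLES = ("surface", "severity")

# Hypothetical joint distribution over (surface, severity); values are made up.
JOINT = {
    ("dry", "pd"): 0.25,
    ("dry", "inj"): 0.35,
    ("snow", "pd"): 0.25,
    ("snow", "inj"): 0.15,
}


def _matches(row, conditions):
    return all(row[VARIABLES.index(name)] == value for name, value in conditions.items())


def prob(query, evidence=None):
    """P(query | evidence) by enumeration over the joint table."""
    evidence = evidence or {}
    denominator = sum(p for row, p in JOINT.items() if _matches(row, evidence))
    numerator = sum(p for row, p in JOINT.items() if _matches(row, {**evidence, **query}))
    return numerator / denominator


def impact(factor, level, outcome="inj"):
    """P(outcome | factor = level) - P(outcome), with no other evidence set."""
    return prob({"severity": outcome}, {factor: level}) - prob({"severity": outcome})


for level in ("dry", "snow"):
    print(level, round(impact("surface", level), 4))
```

A positive value means that observing the factor level raises the probability of an injury/fatality outcome relative to the baseline, and a negative value means that it lowers it.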

Table 3. Probability inference results for variables affecting injury severities

As seen in Table 3, the inference results indicate that a dry road surface condition is associated with the highest increase, approximately 11.2%, in the probability that a crash results in fatality/injury. This is mainly because dry road conditions are typically associated with higher travelling speeds, which can contribute to an increase in the severity if an accident occurs. Furthermore, snowy road surface and poor weather conditions are associated with a lower likelihood that an accident results in fatality/injury and therefore a higher likelihood that it causes property damage if it occurs. These results are consistent with previous studies that highlight the significance of factors influencing accident injury severity [20, 34].

4.3 Bayesian Network Extension

With reference to the autonomous network in Fig. 1, indicated by the dashed lines, the following discussion aims to provide justification for the selection of the factors that may influence accident injury severity and ways to calibrate these variables as data become more readily available for autonomous vehicles. It is envisioned that the manufacturer data collected during recent autonomous technology studies, such as the Safety Pilot Study conducted by University of Michigan Transportation Research Institute (UMTRI), will enable the calibration of the following factors.

Experts across various industries anticipate that the gradual introduction of autonomous driving technologies will lead to safer and more efficient roadways. Advocates argue that, in the future, autonomous cars will reduce traffic accidents by up to 90% [45, 46]. Autonomous vehicles will continuously monitor the environment using sensors, cameras, radars, and a global positioning system (GPS) which ultimately feeds information to the control software that enables autonomous tasks such as autonomous emergency braking (AEB), adaptive cruise control (ACC), lane keeping assist, and automatic parking [47, 48].

During the transition to an autonomous vehicle network and while the technology is still developing, care needs to be taken to ensure that all factors contributing to accidents and increased severity for accidents involving autonomous vehicles are understood. In addition, new risks related to autonomous driving technologies seem inevitable. For instance, an area of autonomous technology in vehicles that requires significant research attention is the concept of networking. Networking comes in the forms of vehicle to vehicle (V2V) and vehicle to infrastructure (V2I) communication. However, this area of autonomous technology introduces certain dangers in the form of cyber-security risks. In general, anticipated risks for autonomous and semi-autonomous vehicles include sensor failure, the injection of false messages, or contrasting information received from various sources. Below, the calibration of factors influencing accident injury severity in various levels of connected autonomous vehicles is discussed.

GPS Accuracy and Availability.

Autonomous vehicles use GPS, radar, and sensor technology to direct and steer the vehicle without the assistance of the driver. However, certain factors influence the accuracy of these readings and may therefore introduce imprecise information that could cause unsafe conditions when used for controlling the vehicle. For instance, GPS accuracy is affected by the presence of tall buildings and dense urban areas [49, 50]. It is likely that the lack of accurate information about the vehicle position may lead to accidents and possibly increase their severity under certain conditions. In the future, data for this factor can be acquired by generating maps that describe the strength of the GPS signal, in order to estimate the GPS accuracy and availability for a journey.

Quality of Traffic Signs and Lane Markings.

The quality of traffic signs and lane markings significantly impacts the steering capabilities of autonomous vehicles. For instance, in 2017 a Tesla Model S drove into a highway barrier at a construction zone in Dallas [51]. This occurred as a result of sensors failing to recognize the roadway markings and traffic signs to merge into another lane. The road construction and warning signs were poorly implemented. To calibrate this factor in the Bayesian network, the quality of traffic signs and lane markings can be approximated using classification models trained separately for this purpose. In addition, manufacturer data can be used to estimate the ability of hardware and software to recognize various traffic signs and lane markings.

Quality of Sensor Messages.

Autonomous vehicles depend on information received from sensors and radars. Therefore, if a sensor is obstructed or interfered with, the software controlling the steering of the vehicle may receive the wrong information or no information at all [52]. For instance, if mud covers a sensor, bright lights or electronic interference affect sensor readings, or if malicious information finds its way into the software, the vehicle and its passengers will be at risk. This factor can be calibrated by using manufacturer data regarding the ability of sensors to transmit the correct information under the mentioned circumstances.

Vehicle to Infrastructure (V2I) Communication.

V2I communication refers to the exchange of information between the vehicle and its surrounding infrastructure. For instance, traffic information such as speed limits, traffic light readings, and traffic reports is communicated to the vehicle [53]. If incorrect information is obtained by the car, or incorrect information is transmitted by the roadside units (RSUs), dangerous situations can arise. This factor in the Bayesian network can be calibrated using a combination of manufacturer and city infrastructure data.

Vehicle to Vehicle (V2V) Communication.

V2V communication is the exchange of information between vehicles. For instance, emergency brake warning, forward collision warning, and intersection movement assistance will be common in autonomous vehicles and could potentially revolutionize vehicle safety [45]. However, if incorrect/imprecise information is obtained from surrounding or oncoming traffic, dangerous situations may arise. This factor can be calibrated in the future as more field studies that include V2V technology are conducted. An example of such studies is the Safety Pilot project at the University of Michigan, where about 3,000 vehicles in the Ann Arbor region were equipped with communication equipment.

5 Conclusions

This paper develops a Bayesian network to estimate the severity of an accident, should it occur, for all types of accidents, road conditions, and road segments, as a function of pre-crash information. The generic Bayesian network is developed for both non-autonomous and autonomous vehicles which will share the road for the foreseeable future. A model is trained for non-autonomous vehicles using state-wide data from Michigan for the year 2016. Results indicate that the Bayesian network performs well in the classification of accident injury severity, particularly when training sets are balanced to avoid favoring the more represented group, i.e., property damage accidents. The F-score of 0.79, area under the ROC curve of 0.62, and geometric mean of recall and specificity of 0.6 illustrate that the trained model discovers correlation between causal and contributing variables in an effective manner in order to classify the accident injury severity. Furthermore, discussions are presented on the calibration and testing of the autonomous model in the future as data from autonomous vehicle technologies become more readily available. It is anticipated that the developed methodology would assist the development of countermeasures to decrease accident severity and improve traffic safety performance.

The study is subject to certain limitations. The data set used in the numerical analysis is limited to the Michigan area for the year 2016. Hence, it should be noted that the reported results and insights are only valid for this area and time period. Further analyses including other areas over an extended time period are required to ensure generalizability. Furthermore, due to the low frequency of fatal accidents and limited information regarding the severity of crashes in the data set, only two levels were used to model crash severity in the numerical study. In future research efforts, a more granular approach to modeling the severity of accidents may be employed if more detailed data is acquired. Further research efforts can be employed to improve the performance of the Bayesian network and investigate the correlation between factors contributing to accident injury severity. Additional factors contributing to accidents can be included in the network and a variety of structure scoring methods can be utilized to improve the classification performance. Lastly, the autonomous Bayesian network can be calibrated and tested as data becomes more readily available.