A Path Towards Understanding Factors Affecting Crash Severity in Autonomous Vehicles Using Current Naturalistic Driving Data

van Wyk, Franco; Khojandi, Anahita; Masoud, Neda

doi:10.1007/978-3-030-29513-4_8

Conference paper
First Online: 24 August 2019

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1038)

Abstract

In the U.S., in 2015 alone, there were approximately 35,000 fatalities and 2.4 million injuries caused by an estimated 6.3 million traffic accidents. In the future, it is speculated that automated systems will help to avoid or decrease the number and severity of accidents. However, before such a time, a broad range of vehicles, from non-autonomous to fully-autonomous, will share the road. Hence, measures need to be put in place to improve both safety and efficiency, while not compromising the advantages of autonomous driving technology. In this study, a Bayesian network model is developed to predict the severity of an accident, should it occur, given the road and the immediate environment conditions. The model is calibrated for the case of traditional vehicles using pre-crash information on driver behaviour, road surface conditions, weather and lighting conditions, among other variables, to predict two categories of consequences for accidents, namely property damage and injury/fatality. The results demonstrate that the proposed methodology and the determinant factors used in the models can predict the consequences of an accident, and more importantly, the probability of a crash causing injury/fatality, with high accuracy. Approaches to extend this model are proposed to predict accident severity for autonomous vehicles through leveraging their sensor data. Such a model would assist the development of countermeasures to identify the most important factors impacting severity of accidents for semi- and fully-autonomous vehicles to prevent accidents, decrease accident severity in cases where accidents are bound to occur, and improve transportation safety in general.

1 Introduction and Literature Review

More than 1.2 million fatalities and 50 million injuries were recorded globally in 2015, making road traffic accidents a leading cause of death. Fatalities and injuries from traffic accidents cause, on average, an estimated 3% loss in GDP on a global scale [1]. In the U.S., in 2015, there were 35,092 fatalities and 2.44 million injuries resulting from 6.3 million reported traffic accidents causing an estimated 1.9% loss in GDP [2]. Accident severity has a significant impact on the economic costs associated with an accident. Fatalities, injuries, and property damage accidents cost, on average, $1.5 million, $80,700, and $9,300, respectively per accident [3]. These economic costs account for wage and productivity losses, medical expenses, administrative expenses, and motor vehicle damage including the value of damage to property.

A popular area in traffic safety analysis is the identification of factors that increase the likelihood of an accident occurring. These factors typically include non-behavioral factors affecting accident frequency such as geometric variables (e.g., horizontal and vertical road alignments, or the immediate physical environment), traffic characteristics (e.g., hourly volume, annual average daily traffic, composition of vehicle types), and environmental conditions (e.g., road surface conditions, light conditions, or weather conditions) for a specific road type (e.g., highway, intersection, rural road) [4,5,6,7,8]. However, accident severity is a less studied and less understood aspect of safety in transportation systems [6]. Studying the factors of the conditional distribution of accident severity (i.e., probability of accident severity, given that an accident has occurred) allows for gaining additional insights into true motorist behavior (e.g., speeding, driving under the influence, sleep deprivation, cell phone use) as well as the interactions of that behavior with environmental and roadway characteristics. The factors influencing accident severity may therefore have different interdependencies and characteristics than those influencing the likelihood of an accident itself. The main focus of this paper is therefore on the interaction between motorist behavior, environmental and roadway conditions and their impact on accident severity.

Studies of the factors that contribute to and cause motor vehicle accidents with different levels of severity rely on either a univariate or a multivariate approach. The former approach aims to investigate the effect of a single factor on accident severity. In contrast, the latter approach considers the effect of a multitude of factors and their interactions on accident severity. For instance, certain studies investigated the effect of gender on accident severity and found that accidents involving men are more severe [9, 10]. Other studies investigated the relationship between alcohol use and accident severity and found that fatality rates increase dramatically with drinking and driving [11, 12]. Other factors studied in isolation include driver distraction, seat-belt usage, driver age, and lighting and weather conditions, to name a few [13,14,15,16]. The univariate approach introduces potential ambiguity and bias in severity analyses, prompting the majority of recent studies to employ multivariate analyses to incorporate the influence of a multitude of factors on accident severity. Many factors are typically included in these multivariate studies to develop injury severity models. However, many of these studies concentrate on a subset of traffic data limited to a particular accident type, certain road segments, or vehicle types [6, 17,18,19]. Typically, this approach is followed in order to obtain accurate prediction models from a somewhat homogeneous dataset [20].

In recent years, various methodologies and statistical techniques have been developed and applied to model accident injury severity. In general, methodologies and techniques used to model accident severity can be classified into three groups: (1) discrete choice models, (2) soft computing techniques, and (3) data mining techniques. Discrete choice models include logit and probit models. The logit model aims to describe the relationship of one or more independent variables to an outcome variable. Ouyang et al. [21] implemented a binary logit model (BLM) to investigate the simultaneity of injury severity outcomes in multi-vehicle crashes and found that high speed curve designs decrease the injury severity of car-truck collisions. Malyshkina and Mannering [22] analyzed the accident injury severity for two vehicles or fewer using a multinomial logit model (MNL) and found that adverse weather conditions correlate with severe injuries. Chen et al. [17] developed a hybrid approach to combine multinomial logit models and Bayesian network methods to analyze driver injury severities in rear-end crashes and found that factors such as windy weather conditions and truck-involvement increase accident severity. Probit models address certain limitations of logit models. They can incorporate random variation and any pattern of substitution, and do not suffer from the multinomial logit’s assumption of independence of irrelevant alternatives (IIA) [23]. Xie et al. [24] analyzed accident injury severity using a Bayesian ordered probit model and found it an effective method to combine information contained in the data with the prior knowledge of the parameters. Mujalli and de Oña [25] provided a comprehensive summary of other types of discrete choice models used to model accident injury severity and address different methodological issues for certain datasets. These models include hierarchical logit, heteroskedastic logit, ordered logit, mixed logit, and ordered probit models [19, 26,27,28,29]. A major drawback of discrete choice models is the predefined underlying relationships between variables (e.g., linear relation), which may lead to errors in injury severity estimation if these assumptions are violated in the dataset.

To add flexibility to their models, some researchers have exploited soft computing techniques in accident injury severity applications. These techniques include the artificial neural network (ANN), genetic algorithm (GA) and recurrent neural network (RNN). Delen et al. [20] used a series of ANNs to model the potentially non-linear relationships between the injury severity levels and causal factors and found that no single factor by itself appears to significantly influence accident injury severity. Kunt et al. [30] compared the performance of ANN and GA in predicting accident severity outcome and found that ANN outperformed GA. Sameen and Pradhan [31] developed an RNN model to predict the injury severity of traffic accidents and found its performance superior to that of ANNs. The performance of soft computing techniques, in general, is highly dependent on complete data, and these techniques typically cannot incorporate prior knowledge or expert opinion.

Recently, increased attention has been directed at data mining techniques such as Bayesian networks to model accident severity as a result of increasing data availability and computational resources. Bayesian networks, which are graphical models of interactions between a set of variables, have been used in a number of traffic crash and modeling studies [17, 32,33,34,35]. For instance, Simoncic [32] developed a Bayesian network to model road accidents involving two cars incorporating several factors for both vehicles such as seatbelt use, alcohol use, and driver experience. De Oña et al. [33] used Bayesian networks to identify significant factors and analyze the severity of traffic accidents on rural highways by classifying accidents as slightly injured or killed/severely injured. De Oña et al. [34] employed Bayesian networks to model traffic accident injury severity on Spanish rural highways. Mujalli and De Oña [35] analyzed traffic accident injury severity on two-lane highways using Bayesian networks. Zong et al. [36] compared Bayesian networks and regression models and concluded that Bayesian networks outperformed regression models. However, variables related to driver characteristics, which have an impact on injury severity, were not included due to limited data. Other traffic modeling applications for Bayesian networks include the identification of traffic conditions by estimating traffic accident risks and the analysis of highway safety performance [37, 38]. The review of the literature suggests that most studies addressing accident injury severity prediction are narrow in scope and rely on a highly homogeneous dataset. In addition, the literature review indicates that no study has modeled, or laid out a plan to model, accident severity for autonomous or semi-autonomous vehicles.

The studies involving Bayesian networks provide insights into the typical applications in traffic accident modeling and analysis. Various features of Bayesian networks enable them to model traffic accident situations. These networks can capture interdependencies and statistical associations between dependent and independent variables that ultimately affect the predicted outcome. The directed acyclic graph (DAG) defines the network structure for a Bayesian network and the conditional probability distributions (CPD) define the quantitative relationship between variables. The network structure and CPD do not require any specified assumptions about variables. In addition, complete data, i.e., where all variable values are specified for a given observation, are not necessarily required for these models. The network predicts and infers probabilistically, conditional on the evidence provided for variables. Bayesian networks are also capable of incorporating prior knowledge and can predict more than a single output node. Furthermore, Bayesian networks are useful when uncertainty is present regarding the correlation between variables and their combined influence on the predicted outcome.

This study aims to develop a Bayesian network to discover patterns from a non-homogeneous accident dataset in order to estimate the severity of an accident, should it occur. The severity of an accident injury is classified into two categories, namely, property damage and injury/fatality. The framework consists of a Bayesian network that integrates pre-crash information including driver behavior, geometric features, and environmental features. Contrary to previous studies, in this work, the domain is not narrowed to a particular accident type, road segment, or vehicle type. State-wide accident data from Michigan in 2016 is used to train, validate, and test the Bayesian network [39]. In addition, the impact of balancing the data is investigated, because accident datasets are typically heavily unbalanced with respect to the number of fatal accidents compared to those causing property damage or injury. As an extension to the tested model, a general framework based on Bayesian networks for modeling accident injury severity for various levels of autonomy in vehicles is presented for future calibration and testing as autonomous vehicle technology data become more readily available.

The rest of the paper is organized as follows. In Sect. 2, we provide an overview of Bayesian networks and the performance metrics used to evaluate these networks in classification applications. In Sect. 3, we discuss the data set used in the numerical study. In Sect. 4, a Bayesian network comprising the driver and autonomous networks is developed. The driver network is trained with and without data balancing, and the corresponding results and insights are discussed. In addition, a detailed discussion is provided on incorporating the factors influencing accident severity in the autonomous network, and ways to calibrate these factors are specified. Lastly, we conclude in Sect. 5.

2 Methodological Approach

In this section, we discuss Bayesian networks and their applicability in modeling traffic accident injury severity. In addition, we present various indicators used to measure the performance of Bayesian networks in classification applications.

2.1 Bayesian Networks

Bayesian networks have gained increasing popularity in recent years and are employed in the modeling process for various applications where expert knowledge is important, including traffic analyses, medicine, bio-informatics, and image processing [40]. The Bayesian network in this study acts as a classifier to analyze the worst injury severity in a traffic accident based on factors identified from the literature and from expert knowledge.

Let $ S = \left\{ {X_{1} , \ldots , X_{n} } \right\}, n \ge 1 $ denote a set of variables representing the nodes in the Bayesian network. Let the network structure $ B_{S} $ denote a DAG over the set $ S $, and let $ B_{p} = \left\{ p\left( X_{i} \mid pa\left( X_{i} \right) \right) : X_{i} \in S \right\} $ for $ i = 1, 2, \ldots , n $ denote a set of conditional probability distributions, where $ pa\left( X_{i} \right) $ is the set of parents of $ X_{i} $ in the Bayesian network structure $ B_{S} $. Edges in the DAG represent the relationship between parent and child nodes. These edges indicate causality, dependence, and independence, based on graph theory. Therefore, a Bayesian network represents the joint probability distribution $ P\left( S \right) = \prod_{X_{i} \in S} p\left( X_{i} \mid pa\left( X_{i} \right) \right) $. Using the network to classify injury severity amounts to predicting an outcome $ y $ with a model trained on a dataset $ T $ that contains multiple instances of $ \left( X, y \right) $. To use the network as a classifier, the value of $ y $ that maximizes $ P(y \mid X) $ is selected, i.e., $ \operatorname{argmax}_{y} P(y \mid X) $.
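The factorization and the argmax rule above can be illustrated with a minimal Python sketch. The three-node network (road surface and alcohol use as parents of severity), the variable names, and every probability value below are hypothetical and chosen only for illustration; they are not the network or the estimates used in this study.

```python
# Hypothetical three-node network: Surface -> Severity <- Alcohol.
# All probabilities are made-up illustrative values, not estimates from the paper.

# Distributions of the two parent nodes (they have no parents themselves).
p_surface = {"dry": 0.7, "snowy": 0.3}
p_alcohol = {"no": 0.95, "yes": 0.05}

# CPD of the class node given its parents: p(severity | surface, alcohol).
p_severity = {
    ("dry", "no"): {"property": 0.80, "injury": 0.20},
    ("dry", "yes"): {"property": 0.55, "injury": 0.45},
    ("snowy", "no"): {"property": 0.90, "injury": 0.10},
    ("snowy", "yes"): {"property": 0.70, "injury": 0.30},
}

CLASSES = ("property", "injury")


def joint(surface, alcohol, severity):
    """Joint probability as the product of each node's CPD given its parents."""
    return (
        p_surface[surface]
        * p_alcohol[alcohol]
        * p_severity[(surface, alcohol)][severity]
    )


def classify(surface, alcohol):
    """Return argmax_y P(y | X) together with the full posterior over y."""
    scores = {y: joint(surface, alcohol, y) for y in CLASSES}
    total = sum(scores.values())
    posterior = {y: score / total for y, score in scores.items()}
    best = max(posterior, key=posterior.get)
    return best, posterior


label, posterior = classify("dry", "yes")
```

Because all parents of the class node are observed in this toy case, the posterior reduces to the corresponding row of the severity CPD; with unobserved parents, the unobserved variables would have to be summed out of the joint first.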

The structure and parameters, i.e., conditional probability distributions of a Bayesian network, can be determined in a number of ways depending on the application [41, 42]. One popular approach is to learn the parameters of a Bayesian network, given the structure, usually using maximum likelihood (ML) estimation. The other popular approach is to employ a score-based approach to learn the structure of the network. Employing a scoring metric requires complete data for all variables in the network. Typical scoring methods include the Bayesian information criterion (BIC), structure entropy, Akaike information criterion (AIC), and Bayes metric. In this study, the structure is specified using expert knowledge and the parameters are learned using the ML method. In addition, the BIC metric is used to justify the network structure from a finite set of alternative network configurations.
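As a rough illustration of the two steps described above, the following Python sketch estimates a CPD by maximum likelihood (relative frequencies) and scores two candidate structures with BIC. The eight records, the variable names, and the choice of the "log-likelihood minus penalty" form of BIC (where higher is better) are assumptions made for this example; the study's actual data, tooling, and candidate structures are not shown here.

```python
import math
from collections import Counter

# Toy complete dataset of (surface, severity) records; values are invented.
records = [
    ("dry", "injury"), ("dry", "property"), ("dry", "property"),
    ("dry", "injury"), ("snowy", "property"), ("snowy", "property"),
    ("snowy", "property"), ("snowy", "injury"),
]


def ml_cpd(data):
    """ML estimate of p(child | parent): relative frequencies per parent value."""
    pair_counts = Counter(data)
    parent_counts = Counter(parent for parent, _ in data)
    return {
        (parent, child): count / parent_counts[parent]
        for (parent, child), count in pair_counts.items()
    }


def log_likelihood(data, cpd):
    """Log-likelihood of the child column given the parent column."""
    return sum(math.log(cpd[row]) for row in data)


def bic(log_lik, n_free_params, n_records):
    """BIC in the 'higher is better' form: fit minus a complexity penalty."""
    return log_lik - 0.5 * n_free_params * math.log(n_records)


# Structure A: surface -> severity (one free parameter per surface level).
cpd_edge = ml_cpd(records)
bic_edge = bic(log_likelihood(records, cpd_edge), 2, len(records))

# Structure B: no edge, modeled by giving every record the same dummy parent.
no_edge_data = [("all", severity) for _, severity in records]
cpd_no_edge = ml_cpd(no_edge_data)
bic_no_edge = bic(log_likelihood(no_edge_data, cpd_no_edge), 1, len(records))
```

Only the severity node is scored, since the surface node contributes the same term to both structures. The structure with the higher score would be preferred among the candidates considered.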

2.2 Performance Evaluation Indicators

Several indicators are typically used to evaluate the performance of a Bayesian network used in classification applications [33]. In this study, accuracy, recall, precision, balanced F-score, receiver operating characteristic (ROC) score, and the geometric mean of recall and specificity are presented as classifier performance indicators [43]. The accuracy, which is the proportion of correctly classified cases, provides an evaluation of the classifier’s overall performance. However, for heavily unbalanced datasets, this metric may be very high without indicating good performance. Recall and precision evaluate the classifier’s ability to correctly distinguish between the property damage and injury/fatality cases. The F-score captures the trade-off between precision and recall and is used to evaluate the overall accident injury severity classification performance of the Bayesian network; the balanced F-score is the harmonic mean of precision and recall and thus weights them equally. The ROC curve, unlike the F-score, is not built from precision: it plots recall (the true positive rate) against the false positive rate as the classification threshold varies. Classifiers whose ROC curves have an area under the curve (AUC) of 0.5 are considered no better than guessing, whereas an AUC of 1.00 describes a perfect classifier. Lastly, the geometric mean of recall and specificity provides an overall performance evaluation metric, which is particularly useful for unbalanced datasets.
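The threshold-free indicators above follow directly from the four confusion-matrix counts, as in the minimal Python sketch below. The function name and the example counts are hypothetical, the sketch assumes all denominators are non-zero, and the AUC is omitted because it requires predicted scores rather than counts.

```python
import math


def classification_metrics(tp, fp, fn, tn):
    """Compute the count-based indicators from confusion-matrix entries.

    tp/fn count positive cases predicted correctly/incorrectly;
    tn/fp count negative cases predicted correctly/incorrectly.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)        # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    f_score = 2 * precision * recall / (precision + recall)  # harmonic mean
    g_mean = math.sqrt(recall * specificity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f_score": f_score,
        "g_mean": g_mean,
    }


# Invented counts for an unbalanced test set of 2,000 cases in which the
# majority class is treated as the positive class (an assumption).
metrics = classification_metrics(tp=1600, fp=330, fn=30, tn=40)
```

With these invented counts, accuracy and F-score are high while the geometric mean is low, which is the pattern the text warns about for unbalanced data.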

3 Data

Accident data were obtained from the Michigan Traffic Crash Facts (MTCF) website for the year 2016 [39]. The dataset consists of state-wide data for all types of accidents and vehicles on a public traffic way in Michigan resulting in injury, death, or at least $1,000 in property damage. The dataset consists of 81.5% property damage cases and 18.5% injury/fatality cases. In this study, 14,000 accident records were used, where accidents causing injury and death are grouped together due to the low frequency of fatal accidents.

4 Results and Discussion

This section presents the results obtained from a numerical study. First, the Bayesian network comprising the driver and autonomous networks is presented. Next, the driver network is trained with and without balancing, and the corresponding results are discussed. Lastly, a detailed discussion is provided on incorporating the factors affecting accident severity in the autonomous network.

4.1 Driver and Autonomous Networks

As discussed, the objectives of this study are to classify accidents as property damage or injury/fatality accidents and to identify factors that influence accident injury severity. In addition, an extension of the driver model to the autonomous case, referred to as the ‘autonomous’ network, is presented. Factors included in the driver network are based on previous studies, expert knowledge, and the availability of new data such as the distractedness of the driver [17, 33, 34]. The autonomous factors are identified with the aid of industry experts and recent crashes involving semi-autonomous vehicles. The factors included in the driver network are: characteristics of the accident (time, day of week, accident type, total number of involved vehicles, speed limit, worst injury in accident); weather information (weather and lighting conditions); driver behavior (alcohol and drug use, distracted); road characteristics (surface conditions, geometry, class, and zone); and the immediate physical environment as the accident occurred (pedestrian or deer involved). The factors included in the autonomous network include GPS accuracy, quality of sensor readings, quality of traffic signs, as well as V2I and V2V communication. Table 1 provides information on the factors included for the driver and autonomous networks developed in this study.

Table 1. The definition of variables for the driver and autonomous networks

As discussed, the structure of the network is specified using expert knowledge and parameters are estimated using ML. In addition, the Bayesian information criterion (BIC), or Schwarz criterion, is employed to justify the selected model structure among a finite set of possible model structures. Figure 1 presents the Bayesian network that includes the driver and autonomous networks. The autonomous network naturally also includes the factors of the driver network.

4.2 Results

The 14,000 accident records were split into the following sets: 10,000 records for training, 2,000 records for validation, and 2,000 records for testing. To reduce training time, the training set of 10,000 records was split into ten equally sized, balanced sets with 500 property damage and 500 injury/fatality cases to train ten Bayesian networks. Two methods, namely, ensembling and majority vote, were then compared to select an approach for combining the results of the ten trained models. Specifically, in the ensembling method, the ten conditional probability distributions obtained from the trained models were averaged to obtain a single model that was used to predict and classify accident injury severity for the validation set. In the majority vote method, the ten models were each used to predict and classify accident injury severity. The majority vote from the ten predictions was then used as the ultimate prediction. In both methods, in the case of a tie, a random number is generated where the class is assigned to property damage if the class probability is greater than 0.185, and assigned to the injury/fatality class otherwise. The threshold of 0.185 is selected based on the fraction of injury/fatality accidents in the overall dataset. Results for the two methods on the validation set indicate that the majority vote method outperformed the ensemble method on both the F-score and accuracy. The majority vote method had an F-score and accuracy of 0.85 and 0.76, respectively, as opposed to an F-score of 0.76 and accuracy of 0.66 for the ensemble method. Hence, the majority vote was used for the remaining analyses.
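The two combination schemes can be sketched in Python as follows. The function names and data layout are hypothetical, and the tie-breaking branch encodes one possible reading of the rule described above (a number drawn from [0, 1) is compared against 0.185); the text does not spell out the rule in enough detail to confirm that this matches the procedure used in the study.

```python
from collections import Counter

# Fraction of injury/fatality cases in the overall dataset, as stated in the text.
TIE_THRESHOLD = 0.185


def majority_vote(votes, tie_draw):
    """Combine per-model class labels into one prediction.

    `votes` holds one label per trained model ("property" or "injury").
    `tie_draw` is a number in [0, 1) that is consulted only when the vote is tied.
    """
    counts = Counter(votes)
    if counts["property"] > counts["injury"]:
        return "property"
    if counts["injury"] > counts["property"]:
        return "injury"
    # Tie: assumed interpretation of the rule in the text.
    return "property" if tie_draw > TIE_THRESHOLD else "injury"


def average_cpds(cpds):
    """Ensembling: average the same CPD table across models, entry by entry."""
    keys = cpds[0].keys()
    return {key: sum(cpd[key] for cpd in cpds) / len(cpds) for key in keys}


# Ten hypothetical model outputs for one validation record.
example_votes = ["property"] * 6 + ["injury"] * 4
prediction = majority_vote(example_votes, tie_draw=0.5)
```

In practice `tie_draw` would come from a random number generator such as `random.random()`; passing it in explicitly keeps the sketch deterministic and easy to check.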

Specifically, the majority vote method was used on the test set for both balanced and unbalanced training sets. The balanced case provides promising results as illustrated in Table 2. The F-score of 0.79 indicates that the network classified accidents as property damage and injury/fatality with high precision and recall. The AUC for the ROC curve for the testing dataset was 0.62, illustrating that the trained model discovers correlation between variables in an effective manner in order to classify the accident severity [44]. The geometric mean of recall and specificity of 0.6 illustrates that the model was able to predict both classes of severity. In addition, the high precision indicates that the Bayesian network was able to classify the majority of property damage cases correctly. Table 2 also illustrates the test results for an unbalanced training set with 815 property damage and 185 fatal/injury cases per training set. As expected, the unbalanced case overfits the data for the property damage class resulting in a high F-score, but lacks the ability to correctly predict fatal/injury cases as evident from the ROC score and geometric mean of 0.58 and 0.46, respectively.

Table 2. Test results for balanced and unbalanced training sets

The relationship between variables in a Bayesian network enables probability inference analyses based on conditional probability distributions for all factors included in the network. By setting evidence for specific variables, the contribution of factors to accident injury severity is quantified. More specifically, the difference between the probabilities of accident injury severity outcomes with and without evidence for a particular level of a given factor, in the absence of any further evidence for the other factors, determines the impact of that particular factor level on the outcomes. Table 3 illustrates the inference results for variables significantly influencing accident injury severity. Specifically, Table 3 presents the ten most influential factor levels where the corresponding percentages are the percentage difference under the factor level, averaged over the ten trained Bayesian network models using balanced data.
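The inference step described above can be sketched in Python by enumerating a small joint distribution. The two-factor network and all probabilities are invented for illustration, and the sketch reports the absolute change in the injury/fatality probability for a single model; the averaging over the ten trained models is not reproduced here.

```python
from itertools import product

# Hypothetical factor distributions and CPD; values are illustrative only.
p_surface = {"dry": 0.7, "snowy": 0.3}
p_alcohol = {"no": 0.95, "yes": 0.05}
p_injury = {
    ("dry", "no"): 0.20,
    ("dry", "yes"): 0.45,
    ("snowy", "no"): 0.10,
    ("snowy", "yes"): 0.30,
}


def prob_injury(evidence=None):
    """P(injury | evidence), summing out every factor without evidence."""
    evidence = evidence or {}
    numerator = 0.0
    denominator = 0.0
    for surface, alcohol in product(p_surface, p_alcohol):
        state = {"surface": surface, "alcohol": alcohol}
        # Skip joint states that contradict the evidence.
        if any(state[name] != value for name, value in evidence.items()):
            continue
        weight = p_surface[surface] * p_alcohol[alcohol]
        denominator += weight
        numerator += weight * p_injury[(surface, alcohol)]
    return numerator / denominator


def factor_effect(variable, level):
    """Change in P(injury) when evidence is set for one factor level only."""
    return prob_injury({variable: level}) - prob_injury()


dry_effect = factor_effect("surface", "dry")
snowy_effect = factor_effect("surface", "snowy")
```

A positive effect means that setting the evidence raises the injury/fatality probability relative to the no-evidence baseline, and a negative effect means that it lowers it.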

Table 3. Probability inference results for variables affecting injury severities

As seen in Table 3, the inference results indicate that a dry road surface condition is associated with the largest increase, approximately 11.2%, in the probability that a crash results in injury/fatality. This is mainly because dry road conditions are typically associated with higher travelling speeds, which can contribute to an increase in severity if an accident occurs. Furthermore, snowy road surface and poor weather conditions are associated with a lower likelihood of an accident resulting in injury/fatality and therefore a higher likelihood that it causes only property damage if it occurs. These results are consistent with previous studies that highlight the significance of factors influencing accident injury severity [20, 34].

4.3 Bayesian Network Extension

With reference to the autonomous network in Fig. 1, indicated by the dashed lines, the following discussion aims to justify the selection of the factors that may influence accident injury severity and to outline ways to calibrate these variables as data become more readily available for autonomous vehicles. It is envisioned that the manufacturer data collected during recent autonomous technology studies, such as the Safety Pilot Study conducted by the University of Michigan Transportation Research Institute (UMTRI), will enable the calibration of the following factors.

Experts across various industries anticipate that the gradual introduction of autonomous driving technologies will lead to safer and more efficient roadways. Advocates argue that, in the future, autonomous cars will reduce traffic accidents by up to 90% [45, 46]. Autonomous vehicles will continuously monitor the environment using sensors, cameras, radars, and a global positioning system (GPS) which ultimately feeds information to the control software that enables autonomous tasks such as autonomous emergency braking (AEB), adaptive cruise control (ACC), lane keeping assist, and automatic parking [47, 48].

During the transition to an autonomous vehicle network and while the technology is still developing, care needs to be taken to ensure that all factors contributing to accidents and increased severity for accidents involving autonomous vehicles are understood. In addition, new risks related to autonomous driving technologies seem inevitable. For instance, an area of autonomous technology in vehicles that requires significant research attention is the concept of networking. Networking comes in the forms of vehicle to vehicle (V2V) and vehicle to infrastructure (V2I) communication. However, this area of autonomous technology introduces certain dangers in the form of cyber-security risks. In general, anticipated risks for autonomous and semi-autonomous vehicles include sensor failure, the injection of false messages, or contrasting information received from various sources. Below, the calibration of factors influencing accident injury severity in various levels of connected autonomous vehicles is discussed.

GPS Accuracy and Availability.

Autonomous vehicles use GPS, radar, and sensor technology to direct and steer the vehicle without the assistance of the driver. However, certain factors influence the accuracy of these readings and may therefore introduce imprecise information that could cause unsafe conditions when used for controlling the vehicle. For instance, GPS accuracy is affected by the presence of tall buildings and dense urban areas [49, 50]. It is likely that the lack of accurate information about the vehicle position may lead to accidents and, possibly, increase their severity under certain conditions. Data for this factor can be acquired in the future by generating maps that describe the strength of the GPS signal, in order to estimate the GPS accuracy and availability for a journey.

Quality of Traffic Signs and Lane Markings.

The quality of traffic signs and lane markings significantly impacts the steering capabilities of autonomous vehicles. For instance, in 2017 a Tesla Model S drove into a highway barrier at a construction zone in Dallas [51]. This occurred as a result of sensors failing to recognize the roadway markings and traffic signs to merge into another lane. The road construction and warning signs were poorly implemented. To calibrate this factor in the Bayesian network, the quality of traffic signs and lane markings can be approximated using classification models trained separately for this purpose. In addition, manufacturer data can be used to estimate the ability of hardware and software to recognize various traffic signs and lane markings.

Quality of Sensor Messages.

Autonomous vehicles depend on information received from sensors and radars. Therefore, if a sensor is obstructed or interfered with, the software controlling the steering of the vehicle may receive the wrong information or no information at all [52]. For instance, if mud covers a sensor, if bright lights or electronic interference affect sensor readings, or if malicious information finds its way into the software, the vehicle and its passengers will be at risk. This factor can be calibrated by using manufacturer data regarding the ability of sensors to transmit the correct information under the mentioned circumstances.

Vehicle to Infrastructure (V2I) Communication.

V2I communication refers to the exchange of information between the vehicle and its surrounding infrastructure. For instance, traffic information such as speed limits, traffic light readings, and traffic reports is communicated to the vehicle [53]. If incorrect information is obtained by the car, or incorrect information is transmitted by the roadside units (RSUs), dangerous situations can arise. This factor in the Bayesian network can be calibrated using a combination of manufacturer and city infrastructure data.

Vehicle to Vehicle (V2V) Communication.

V2V communication is the exchange of information between vehicles. For instance, emergency brake warning, forward collision warning, and intersection movement assistance will be common in autonomous vehicles and could potentially revolutionize vehicle safety [45]. However, if incorrect/imprecise information is obtained from surrounding or oncoming traffic, dangerous situations may arise. This factor can be calibrated in the future as more field studies that include V2V technology are conducted. An example of such studies is the Safety Pilot project at the University of Michigan, where about 3,000 vehicles in the Ann Arbor region were equipped with communication equipment.

5 Conclusions

This paper develops a Bayesian network to estimate the severity of an accident, should it occur, for all types of accidents, road conditions, and road segments, as a function of pre-crash information. The generic Bayesian network is developed for both non-autonomous and autonomous vehicles which will share the road for the foreseeable future. A model is trained for non-autonomous vehicles using state-wide data from Michigan for the year 2016. Results indicate that the Bayesian network performs well in the classification of accident injury severity, particularly when training sets are balanced to avoid favoring the more represented group, i.e., property damage accidents. The F-score of 0.79, area under the ROC curve of 0.62, and geometric mean of recall and specificity of 0.6 illustrate that the trained model discovers correlation between causal and contributing variables in an effective manner in order to classify the accident injury severity. Furthermore, discussions are presented on the calibration and testing of the autonomous model in the future as data from autonomous vehicle technologies become more readily available. It is anticipated that the developed methodology would assist the development of countermeasures to decrease accident severity and improve traffic safety performance.

The study is subject to certain limitations. The data set used in the numerical analysis is limited to the state of Michigan for the year 2016. Hence, the reported results and insights are only valid for this area and time period. Further analyses covering other areas over an extended time period are required to ensure generalizability. Furthermore, due to the low frequency of fatal accidents and the limited information regarding crash severity in the data set, only two levels were used to model crash severity in the numerical study. In future research, a more granular approach to modeling accident severity may be employed if more detailed data are acquired. Further research can also improve the performance of the Bayesian network and investigate the correlation between factors contributing to accident injury severity. Additional factors contributing to accidents can be included in the network, and a variety of structure scoring methods can be used to improve classification performance. Lastly, the autonomous Bayesian network can be calibrated and tested as data become more readily available.

References

World Health Organization: Global status report on road safety 2015. World Health Organization (2015)
National Highway Traffic Safety Administration: 2015, Motor vehicle crashes: overview. Traffic Saf. Facts Res. Note 2016, 1–9 (2016)
National Safety Council: Injury Facts: Library of Congress Catalog Card Number: 99-74142, 2015 edn, Itasca, IL (2015)
Mannering, F.L., Grodsky, L.L.: Statistical analysis of motorcyclists’ perceived accident risk. Accid. Anal. Prev. 27(1), 21–31 (1995)
Howard, M.E., Desai, A.V., Grunstein, R.R., Hukins, C., Armstrong, J.G., Joffe, D., Swann, P., Campbell, D.A., Pierce, R.J.: Sleepiness, sleep-disordered breathing, and accident risk factors in commercial vehicle drivers. Am. J. Respir. Crit. Care Med. 170(9), 1014–1021 (2004)
Shankar, V., Mannering, F., Barfield, W.: Effect of roadway geometrics and environmental factors on rural freeway accident frequencies. Accid. Anal. Prev. 27(3), 371–389 (1995)
Lord, D., Mannering, F.: The statistical analysis of crash-frequency data: a review and assessment of methodological alternatives. Transp. Res. Part A: Policy Pract. 44(5), 291–305 (2010)
Chang, L.Y., Chen, W.C.: Data mining of tree-based models to analyze freeway accident frequency. J. Saf. Res. 36(4), 365–375 (2005)
Hayakawa, H., Fischbeck, P.S., Fischhoff, B.: Traffic accident statistics and risk perceptions in Japan and the United States. Accid. Anal. Prev. 32(6), 827–835 (2000)
Valent, F., Schiava, F., Savonitto, C., Gallo, T., Brusaferro, S., Barbone, F.: Risk factors for fatal road traffic accidents in Udine, Italy. Accid. Anal. Prev. 34(1), 71–84 (2002)
Zajac, S.S., Ivan, J.N.: Factors influencing injury severity of motor vehicle–crossing pedestrian crashes in rural Connecticut. Accid. Anal. Prev. 35(3), 369–379 (2003)
Keall, M.D., Frith, W.J., Patterson, T.L.: The influence of alcohol, age and number of passengers on the night-time risk of driver fatal injury in New Zealand. Accid. Anal. Prev. 36(1), 49–61 (2004)
Derrig, R.A., Segui-Gomez, M., Abtahi, A., Liu, L.L.: The effect of population safety belt usage rates on motor vehicle-related fatalities. Accid. Anal. Prev. 34(1), 101–110 (2002)
Yannis, G., Golias, J., Papadimitriou, E.: Driver age and vehicle engine size effects on fault and severity in young motorcyclist accidents. Accid. Anal. Prev. 37(2), 327–333 (2005)
Edwards, J.B.: The relationship between road accident severity and recorded weather. J. Saf. Res. 29(4), 249–262 (1999)
Neyens, D.M., Boyle, L.N.: The influence of driver distraction on the severity of injuries sustained by teenage drivers and their passengers. Accid. Anal. Prev. 40(1), 254–259 (2008)
Chen, C., Zhang, G., Tarefder, R., Ma, J., Wei, H., Guan, H.: A multinomial logit model-Bayesian network hybrid approach for driver injury severity analyses in rear-end crashes. Accid. Anal. Prev. 80, 76–88 (2015)
Yau, K.K.: Risk factors affecting the severity of single vehicle traffic accidents in Hong Kong. Accid. Anal. Prev. 36(3), 333–340 (2004)
Milton, J.C., Shankar, V.N., Mannering, F.L.: Highway accident severities and the mixed logit model: an exploratory empirical analysis. Accid. Anal. Prev. 40(1), 260–266 (2008)
Delen, D., Sharda, R., Bessonov, M.: Identifying significant predictors of injury severity in traffic accidents using a series of artificial neural networks. Accid. Anal. Prev. 38(3), 434–444 (2006)
Ouyang, Y., Shankar, V., Yamamoto, T.: Modeling the simultaneity in injury causation in multivehicle collisions. Transp. Res. Rec.: J. Transp. Res. Board 1784, 143–152 (2002)
Malyshkina, N.V., Mannering, F.L.: Markov switching multinomial logit model: an application to accident-injury severities. Accid. Anal. Prev. 41(4), 829–838 (2009)
Train, K.E.: Discrete Choice Methods with Simulation. Cambridge University Press, Cambridge (2009)
Xie, Y., Zhang, Y., Liang, F.: Crash injury severity analysis using Bayesian ordered probit models. J. Transp. Eng. 135(1), 18–25 (2009)
Mujalli, R.O., de Oña, J.: Injury severity models for motor vehicle accidents: a review. In: Proceedings of the Institution of Civil Engineers (2013)
Daniels, S., Brijs, T., Nuyts, E., Wets, G.: Externality of risk and crash severity at roundabouts. Accid. Anal. Prev. 42(6), 1966–1973 (2010)
Wang, X., Kockelman, K.: Use of heteroscedastic ordered logit model to study severity of occupant injury: distinguishing effects of vehicle weight and type. Transp. Res. Rec.: J. Transp. Res. Board 1908, 195–204 (2005)
Jin, Y., Wang, X., Chen, X.: Right-angle crash injury severity analysis using ordered probability models. In: 2010 International Conference on Intelligent Computation Technology and Automation (ICICTA), vol. 3, pp. 206–209. IEEE, May 2010
Zhu, X., Srinivasan, S.: A comprehensive analysis of factors influencing the injury severity of large-truck crashes. Accid. Anal. Prev. 43(1), 49–57 (2011)
Kunt, M.M., Aghayan, I., Noii, N.: Prediction for traffic accident severity: comparing the artificial neural network, genetic algorithm, combined genetic algorithm and pattern search methods. Transport 26(4), 353–366 (2011)
Sameen, M.I., Pradhan, B.: Severity prediction of traffic accidents with recurrent neural networks. Appl. Sci. 7(6), 476 (2017)
Simoncic, M.: A Bayesian network model of two-car accidents. J. Transp. Stat. 7(2/3), 13–25 (2004)
de Oña, J., López, G., Mujalli, R., Calvo, F.J.: Analysis of traffic accidents on rural highways using Latent Class Clustering and Bayesian Networks. Accid. Anal. Prev. 51, 1–10 (2013)
de Oña, J., Mujalli, R.O., Calvo, F.J.: Analysis of traffic accident injury severity on Spanish rural highways using Bayesian networks. Accid. Anal. Prev. 43(1), 402–411 (2011)
Mujalli, R.O., de Oña, J.: A method for simplifying the analysis of traffic accidents injury severity on two-lane highways using Bayesian networks. J. Saf. Res. 42(5), 317–326 (2011)
Zong, F., Xu, H., Zhang, H.: Prediction for traffic accident severity: comparing the Bayesian network and regression models. Math. Probl. Eng. 2013 (2013)
Gregoriades, A., Mouskos, K.C.: Black spots identification through a Bayesian networks quantification of accident risk index. Transp. Res. Part C: Emerg. Technol. 28, 28–43 (2013)
Mbakwe, A.C., Saka, A.A., Choi, K., Lee, Y.-J.: Modeling highway traffic safety in Nigeria using Delphi technique and Bayesian network. In: TRB 93rd Annual Meeting Compendium of Papers, Washington, D.C., p. 21 (2014)
University of Michigan: Michigan Traffic Crash Facts (2017). https://www.michigantrafficcrashfacts.org. Accessed 29 July 2017
Mittal, A. (ed.): Bayesian Network Technologies: Applications and Graphical Models. IGI Global (2007)
Jensen, F.V.: An Introduction to Bayesian Networks, vol. 210, pp. 1–178. UCL Press, London (1996)
Margaritis, D.: Learning Bayesian network model structure from data (No. CMU-CS-03-153). Carnegie-Mellon University, Pittsburgh, PA School of Computer Science (2003)
Powers, D.M.W.: Evaluation: from precision, recall and f-measure to ROC, informedness, markedness and correlation. J. Mach. Learn. Technol. 2(1), 37–63 (2011)
Tape, T.G.: Interpretation of diagnostic tests. Ann. Intern. Med. 135(1), 72 (2001)
Fagnant, D.J., Kockelman, K.: Preparing a nation for autonomous vehicles: opportunities, barriers and policy recommendations. Transp. Res. Part A: Policy Pract. 77, 167–181 (2015)
Silberg, G., Wallace, R., Matuszak, G., Plessers, J., Brower, C., Subramanian, D.: Self-driving cars: the next revolution. White paper, KPMG LLP & Center of Automotive Research, p. 36 (2012)
Yeomans, G.: Autonomous vehicles: handing over control—opportunities and risks for insurance. Lloyd’s (2014)
SAE International: Taxonomy and definitions for terms related to driving automation systems for on-road motor vehicles (2016)
Kummerle, R., Hahnel, D., Dolgov, D., Thrun, S., Burgard, W.: Autonomous driving in a multi-level parking structure. In: 2009 IEEE International Conference on Robotics and Automation, ICRA 2009, pp. 3395–3400. IEEE, May 2009
Schipperijn, J., Kerr, J., Duncan, S., Madsen, T., Klinker, C.D., Troelsen, J.: Dynamic accuracy of GPS receivers for use in health research: a novel method to assess GPS accuracy in real-world settings. Front. Public Health 2, 21 (2014)
Durden, T.: Tesla autopilot crash caught on dashcam (2017). http://www.zerohedge.com/news/2017-03-02/tesla-autopilot-crash-caught-dashcam. Accessed 17 June 2017
National Highway Traffic Safety Administration: Cybersecurity best practices for modern vehicles. Report No. DOT HS 812 333 (2016)
Petit, J., Shladover, S.E.: Potential cyberattacks on automated vehicles. IEEE Trans. Intell. Transp. Syst. 16(2), 546–556 (2015)

Author information

Authors and Affiliations

University of Tennessee, Knoxville, TN, 37996, USA
Franco van Wyk & Anahita Khojandi
University of Michigan, Ann Arbor, MI, 48109, USA
Neda Masoud

Corresponding author

Correspondence to Anahita Khojandi.

Editor information

Editors and Affiliations

School of Computing, Computer Science Research Institute, Ulster University, Newtownabbey, UK
Yaxin Bi
The Science and Information (SAI) Organization, Bradford, West Yorkshire, UK
Rahul Bhatia
The Science and Information (SAI) Organization, Bradford, West Yorkshire, UK
Supriya Kapoor

About this paper

Cite this paper

van Wyk, F., Khojandi, A., Masoud, N. (2020). A Path Towards Understanding Factors Affecting Crash Severity in Autonomous Vehicles Using Current Naturalistic Driving Data. In: Bi, Y., Bhatia, R., Kapoor, S. (eds) Intelligent Systems and Applications. IntelliSys 2019. Advances in Intelligent Systems and Computing, vol 1038. Springer, Cham. https://doi.org/10.1007/978-3-030-29513-4_8

DOI: https://doi.org/10.1007/978-3-030-29513-4_8
Published: 24 August 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29512-7
Online ISBN: 978-3-030-29513-4
eBook Packages: Intelligent Technologies and Robotics (R0)
