1 Introduction
According to the
National Highway Traffic Safety Administration (NHTSA), the number of auto accidents increased from 5.419 million in 2010 to 6.756 million in 2019, an increase of nearly 25% over the decade. Of the 6.756 million accidents in 2019, 41% resulted in injuries or fatalities [1]. This trend is likely to continue as the number of vehicles on the road increases.
Researchers have used a variety of approaches to investigate causes of auto accidents. For example, in a 1980 report, Treat [
2] examined over a five-year period how frequently various human, environmental, and vehicular factors were involved in traffic accidents by studying 13,568 police-reported accidents, of which 2,258 were investigated on-scene by technicians and 420 by a multidisciplinary team. Human errors were identified as definite causes in 70.7% of the accidents, environmental factors in 12.4%, and vehicular factors in 4.5%. In 20% of the cases, no definite cause was identified. A taxonomy of direct human causes was developed based on an information-processing model of the driver as a vehicle controller. Singh [
3] analyzed data from 5,470 crashes occurring between July 3, 2005 and December 31, 2007. Driver, vehicle, and environment-related information was collected at crash scenes as part of the National Motor Vehicle Crash Causation Survey, conducted by the U.S. National Highway Traffic Safety Administration (NHTSA). The last event in the crash causal chain (also known as the critical reason for the crash) was attributed to the driver in 94 percent (±2.2%) of the crashes, to failure or degradation of a vehicle component in 2 percent (±0.7%), and to the environment (slick roads, weather, etc.) in 2 percent (±1.3%). Recognition errors accounted for about 41 percent (±2.1%) of crashes, decision errors for 33 percent (±3.7%), and performance errors for 11 percent (±2.7%). Dingus et al. [
4] used a naturalistic driving dataset of 905 injurious and property damage crashes; they found driver-related factors—such as error, impairment, fatigue, and distraction—in almost 90% of crashes.
Environmental factors including weather conditions (rain, sleet, snow, and fog) and road pavement conditions (wet, snowy/slushy, or icy) can cause major accidents. On average, nearly 5,000 people are killed and over 418,000 people are injured in weather-related crashes each year [
5]. According to NHTSA statistics, the top environmental factors leading to collisions are slick roads (50%) and glare (17%) [
3].
In short, major causes of accidents include human error and environmental factors, with human error accounting for roughly 71–90% [2–5]. Major categories of human error include speeding, distraction, fatigue, and drunk driving. Major categories of environmental factors include road conditions and weather conditions [5]. In this study, we broadened the scope of distraction to include fatigue and drunk driving, and broadened road conditions to include weather conditions, since snow and rain contribute to the degradation of road conditions. Therefore, speeding, distraction, and road conditions were considered primary factors in designing the driving advisor tool described in this study.
2 Literature Review
This section briefly reviews research on advanced driver assistance technologies. In addition, because successful use of these technologies requires human drivers to have appropriate levels of trust in the technology, research related to human drivers’ interactions with driver assistance systems is also reviewed.
Advanced driver assistance systems.
Advanced driver assistance systems (ADAS) are active automotive safety systems that utilize advanced sensors such as cameras, radar, lidar, and map databases; they comprise a hardware layer for sensing and a software layer of intelligence for post-processing and decision making [
6]. ADAS are often classified by the level of automation they achieve, using the
Society of Automotive Engineers (SAE) scale, which ranges from 0 (No automation) to 5 (Full Automation) [
6–
7]. While there has been significant
research and development (R&D) activity at levels 4 and 5, most ADAS currently on the market are between levels 2 and 3 [
6,
8].
Yi et al. [
9] suggest that driving assistance systems can be classified into three categories: (1) safe driving systems—such as adaptive cruise control, lane keeping, collision avoidance—which focus on the vehicle; (2) driver monitoring systems, which monitor drivers and warn them about abnormal driving behaviors and cognitive states; and (3) in-vehicle information systems that provide information and services for the driver, such as directions and traffic conditions. These applications have been implemented using a variety of technologies for sensing and perception (cameras, radar, lidar) and decision-making (artificial intelligence, machine learning and data fusion) [
10–
15].
The bulk of ADAS efforts reported in the literature have focused on enhancing vehicle capabilities with a view toward achieving level 5. However, all currently available ADAS applications require a human driver to be alert and ready to take control if needed; humans will likely continue to be involved in driving for the foreseeable future. At the same time, increasing levels of driving automation introduce new complexities into human interactions with cars and can be a double-edged sword [
16–
18]. For example, studies with level 3 vehicles have found that situations in which the driver must manually take over control from the automated mode
increase collision risk with surrounding vehicles [
13].
ADAS technologies often provide other types of driver assistance in addition to autonomous driving. For example, Tesla's Model X emits a tone or beep to alert drivers when their hands are not on the steering wheel. Honda and Jaguar have projects to detect a driver's mental state based on factors such as facial expressions, voice, heart rate, and respiration rate [
9]. However, Yi et al. [
9] note that these systems are generic—based on models developed from behavior of many different drivers—not personalized to individual drivers.
Needed are adaptive technologies that can help drivers of autonomous vehicles avoid crashes based on multiple real-time data streams. For example, one way of assisting drivers is to provide adaptive speech-based advice as needed, such as telling the driver to speed up, slow down, or stop. This guidance can be based on external factors (such as road or weather conditions), vehicle factors (such as speed and lane keeping), and indicators of the driver's internal state (such as fatigue). Trust in the ADAS can also be a consideration.
Trust. A fundamental issue affecting human interactions with autonomous vehicles is trust [
19]. To successfully interact with an ADAS, a human driver needs to have an appropriate level of trust in the system, known as
calibrated trust [
20]. Too much trust can cause the human to fail to intervene when the system performs incorrectly; too little trust forgoes the system's benefits. If human involvement is required, an ADAS should be able to assess how much trust the driver/operator is placing in the system and consider that trust when determining how to provide driver assistance.
Learned trust is a construct that captures how trust evolves over time from initial introduction of an agent to experiencing interaction with an agent to longer‐term interactions [
21].
Situational trust is the construct that captures how trust changes based on the external environment (i.e., road types, road conditions, traffic, weather) and internal dynamic characteristics of the operator (mood, attentional capacity, self‐confidence) [
21]. Recently, learned and situational trust were specifically mapped onto measures of automated driving [
19].
New research has also mapped how specific task behaviors during automation use, such as operator interventions, verification behaviors, and response time, can correspond to trust behaviors [22]. In the driving domain specifically, braking is a common way to disengage automated driving and parking and thus can serve as an indicator of distrust. Researchers have confirmed this in the lab by using braking frequency and magnitude as indicators of distrust in automated driving styles [23]. Using a real Tesla vehicle, researchers used braking interventions to show that distrust decreased with repeated use of the automated parking system [24] and that distrust decreased more when drivers were shown how the system worked than when they were only told [25].
Previous research has extensively modeled the relationship between trust, reliance, and ultimate use of automated and robotic systems [
16,
20–
22,
24–
27]. Early work focused on the relationship between machine accuracy, operator self-confidence, reliance, and trust [
16,
17,
28–
31]. For example, relative trust (trust – self-confidence) was shown to predict reliance on the automation [
29]. Other work showed that trust in the system was higher when the automation was more accurate or reliable [
30–
Dynamic models of trust calibration covering the stages before and during interaction have been carefully mapped out [20, 21, 27]. Subsequent work endeavored to chart the antecedents of trust in automated and robotic systems, broadly classifying important factors related to the machine, the human, and the environment and context [
26,
33,
34]. Critically, recent work has mapped the theoretical trust concepts originally conceived by Mayer et al. [35] directly onto the many measures of trust (self-report, behavioral, and physiological indices) developed over the last several decades. As indicators of risk taking, behavioral measurements such as interventions, verification behaviors, reliance, and response time can reflect the trust relationship [
22]. A more tailored approach specifically to trust in automated driving was recently detailed [
19].
Assessing trust. A number of techniques have been developed to measure trust in automated systems such as self-driving vehicles and robots [
22,
30,
33,
36,
37]. Survey instruments that collect self-reports of perceptions of trust—such as the
Trust Of Automated Systems Test (TOAST) [
34] and the
Multi-Dimensional Measure of Trust (MDMT) [
38,
39]—have been most commonly used. Physiological measures such as eye-tracking, EEG, and galvanic skin response have also been used [
24,
25,
40,
41]. In addition, when using a vehicle (rather than a simulation), telemetry data such as location, turning, braking, acceleration, and lane keeping can be used to assess driving behavior.
Vehicle telemetry data is the most ecologically valid way of collecting data related to trust, but as of this writing, there have been relatively few reports of telemetry data being used to assess trust in ADAS. Trust in automation of autonomous vehicles can be described for individual features such as autopilot, cruise control, turning, braking, acceleration, and lane keeping. The Tesla Model X controller records and broadcasts this type of information in real-time via its
Controller Area Network (CAN) bus architecture. These data can be accessed via an
On-Board Diagnostics (OBD) port in real time and used to assess whether a driver is under- or over-trusting the vehicle's capabilities. Sensory devices, such as a Tobii eye tracker or the Mobileye vision system, can also be used to detect lane keeping and distracted driving [
19,
42].
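To make this data path concrete, the following minimal sketch shows how raw CAN frames could be captured in Python with the python-can library. The channel name, interface type, and arbitration ID are illustrative placeholders rather than Tesla-specific values; decoding actual frames would additionally require the vehicle's CAN database (DBC) definitions.

```python
# Minimal sketch of capturing raw vehicle telemetry frames with python-can.
# The channel, interface, and arbitration ID below are hypothetical
# placeholders, not Tesla-specific values.
import can

def log_speed_frames(n_frames: int = 100) -> list[bytes]:
    """Capture raw payloads for a hypothetical speed-related frame ID."""
    SPEED_FRAME_ID = 0x155  # placeholder arbitration ID
    payloads = []
    with can.interface.Bus(channel="can0", interface="socketcan") as bus:
        while len(payloads) < n_frames:
            msg = bus.recv(timeout=1.0)  # block up to 1 s for the next frame
            if msg is not None and msg.arbitration_id == SPEED_FRAME_ID:
                payloads.append(bytes(msg.data))
    return payloads
```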
Situation awareness. Situation awareness is an important factor that determines driving performance. For example, a recent computational approach for modeling a driver's intent using naturalistic driving data demonstrated that lane change performance improved for drivers who checked their mirrors for more than six seconds [
43], an approximate measure for situation awareness. Driving with automated vehicles might raise unique issues such as drivers finding themselves “out of the loop” [
44] or with situation awareness being affected while driving with different levels of automated assistance [
45]. A review found that situation awareness can deteriorate during adaptive cruise control and highly automated driving when drivers engage in non-driving-related tasks, but can improve if drivers are motivated, instructed to pay better attention, or receive feedback [
46]. More recent work, investigating driving with real automated vehicles on the road, has demonstrated reduced situation awareness of the automated vehicle, increased complacency, and over-trust in automation [
45,
47].
Summary. The majority of ADAS efforts reported in the literature have focused on enhancing vehicle capabilities with a view toward achieving fully automated driving. However, all currently available ADAS applications require a human driver to be alert and ready to take control if needed. Partially automated driving introduces new complexities to human interactions with cars and can even increase collision risk. A better understanding of drivers’ trust in automation may help reduce these complexities.
Techniques for measuring trust in automated systems include use of surveys to collect self-reports of perceptions of trust; use of physiological measures such as eye-tracking, EEG, and galvanic skin response; and, in the case of autonomous driving, use of vehicle telemetry data such as location, turning, braking, acceleration, and lane keeping. Although vehicle telemetry data is the most ecologically valid way of collecting data related to trust, there have been relatively few reports of vehicle telemetry data being used to assess trust in ADAS. Needed is research on the feasibility of using vehicle telemetry data to understand the driver's state of mind.
In addition, although some ADAS technologies provide other types of driver assistance—such as a tone or beep to alert drivers when their hands are not on the steering wheel—these systems are not personalized to individual drivers. Needed are adaptive technologies that can help drivers of autonomous vehicles avoid crashes based on multiple real-time data streams.
The objectives of this research are to (1) identify sensory information and vehicle telemetry data needed to increase driving safety; (2) propose an architecture for an adaptive assistant that can provide verbal guidance to drivers of autonomous vehicles; (3) develop multi-stage sensor fusion models to provide adaptive assistance for drivers; (4) evaluate the models using in-field and simulated data; and (5) suggest future work on adaptive assistance for drivers of autonomous vehicles.
3 Architecture for Adaptive Autonomous Driving Advisor
The overall goal of this research is to develop an Adaptive Autonomous Driving Advisor (AADA) that can provide adaptive speech-based advice as needed, such as telling the driver to speed up, slow down, or stop. AADA will be built upon an existing data acquisition and measurement system called Ergoneers. Ergoneers is a custom-built PC-based system that includes multiple communication ports as well as CAN bus ports. Sensory devices such as GPS, Tobii eye tracker, Mobileye Camera, and the Tesla CAN bus cable can be integrated for acquisition of both vehicle data and driver physical data such as eye and head movements.
AADA will be based on the factors identified in the Introduction: Speed, Road Conditions, Distraction, and Trust. To acquire this information, data from Tesla's CAN bus, GPS, Tobii eye tracker, and Mobileye camera will be utilized and integrated to trigger the AADA to provide appropriate voice instructions via a multistage modeling approach. Figure
1 outlines which sensors will provide which kinds of measurements and information and how the information will be fused via Stage I and Stage II models. Stage I will include four models: a linear model to measure the speed condition (i.e., speeding, normal, or below speed limit), two weighted utility models to predict road conditions and driver distraction, and an
Artificial Neural Network/Support Vector Machine/Random Forest (ANN/SVM/RF) model to predict trust. Stage II will use an ANN/SVM/RF model to integrate the outputs from the Stage I models and trigger appropriate voice instructions.
In this paper, we focus on the development of the adaptive ANN/SVM/RF models used in Stages I and II. The ultimate goal is to develop an adaptive sensor fusion algorithm that can improve its performance as the amount of data increases, since the Adaptive Autonomous Driving Advisor can be used by the same individual over time.
Artificial neural networks (ANN) and
support vector machines (SVM) were used to develop the machine learning algorithms due to their effectiveness and computational efficiency when handling regressions with high-dimensional, non-linear, covariant inputs [
49–
52]. The
Random Forest (RF) method was also used due to its reputation for robustness to real-world data.
Artificial neural networks are widely used supervised learning algorithms that can approximate highly non-linear relationships [
53,
54]. However, ANN models have some drawbacks. Most notably, they require considerable training time to make accurate predictions, and they often generalize poorly to unseen ("unknown") data due to their stochastic nature [
55,
56]. Therefore, a deterministic non-linear regression method may be preferred when limited training data are available.
Support vector machines, another family of supervised learning models used in classification and regression analyses, are deterministic [
57,
58]. The support vector machine is intended to be a robust tool for classification and regression in noisy, complex domains. The two key features of support vector machines are generalization theory, which leads to a principled way to choose a hypothesis; and kernel functions, which introduce non-linearity in the hypothesis space without explicitly requiring a non-linear algorithm [
59].
A principal difference between SVMs and ANNs lies in risk minimization mechanics [
60–
62]: SVMs employ the
structural risk minimization (SRM) principle to minimize an upper bound on the expected risk, whereas ANNs apply traditional
empirical risk minimization (ERM) to training data. In several fields [
49,
55,
56,
60–
65], SVM models are more robust and deterministic than ANNs while SVM predictions are comparable to ANN results. However, SVM model accuracy levels depend heavily on the experimental data used.
Random Forest is an ensemble machine learning method consisting of a collection of individual decision trees; the final prediction is obtained by a majority vote (or average, for regression) over the trees' individual predictions [
50,
51]. Decision trees split on features to create decision boundaries, commonly using the Gini impurity measure as the criterion for optimally splitting nodes. Each tree in the forest considers a random subset of the features, leading to different best splits and thus distinct trees with their respective predictions. Random Forests inherently perform feature selection, and aggregating votes across all the decision trees reduces overfitting on the training set. In addition, RF methods require relatively little configuration to obtain high accuracy.
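As an illustration of the trade-offs discussed above, the following sketch fits the three model families to the same synthetic non-linear regression task using scikit-learn. It is a Python stand-in for the MATLAB implementations used in this study; the data and hyperparameters are illustrative assumptions.

```python
# Illustrative comparison (not the authors' MATLAB implementation) of the
# three model families on the same small, non-linear regression task.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 3))        # three predictor attributes
y = np.sin(3 * X[:, 0]) + X[:, 1] * X[:, 2]  # synthetic non-linear target

models = {
    "ANN (ERM, stochastic)": MLPRegressor(hidden_layer_sizes=(8,),
                                          max_iter=5000, random_state=0),
    "SVM (SRM, deterministic)": SVR(kernel="rbf", C=10.0, gamma="scale"),
    "Random Forest (ensemble)": RandomForestRegressor(n_estimators=200,
                                                      random_state=0),
}
for name, model in models.items():
    r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"{name}: mean CV R^2 = {r2:.3f}")
```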
4 Stage I Model Development: Predicting Trust in Automation
The focus of the Stage I model development process was on developing an ANN/SVM/RF model for using vehicle data to predict a user's trust in automation. Hoff & Bashir propose a framework consisting of three types of trust: Dispositional Trust, Learned Trust, and Situational Trust [
21]. Madison et al. describe how Hoff & Bashir's framework might be applied in the context of driving automation [
19]. Dispositional trust considers user characteristics such as age, personality, tendency to take risks, and attitudes toward automation. Learned trust is trust based on past experience with a specific system. Situational trust varies based on the external environment and the internal state of the driver. For example, situational trust in an ADAS might vary based on the driver's perception of the vehicle's ability to perform under certain driving scenarios. As part of the modeling process, we also conducted experiments in realistic driving conditions with a Tesla Model X to assess the applicability of Hoff & Bashir's framework within the context of driving automation.
4.1 Experiment Setup
In June and July 2021, nine subjects participated in the designed experiments. Subjects were males between the ages of 18 and 22 with no prior experience with Tesla Autopilot. The Tesla vehicle was a Model X running software version 2021.4.18. Each subject completed three drives along the same route. For the first two drives, one had to be Manual and the other Autopilot; the sequence was left up to the driver. For the third drive, the driver could choose either Autopilot or Manual mode. While in Autopilot mode, subjects could disengage and re-engage the Autopilot if desired; the Manual mode was manual only. The manual driving mode was included to create a baseline for individual driving performance in the Tesla, providing a way to compare Autopilot-plus-human-driver performance to human driver performance alone.
The experiment tasks were to (1) complete preliminary individual attribute surveys, including the Trust Of Automated Systems Test (TOAST), before driving; (2) drive the Tesla around a designated loop-shaped route three times (Figure
2) using one of three modes (Manual, Autopilot, and Driver's Preference) each time; (3) complete surveys at the end of each loop, including the
Multi-Dimensional Measure of Trust (MDMT); and (4) complete a post drive questionnaire at the end of the drive. The post-drive questionnaire asked participants to rate how much they trusted the Autopilot feature, which was considered to indicate trust in automation.
Figure
2 shows the learned trust associated with the three different drives and situational trust along the driving path for several different driving situations (downhill, straight line, turn, and curve). The goal was to use vehicle data about driving behavior under different drives and situations to model a driver's level of trust in automation. Results from TOAST were used to assess dispositional trust; results from the MDMT and the post-drive questionnaire were used to assess learned trust; and comparisons of driving data under different road conditions were used to assess situational trust.
4.2 Modeling Process
The modeling process was as follows:
•
Identify which of the self-report trust measures best indicate Trust in Automation.
•
Identify vehicle data that strongly correlate with the Trust in Automation measures.
•
Fit the data into a distribution and generate data for modeling and testing.
•
Develop and evaluate a model to predict Trust in Automation based on the identified effective attributes.
•
Further evaluate model accuracy using field data and other tests.
Following are the experiments, analysis, and modeling efforts based on the procedure described above.
Step 1. Identify which of the self-report trust measures best indicate Trust in Automation. To identify which self-report measures best indicate Trust in Automation, we first computed correlation coefficients between the Multi-Dimensional Measure of Trust (MDMT) survey administered at the completion of each loop and the post-drive questionnaire. The MDMT measures 16 attributes of trust [
39]. The scale is divided into two major constructs: capacity trust and moral trust. Capacity trust has two subscales: reliable and capable. The reliable subscale comprises four attributes: reliable, predictable, someone you can count on, and consistent. The capable subscale comprises capable, skilled, competent, and meticulous. Moral trust has two subscales: ethical and sincere. The ethical subscale comprises ethical, respectable, principled, and integrity. The sincere subscale comprises sincere, genuine, candid, and authentic. The breakdown and alphas for each dimension can be found in Ullman and Malle [
48]. The post-drive questionnaire asked subjects to rate their Trust in Automation using a Likert scale.
Table
1 shows the correlations between the post-drive trust rating and each of the 16 MDMT attributes. Note that the MDMT uses a 7-point Likert scale and the post-drive questionnaire uses a 5-point Likert scale. The 16 MDMT attributes can be grouped into four sub-scales:
Reliable, Capable, Ethical, and
Sincere [
39]. Table
2 shows the correlations between the four subscale values and the trust rating from the post-drive questionnaire.
Results suggest that both the individual attribute Reliable and the Reliable subscale assessed at the evaluation points are moderately correlated with the self-assessment of Trust in Automation in the post-drive questionnaire. The Ethical subscale also showed some degree of correlation, but since the correlation coefficient for Reliable was higher than for Ethical, only the data for Reliable (from the evaluation points and post-drive) were used as the measure of Trust in Automation (the dependent variable) in formulating the prediction model.
Step 2. Identify vehicle data that strongly correlate with the Trust in Automation measures. Vehicle data were collected in real time from the Tesla Model X CAN bus via the OBD port of the custom-built Ergoneers data acquisition system. Each experiment ran for about two hours, yielding approximately one hour of vehicle data. Each data file contains information for 143 different attributes and approximately 1.5 million vehicle-related records. Processing the data to identify when each major event happened is challenging. We first used a bird's-eye-view approach to recognize major events, such as the start and stop times of each drive and when the Autopilot was On or Off. Figure
3 shows GPS data of the driving route and a bird's eye view of the data for autopilot, speed, braking, and distance to the lane line to the left of the vehicle, which was used to track the vehicle's lane keeping performance.
This view can reveal a variety of information, including when an event (such as enabling Autopilot for the drive) starts and stops, and the length, frequency, and variation of each event. For example, the Autopilot attribute has a value of 3 when Autopilot is On and 2 when it is Off; from the Autopilot plot in Figure
4, we can determine that there were three different drives, that the first and third drives were in autopilot mode, and that the first drive lasted about 15 minutes. From the Distance to Left plot, we can see how much a driver deviates from the lane line to the left of the vehicle. By examining the band width and shape, we can determine if the vehicle is driving in a straight line. If a driver does not have good control of the vehicle, the band will fluctuate a great deal. For example, in the plot shown in Figure
4, the driver consistently shifts to the right.
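This segmentation of the raw signal can be automated. The sketch below is a hypothetical helper that recovers Autopilot-On intervals using the encoding described above (3 = On, 2 = Off); the column names time_s and autopilot are placeholders for the logged CAN attributes.

```python
# Sketch: recover drive segments from the Autopilot status signal
# (3 = On, 2 = Off, per the encoding described in the text).
import pandas as pd

def autopilot_segments(df: pd.DataFrame) -> pd.DataFrame:
    """Return start/stop times and durations of Autopilot-On intervals."""
    on = (df["autopilot"] == 3).astype(int)
    edges = on.diff().fillna(on.iloc[0])     # +1 = engage, -1 = disengage
    starts = df.loc[edges == 1, "time_s"].tolist()
    stops = df.loc[edges == -1, "time_s"].tolist()
    if len(stops) < len(starts):             # still engaged at end of log
        stops.append(df["time_s"].iloc[-1])
    return pd.DataFrame({
        "start_s": starts,
        "stop_s": stops,
        "duration_min": [(b - a) / 60 for a, b in zip(starts, stops)],
    })
```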
To identify the vehicle data that correlate strongly with the Trust in Automation performance measure, correlation coefficients were calculated to determine input candidates for the prediction model. There were three driving modes (Manual, Autopilot, Driver's Choice) and four driving situations (straight line, downhill, curve, turn) recorded for each drive. Although the Tesla CAN bus broadcast data for 143 attributes, not all of them were considered useful for modeling purposes. Therefore, for each situation (e.g., downhill driving), only six attributes were collected: number of braking events, average braking time, average speed, standard deviation of speed, average distance to the left lane line, and standard deviation of distance to the left lane line. The resulting 24 attributes (delineated in Table
3) were computed for each driving attempt in this study.
There were six viable datasets, each with two attempts in Autopilot mode, for a total of 12 Autopilot attempts. For each driving attempt, the correlation coefficient between each of the 24 attributes and the subject's self-assessment of Trust in Automation was calculated. Table
4 shows the 7 of the 24 vehicle attributes that yielded moderate to strong correlations with the self-assessment of Trust in Automation. For development of the prediction model, we used data from the top three attributes as inputs: DSa (Downhill Speed average), TLs (Turn Length standard deviation), and CSs (Curve Speed standard deviation). These attributes all correlated strongly with the dependent variable of Trust in Automation (based on the MDMT Reliable sub-scale).
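The following sketch illustrates this screening step in Python, assuming a hypothetical table with one row per driving attempt, columns for the 24 per-situation vehicle attributes, and a TIA column holding the self-assessed trust rating.

```python
# Sketch of the Step 2 correlation screen: rank per-attempt vehicle
# attributes by the strength of their Pearson correlation with the
# self-assessed Trust in Automation score. Column names are assumptions.
import pandas as pd

def rank_attributes(attempts: pd.DataFrame, target: str = "TIA",
                    top_k: int = 3) -> pd.Series:
    """Return the top_k attributes most correlated (by |r|) with the target."""
    corrs = attempts.drop(columns=[target]).corrwith(attempts[target])
    ranked = corrs.reindex(corrs.abs().sort_values(ascending=False).index)
    return ranked.head(top_k)

# e.g. rank_attributes(df) might surface DSa, TLs, and CSs as top inputs
```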
Step 3. Fit the data into a distribution and generate data for modeling and testing. Because the number of viable data sets was limited, we first fit the data identified in Step 2 to distributions and then generated the data needed for modeling. The vehicle data captured while the driver was in Autopilot mode were fit to suitable distributions and validated using the Lilliefors test. The Lilliefors test, a variant of the Kolmogorov–Smirnov test, is appropriate for small data sets (fewer than 25 samples). Following is a summary of the process for each attribute of interest and performance measure.
(a)
Trust in Automation: The top two distribution fitting candidates were Lognormal and Normal distributions. Since the lower and upper bounds range from 1 to 7, the suggested distribution was further adjusted to Normal (4,0.95) which covers approximately 99.73% of the population.
(b)
Downhill Speed Average (DSa): The top distribution fitting candidate was Normal (44.40,0.62).
(c)
Turn Length standard deviation (TLs): The top distribution candidate was Pareto (1.3657,0.032478).
(d)
Curve Speed standard deviation (CSs): The top two distribution candidates were ExtValue (3.8081,2.6365) and Normal (5.2911,3.3464). We chose the normal distribution for this study because we believe speed has variations caused by the subject as well as other factors such as road conditions.
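A minimal sketch of this fit-validate-generate procedure for a single attribute is shown below, using SciPy for distribution fitting and the Lilliefors implementation in statsmodels. The sample values and resulting parameters are illustrative, not the study's data.

```python
# Sketch of Step 3 for one attribute: fit a candidate distribution to a
# small sample, check the fit with the Lilliefors test, then generate
# synthetic data for model training. Values below are illustrative.
import numpy as np
from scipy import stats
from statsmodels.stats.diagnostic import lilliefors

observed_dsa = np.array([43.8, 44.1, 44.9, 44.3, 45.0, 44.6,
                         43.9, 44.5, 44.2, 44.7, 44.0, 44.4])  # example sample

mu, sigma = stats.norm.fit(observed_dsa)        # estimate Normal parameters
ks_stat, p_value = lilliefors(observed_dsa, dist="norm")
if p_value > 0.05:                              # cannot reject normality
    synthetic_dsa = stats.norm(mu, sigma).rvs(size=1000, random_state=0)
```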
Step 4. Develop and evaluate a model to predict Trust in Automation based on the identified effective attributes. Nonlinear machine learning modeling techniques (ANN, SVM, and RF) were applied to model and predict a user's Trust in Automation from vehicle data. Model effectiveness was then evaluated by randomly dividing the data into testing and evaluation sets. For the ANN model, we applied a 3-2-1 topology, with DSa, TLs, and CSs as the three input nodes, a two-node hidden layer, and one output node, Trust in Automation (TIA). Table 5 shows the results from the TIA-ANN, TIA-SVM, and TIA-RF models. The prediction accuracies for TIA-ANN and TIA-SVM were very close, but the accuracy of TIA-RF was significantly lower. Accuracy did not increase much as the sample size increased to 1,000 data sets. Accuracy is defined as 1 − ((TP_i − TT_i)/TT_i), where TP_i is the predicted trust value for data sample i and TT_i is the target trust value for sample i. The accuracy value in the tables below is the average accuracy over a given set of samples (such as 125 data sets).
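The sketch below illustrates this Stage I pipeline under stated assumptions: inputs are drawn from the Step 3 distributions, a simple linear link between inputs and trust is assumed for illustration (the pairing of generated inputs with trust targets is not specified above), and the accuracy measure uses an absolute error so that over- and under-predictions both reduce accuracy. scikit-learn's MLPRegressor stands in for the MATLAB ANN.

```python
# Sketch of the Stage I trust model: a 3-2-1 network (inputs DSa, TLs, CSs;
# one two-node hidden layer; output TIA) plus the accuracy measure defined
# above. The input-to-trust link and Pareto parameterization are assumptions.
import numpy as np
from scipy import stats
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n = 500
dsa = stats.norm(44.40, 0.62).rvs(n, random_state=1)               # DSa
tls = stats.pareto(1.3657, scale=0.032478).rvs(n, random_state=2)  # TLs
css = stats.norm(5.2911, 3.3464).rvs(n, random_state=3)            # CSs

z = lambda v: (v - v.mean()) / v.std()
tia = np.clip(4 + 0.6 * z(dsa) - 0.4 * z(tls) - 0.4 * z(css)       # assumed link
              + rng.normal(0, 0.3, n), 1, 7)

X = np.column_stack([dsa, tls, css])
ann = MLPRegressor(hidden_layer_sizes=(2,), activation="tanh",     # 3-2-1 topology
                   solver="lbfgs", max_iter=5000, random_state=0).fit(X, tia)

def tia_accuracy(tp: np.ndarray, tt: np.ndarray) -> float:
    """Average of 1 - |TP_i - TT_i|/TT_i over the evaluation samples."""
    return float(np.mean(1.0 - np.abs(tp - tt) / tt))

print(f"average accuracy: {tia_accuracy(ann.predict(X), tia):.3f}")
```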
Step 5. Further evaluate model accuracy using field data and other tests. The prediction accuracy of the designed models was further evaluated using other available field data and tests of robustness.
Manual and Driver Preference modes. The above models were built using data generated while subjects were driving in Autopilot mode, because the Autopilot data are likely different from the data collected in the Manual and Driver Preference modes (denoted M&D). For example, there could be a learning effect on driver performance and on perception of vehicle capability when using Autopilot a second time. However, the M&D data were used to evaluate the noise tolerance (robustness) of the developed models. Table 6 shows the testing results, which suggest the models are noise tolerant. In particular, the TIA-ANN model performed better than the TIA-SVM and TIA-RF models.
Noise Tolerance and Sensitivity Analysis. To test the robustness of the developed models, we added noise to the data to see how the models would respond in terms of accuracy. Noise was added by multiplying all the original values in the data set by a fixed factor corresponding to a 5%, 10%, or 15% deviation. For example, a 5% noise level was added using the equation Value_5% noise = (original value) × 1.05; that is, the noise values deviate 5% from the original values. This noise was applied to every data point, so all the data used for training, testing, and validation had the same noise level. Table 7 shows the results for the two models. Results suggest only a 1% to 2% difference in accuracy when the noise level increases to 15%. In one case, accuracy increased as the noise level increased, suggesting that the introduced noise happened to fit the underlying data distribution.
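A sketch of this noise-injection procedure, under the multiplicative scheme described above:

```python
# Sketch of the noise-tolerance test: scale every value by a fixed factor
# (1.05, 1.10, 1.15) and re-evaluate model accuracy at each noise level.
import numpy as np

def add_noise(data: np.ndarray, noise_pct: float) -> np.ndarray:
    """Deviate all values by noise_pct percent, e.g. 5 -> value * 1.05."""
    return data * (1.0 + noise_pct / 100.0)

X = np.random.default_rng(0).normal(44.4, 0.62, size=(500, 3))  # example data
for pct in (5, 10, 15):
    X_noisy = add_noise(X, pct)
    # ... retrain and re-evaluate the TIA models on X_noisy ...
```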
Use of additional vehicle attributes. We explored further improving model accuracy by adding additional driving attributes into the modeling process. For example, Table
4 shows that the correlation between the average distance to the
left line (TLa) and Trust in Automation was −0.79. Data from Autopilot mode on TLa,
CSa (Curve-Speed Average), TSa (Turn-Speed Average), and
TBl (Turn-Brake Length) were fitted into distributions and validated using the Lilliefors test. Statistical testing results suggested Expon(0.0172, −0.1793), ExtValueMin(41.6520, 3.8349), Pareto(0.96522, 2.9800), and Uniform(23.4088, 31.3075) were the top candidates for these attributes, respectively. Additional data sets were generated based on those distributions. Table
8 shows that when going from three to four attributes, the accuracy of the ANN model increased slightly. However, ANN accuracy did not continue to increase when using five, six, or seven attributes. Also, SVM accuracy decreased when going from three to four attributes. Because our ultimate goal is to process vehicle attribute data online in real time, we chose to use three attributes to reduce computational complexity.
4.3 Stage I Findings and Discussion
Initially, we looked to time spent on Autopilot and the number of braking events as indicators of subjects' trust in automation. However, these attributes did not strongly correlate with the individual self-assessments of trust in automation. This may be because only 20% of the driving path consisted of downhill, curve, and turn situations (see Figure
3(a)). The trust signal was stronger when focusing only on sections of the driving path that require a greater cognitive load, such as downhills, curves, and turns. Results suggest that the Downhill Speed Average (DSa), Turn Length standard deviation (TLs), and Curve Speed standard deviation (CSs) attributes yield stronger correlation coefficients with the self-assessment of trust in automation. These attributes can potentially be used to continuously assess a subject's trust in automation as the number of driving attempts increases.
Of the three machine learning models developed, ANN and SVM yielded better accuracy than RF under a variety of conditions, including added noise, sample size variations, and tests with field data from the manual and driver preference modes.
The developed models (ANN, SVM, and RF) appear to be robust and fault tolerant. Even with added noise of up to 15%, accuracy was reduced by only 2% to 3%; in some cases, accuracy increased by 2% to 3%. When tested with data not previously seen by the models (from the Manual and Driver Preference modes), the ANN model (0.72) performed better than SVM (0.66) and much better than RF (0.33).
The models need appropriate sample sizes. If the sample size is too small (fewer than 25), the models cannot find a good fit, and accuracy suffers. However, as sample size increases, model accuracy may not increase in proportion to the amount of data. Future research could include determining the right sample size when developing machine learning models.
Accuracy did not improve by more than 1% when the number of vehicle attributes (i.e., the number of input nodes) increased. This may be because the top three attributes had strong correlations with the Trust in Automation variable, whereas the other four attributes had relatively modest correlation coefficients with the self-assessment of trust in automation. Future research may include examining the extent to which these attributes are independent of one another, in addition to further study of their correlation with the performance measure (trust in automation).
For purposes of providing personalized driving assistance in real time, using fewer attributes is advantageous because computational requirements are reduced. Ultimately, vehicle attribute data may replace self-assessments of trust in the vehicle's capability (learned trust). Future directions may include using dispositional trust in place of learned trust. A machine learning model such as an ANN can be developed as a base model reflecting individual differences; the model can then improve itself as the individual drives more often and more data are generated, becoming a personalized model that adapts and grows smarter over time.
5 Stage II Model Development: Adaptive Driving Assistant Model (ADAM)
The focus of the Stage II model development process was on developing
Adaptive Driving Assistant Models (ADAM) based on ANN/SVM/RF techniques that can integrate the outputs from the four Stage I models (in Figure
1) and trigger appropriate voice instructions.
5.1 Model Development
The ADAM models were developed in three steps: (1) classify risk factors into categories and levels; (2) identify sensory device(s) for use in detecting risk factors; and (3) develop sensor fusion algorithms to integrate sensory data.
Step 1: Classify risk factors into categories and levels. Based on the literature review [
2–
5], four categories of risk factors are considered:
Speed, Distraction, Road Conditions, and
Trust. Within each factor, there are three levels of severity:
Over, Normal, and
Under. For example, if the factor is Speed, the levels would be
Over Speed, Normal, and
Under Speed. Therefore, there are 81 possible combinations (3 × 3 × 3 × 3) to be considered in this study. In addition, there are five possible types of advice that the system can provide:
Slow down, Speed up, Brake, Stop, and
Nothing.
Step 2. Identify sensory devices for use in detecting risk factors. Sensory devices were designated for monitoring each factor. Some of the data are from Tesla's CAN bus and some are from external sensory devices. Table
9 shows devices used for each factor.
Step 3. Develop sensor fusion algorithms to integrate sensory data. For the ANN model, a 4 × 3 × 1 topology was used, representing four inputs, three hidden nodes, and one output. The four inputs are speed, distraction, road conditions, and trust; the one output is the type of guidance to provide to the driver. Between the inputs and the output is a hidden layer that connects input nodes to the output node via weighted links whose activation functions enable non-linear fitting. For the SVM model, a regression model built upon the speed, distraction, road condition, and trust data sets is employed to predict the type of guidance to provide to the driver.
The ANN model was built using the MATLAB Neural Network Toolbox trainlm function, which implements the Levenberg-Marquardt backpropagation algorithm.
The SVM regression model was trained and cross-validated using the fitrsvm function in MATLAB's Statistics and Machine Learning Toolbox. fitrsvm maps the predictor data using Radial Basis Function (RBF) kernels. An SVM model has two essential parameters: cost (c) and gamma (g). Cost is the tolerance of training error, which determines the generalizability of the model. Gamma is a parameter of the RBF kernel (inversely related to the kernel width) and influences the number of support vectors, which affects training and prediction speed. To train an efficient model that neither overfits nor underfits, the values of c and g must be kept within an appropriate range. Hence, grid search and cross-validation (CV) were utilized to find the best c and g automatically. To initiate the grid search, a set of candidate c and g values is designated; based on the selected scoring standard, the best settings are determined after exhausting all combinations of parameters. To prevent the model from becoming too complicated, which may lead to overfitting, cross-validation is implemented simultaneously with the grid search: the training set is divided randomly into several subsets, and in each round one subset serves as the validation set while the others are used for training. These two mechanisms (grid search and cross-validation) were combined to tune the parameters, improving training efficiency and model performance.
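The following sketch reproduces this grid-search-with-cross-validation procedure using scikit-learn's RBF-kernel SVR in place of MATLAB's fitrsvm; the parameter grids and synthetic training data are illustrative assumptions.

```python
# Sketch of grid search over (C, gamma) with k-fold cross-validation for an
# RBF-kernel SVM regressor, standing in for MATLAB's fitrsvm procedure.
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X_train = rng.uniform(1, 3, size=(500, 4))   # speed, distraction, road, trust
y_train = X_train[:, :3].sum(axis=1) + 0.5 * X_train[:, 3]  # illustrative target

param_grid = {
    "C": np.logspace(-2, 3, 6),       # cost: tolerance of training error
    "gamma": np.logspace(-4, 1, 6),   # RBF kernel parameter
}
search = GridSearchCV(SVR(kernel="rbf"), param_grid, cv=5,
                      scoring="neg_mean_squared_error")
search.fit(X_train, y_train)
best_svm = search.best_estimator_     # c and g chosen to avoid over/underfitting
```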
Cross-validation was also utilized in training and tuning the Random Forest model. The maximum depth of the trees was determined after training. Setting a maximum depth for the decision trees prunes the leaves, which reduces overfitting and can remove the influence of noise.
The procedure for producing the output node values for the ANN and SVM models was as follows:
(1)
For each input type, assign relative weight for each level of severity.
(2)
Calculate the accumulated weights for each possible outcome of the output node.
(3)
Fit the weights of each outcome into a distribution.
(4)
Redistribute the data into groups based on the number of possible outcomes of the output node, so that the number of data groups in the histogram equals the number of possible outcomes. Each group of data represents the probability of one possible outcome.
Table
10 shows the weights assigned to each factor.
Since four factors are considered and each factor has three levels, there are 81 possible combinations. After assigning weights to the severity of each factor, we can fit the overall value of each possible outcome into a distribution. At the same time, we can arrange the number of data groups into a histogram based on the number of predetermined outcomes, which is five in this case. As shown in Figure
5, a normal distribution with mean 7 and standard deviation 1.4811 is a relatively good fit and happens to form five different groups with boundary values of 4.9, 6.35, 7.7, and 9.1. We can further normalize each outcome value between 0 and 1 using these group boundary values. The outcome values are calculated assuming the Speed, Distraction, and Road Condition factors are equally weighted, plus half the weight for Trust (since Trust is negatively correlated with the outcomes).
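The sketch below illustrates this output-node construction: enumerating the 81 factor-level combinations, scoring each with equal weights for Speed, Distraction, and Road Condition plus half weight for Trust, and binning the scores at the group boundaries given above. The per-level severity weights and the bin-to-advice mapping are illustrative placeholders, not the actual values from Table 10.

```python
# Sketch of the output-node construction: score all 81 factor-level
# combinations and bin the scores into the five advice classes.
from itertools import product
import numpy as np

LEVELS = {"Under": 1.0, "Normal": 2.0, "Over": 3.0}  # hypothetical weights
BOUNDARIES = [4.9, 6.35, 7.7, 9.1]                   # group boundaries (Figure 5)
ADVICE = ["Nothing", "Speed up", "Brake", "Slow down", "Stop"]  # assumed order

scores = []
for speed, distraction, road, trust in product(LEVELS.values(), repeat=4):
    # Speed, Distraction, Road Condition equally weighted; Trust at half
    # weight because it is negatively correlated with the outcome.
    scores.append(speed + distraction + road + 0.5 * trust)

bins = np.digitize(scores, BOUNDARIES)     # maps each of 81 scores to 0..4
advice = [ADVICE[b] for b in bins]         # one advice class per combination
```

With these weights, the 81 scores have mean 7 and standard deviation of about 1.47, consistent with the Normal(7, 1.4811) fit reported above.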
Figure
6 shows the data distribution, which is calculated assuming the outcome is the summation of four equally weighted factors (Speed, Road Condition, Distraction, and Trust). The distribution of the 81 possible outcomes suggests a normal distribution N(8, 1.64), which covers the range 4 to 12 with 97.5% of the population. After consolidating the nine groups of data into five groups, the data distribution resembles a uniform distribution U(0.95, 5.05). Figure
7 shows the distribution after the groups were consolidated.
5.2 Model Evaluation
To evaluate the Stage II models, we used both simulated data and field data to represent the outputs from the four Stage I models. Two approaches were used—comprehensive and historical—based on the data source. In the historical approach, data from external sources are used to represent past and current situations. In the comprehensive approach, data are generated to simulate a broad spectrum of events. This approach can include rare events, allowing possible future events to be represented. Using these two approaches together allows us to thoroughly evaluate and assess the robustness of the proposed models.
Comprehensive. For the comprehensive approach, we fit each factor value into a distribution, and then generate more data based on the underlying distribution with revised parameters as needed. Figures
8,
9, and
10 show possible distribution candidates for fitting existing data and revision of the underlying distribution to generate additional data. These are
Uniform (1,3), Normal (2,0.3), and
Triangular (1,2,3) distributions.
Tables
11,
12, and
13 show the accuracy of the ADAM sensor fusion algorithms based on the ANN, SVM, and RF methods under different distributions and numbers of data sets. Results suggest that (1) accuracy increases as sample size increases; (2) there is not much increase in accuracy after the sample size reaches 500 data sets; (3) the algorithms perform slightly better when the data follow a normal distribution; and (4) ADAM-ANN provides more stable (less variable) accuracy. With smaller samples, ADAM-ANN performs slightly better than ADAM-SVM, but ADAM-RF performs much worse.
Historical. In this approach, we use historical data collected by government agencies, insurance companies, and third-party research foundations representing events in the U.S. to generate Speed, Distracted Driving, and Road Condition data for the input nodes of the ADAM-ANN, ADAM-SVM, and ADAM-RF algorithms. To generate data on the Trust in Automation factor, we rely on research findings from Dikmen & Burns [
66].
Dikmen and Burns [
66] surveyed Tesla drivers about their confidence in Autopilot and common features. Overall, participants reported high levels of trust in Autopilot (M = 4.02, SD = 0.65) and moderate levels of initial trust (M = 2.83, SD = 0.82) on 5-point Likert scales. Trust in Autopilot was positively correlated with frequency of Autopilot use, self-rated knowledge about Autopilot, ease of learning, and usefulness of Autopilot displays.
Table
14 shows categories of factors and the frequency with which they affect driving, according to a 2016 survey by AAA [
67]:
Overall driving distraction categories include use of electronics (talking on the phone, texting, reading emails), fatigue, and driving while impaired. Assuming these three categories are independent of one another, the percentages can be estimated as 18% for driving under the influence, 20% for distracted driving, and 62% for driving in a tired or normal condition (because driving while tired does not necessarily cause accidents, this group was combined with the normal driving condition group).
The AAA survey [
67] also summarized how drivers behave when speeding:
•
Nearly half of all drivers (48 percent) report going 15 mph over the speed limit on a freeway in the past month, while 15 percent admit doing so fairly often or regularly.
•
About 45 percent of drivers report going 10 mph over the speed limit on a residential street in the past 30 days, and 11 percent admit doing so fairly often or regularly.
Based on data from The Washington Post [
68] and Caring.com [
69], 20% of drivers who are age 65 or above tend to drive under the speed limit. Table
15 summarizes these statistics and classifies the four categories into three groups with associated percentages: Group A, over the speed limit, 19%; Group B, under the speed limit, 18.3%; and Group C, within the speed limit, 62.7%.
According to ten-year averages from 2007 to 2016 analyzed by Booz Allen-Hamilton based on NHTSA data [
5], on average, over 5,891,000 vehicle crashes occur each year. Approximately 21% of these crashes—nearly 1,235,000—are weather-related. Weather-related crashes are defined as those crashes that occur in adverse weather (i.e., rain, sleet, snow, fog, severe crosswinds, or blowing snow/sand/debris) or on slick pavement (i.e., wet pavement, snowy/slushy pavement, or icy pavement). The vast majority of weather-related crashes happen on wet pavement (70%) and during rainfall (46%). A much smaller percentage of weather-related crashes occur during winter conditions: 18% during snow or sleet, 13% on icy pavement, and 16% on snowy or slushy pavement. Only 3% happen in the presence of fog [
5]. Table
16 summarizes these statistics about accidents caused by road and weather conditions.
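Under stated simplifying assumptions (independence of the categories, and treating the 21% weather-related crash share as the probability of adverse road conditions), input data can be sampled from these historical percentages as in the sketch below; the Trust values follow the Dikmen & Burns statistics [66].

```python
# Sketch of generating Stage II input data from the historical percentages
# summarized in Tables 14-16. The adverse-road probability and category
# labels are simplifying assumptions for illustration.
import numpy as np

rng = np.random.default_rng(0)
N = 1000  # number of synthetic driving scenarios

speed = rng.choice(["Over", "Under", "Normal"], size=N,
                   p=[0.19, 0.183, 0.627])        # Table 15 groups A/B/C
distraction = rng.choice(["Impaired", "Distracted", "Normal"], size=N,
                         p=[0.18, 0.20, 0.62])    # AAA survey estimates
road = rng.choice(["Adverse", "Normal"], size=N,
                  p=[0.21, 0.79])                 # weather-related share [5]
trust = np.clip(rng.normal(4.02, 0.65, size=N), 1, 5)  # Dikmen & Burns [66]
```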
Based on the above historical statistics for Speeding, Distracted Driving, and Road Conditions, data sets were generated for modeling and evaluating the ADAM-ANN and ADAM-SVM sensor fusion algorithms. Table 17 shows that both algorithms performed well, with accuracy ranging from 87% to 95%.
5.3 Stage II: Findings and Discussion
In Stage II, we designed and evaluated three machine learning models (ANN, SVM, and RF) for providing driving advice to drivers of autonomous vehicles using data generated from historical statistics and from fitted distributions. For the models based on historical statistics, accuracy ranged from 90% to 95% for ANN, 87% to 94% for SVM, and 85% to 95% for RF. For the models based on fitted distributions, accuracy ranged from 85% to 89% for ANN, 80% to 92% for SVM, and 59% to 73% for RF. These results suggest that (1) all three models perform better when using data generated from historical statistics than from fitted distributions, possibly because the fitted-distribution data have greater variation than the historical statistics; (2) the ANN and SVM models are a good fit for this application; (3) the ANN model seems most stable and adaptive, gradually improving its accuracy as sample size increases; and (4) the SVM model in one instance improved its accuracy faster than the other two models (Table
11, 250 data sets). Overall, the modeling methodology appears sound and yields good results. Future directions include (1) using a hybrid SVM and ANN modeling approach to improve model accuracy at different sample sizes; (2) development of a novel model based upon an ANN topology for real-time data processing; and (3) development of a plug-in portable hardware system that incorporates sensory devices and machine learning algorithms to provide real-time personalized voice advice to a driver.