Abstract
With the development of science and technology, many new human-machine interaction methods have appeared in cars, so improving interaction efficiency has become an important research topic. Our research focuses on the interaction between the car cockpit and the driver. To evaluate the usability of a car cockpit, we designed multiple sets of comparative experiments with different concurrent tasks. In the experiments, we collected binocular images of the front scene and car speed to calculate driving performance, and the driver's heart rate and eye movements to represent the driver's physiological state. For the front-scene analysis in particular, we simplified the feature point matching method and obtained fairly accurate object distance estimates. The experimental data showed that car speed was closer to the required speed in the speed control task than in the speed + direction control task or the speed + temperature control task; distance was regulated better in the distance control task than in the distance + temperature control task; the driver's heart rate was higher and fluctuated more during secondary tasks; and drivers diverted their visual attention from the road to in-car instruments more frequently during manual control than during voice control. These results indicate that when the task is more difficult or a secondary task interferes, driving performance decreases and the driver becomes more stressed. Manual control is more disruptive to driving performance than voice control, and it also takes more time. In conclusion, driving is safer and more effective with voice control than with manual control.
1 Introduction
With the development of the times, the rapid innovation of technology in human life has prompted humans to seek liberation from the constraints of their environment [2] and to enhance their ability to adapt to it with the help of external devices. As a result, various human-computer interaction technologies have emerged. Human-computer interaction technology occupies an important position in human society because it serves as the channel of information exchange between humans and machines.
In human-machine systems, our goals include safety, efficiency, comfort, etc. [1]. Therefore, improving interaction efficiency and reducing operational errors has become an important research topic. Ergonomics is precisely the science that studies the interaction among human, machine, and environment and their reasonable combination; its goal is to make the designed machine-environment system fit people's physiological and psychological characteristics, and to improve efficiency, safety, health, and comfort in production. In this paper, our research object is the cockpit of a car. The final purpose is to provide a usability evaluation of the car cockpit that can help companies produce more human-friendly cockpits.
To realize the usability evaluation, we need to explore how the driver's driving performance, behavioral response, and physiological state change under different traffic conditions. Driving performance is evaluated with a binocular camera system; the driver's behavioral response and physiological state are obtained from heart rate and respiration rate measuring equipment and a head-mounted eye tracking system.
The focus and innovation of our work lie in image stereo matching, whose result is used to obtain fairly accurate object distance estimates under different task scenarios. Current image stereo matching algorithms fall into two categories [3]: block matching algorithms based on grayscale [8] and feature point matching algorithms [5]. Of the two, feature point matching has the advantages of low computational cost, good robustness, and insensitivity to image deformation [4], so it is the method we use in this paper. A feature point matching algorithm mainly includes three steps: feature extraction, feature description, and feature matching [9]. It first extracts features from each image, generates feature descriptors from them, and finally matches the features of the two images according to the similarity of their descriptors [7].
2 Method
To realize the usability evaluation, we need to explore how the driver's driving performance, behavioral response, and physiological state change under different traffic conditions. After completing the experiment, in each group we obtain the following data: a sequence of images taken by the binocular camera, the driver's heart rate, eye movement track characteristics, and the car speed from GPS. These data are processed as follows:
2.1 Image Stereo Calibration and Rectification
In image measurement and machine vision applications, the process of solving the camera parameters (internal and external) is called camera calibration. Following Zhang [10], the relationship between a spatial point's coordinates in the world coordinate system and its coordinates in the pixel coordinate system is:

\( \left[ {u,v,1} \right]^{T} = sK_{3 \times 4} \left[ {\begin{array}{*{20}c} R & T \\ O & 1 \\ \end{array} } \right]\left[ {X_{w} ,Y_{w} ,Z_{w} ,1} \right]^{T} \)

Where \( \left( {u,v} \right) \) is the coordinate in the pixel coordinate system, \( \left( {X_{w} ,Y_{w} ,Z_{w} } \right) \) is the coordinate in the world coordinate system, \( K_{3 \times 4} \) is the internal parameter matrix containing five internal parameters, \( \left[ {\begin{array}{*{20}c} R & T \\ O & 1 \\ \end{array} } \right] \) is the external parameter matrix, and \( s = 1/Z_{c} \) is an unknown scale factor.
A perfectly aligned configuration is rare in a real stereo system, since the two cameras almost never have exactly coplanar, row-aligned image planes. The goal of stereo rectification [16] is to reproject the image planes of the two cameras so that they lie in exactly the same plane, with image rows aligned in a frontal parallel configuration.
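The paper does not detail its implementation of these steps. As an illustrative sketch only, calibration and rectification could be done with OpenCV as follows, assuming chessboard correspondences (objpoints, imgpoints_l, imgpoints_r) have already been detected in both cameras' images; the function names are standard OpenCV APIs, while the variable names are our assumptions.

```python
import cv2

def calibrate_and_rectify(objpoints, imgpoints_l, imgpoints_r, image_size):
    # Calibrate each camera separately (Zhang's method [10]) to obtain
    # the internal parameter matrix K and distortion coefficients.
    _, K1, d1, _, _ = cv2.calibrateCamera(
        objpoints, imgpoints_l, image_size, None, None)
    _, K2, d2, _, _ = cv2.calibrateCamera(
        objpoints, imgpoints_r, image_size, None, None)

    # Estimate the rotation R and translation T between the two cameras
    # (the external parameters), keeping the intrinsics fixed.
    _, K1, d1, K2, d2, R, T, _, _ = cv2.stereoCalibrate(
        objpoints, imgpoints_l, imgpoints_r, K1, d1, K2, d2, image_size,
        flags=cv2.CALIB_FIX_INTRINSIC)

    # Compute rectifying rotations and projections so that both image
    # planes lie in the same plane with rows aligned (frontal parallel).
    R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(
        K1, d1, K2, d2, image_size, R, T)
    return R1, R2, P1, P2, Q
```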
2.2 Disparity Calculation
We know that objects show obvious color changes at their boundaries [13], so we use the gradient to select feature points. First, each pixel's gradient in the horizontal direction is calculated. After calculating the gradient values of all pixels in the image, a threshold \( \tau \) is set, and all points with gradient values greater than the threshold are marked as feature points.
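As an illustration, here is a minimal NumPy sketch of this feature extraction step; the central-difference form of the gradient and the threshold value are our assumptions, since the paper's exact formula is not reproduced here.

```python
import numpy as np

def extract_feature_points(img, tau=20.0):
    """Mark pixels whose horizontal gradient magnitude exceeds tau."""
    img = img.astype(np.float32)
    grad = np.zeros_like(img)
    # Central difference along the row (epipolar) direction; border
    # columns keep gradient 0 and are never selected.
    grad[:, 1:-1] = np.abs(img[:, 2:] - img[:, :-2]) / 2.0
    ys, xs = np.nonzero(grad > tau)
    return grad, np.column_stack((ys, xs))
```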
After obtaining the feature points of the left and right images, the next step is to match them. From the positional relationship between the two cameras, we know that the position of any spatial point in the left camera image is always to the right of its position in the right camera image [14]. The cost function reflecting the similarity of two feature points contains two parts: one is the sum of the absolute values of the corresponding gradient differences over a \( 1 \times 5 \) pixel window centered on the feature points; the other is the sum of the absolute values of the corresponding gray-level differences over the same window. Formulated as:

\( C\left( {i,i_{l} } \right) = \min \left( {\sum\limits_{k = - 2}^{2} {\left| {g\left( {i + k} \right) - g\left( {i_{l} + k} \right)} \right|} ,\;\tau_{1} } \right) + \min \left( {\sum\limits_{k = - 2}^{2} {\left| {I\left( {i + k} \right) - I\left( {i_{l} + k} \right)} \right|} ,\;\tau_{2} } \right) \)

Where \( i \) is the position of the feature point in the left camera image, \( i_{l} \) is the position of the candidate feature point in the right camera image at a relative disparity of \( l \), \( g \) and \( I \) denote the gradient and gray level, and \( \tau_{1} \) and \( \tau_{2} \) are two truncation thresholds. The candidate that minimizes the cost function is taken as the matching point. After feature point matching, the disparity \( d \) of each feature point pair is obtained.
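The following sketch illustrates the truncated two-part cost and a winner-takes-all disparity search; the window indexing and the default parameter values are illustrative assumptions, and bounds checking near image borders is omitted for brevity.

```python
import numpy as np

def match_cost(grad_l, grad_r, img_l, img_r, y, x, d, tau1=50.0, tau2=100.0):
    # Truncated cost between left pixel (y, x) and right pixel (y, x - d),
    # summed over a 1x5 horizontal window centered on each feature point.
    # Assumes x + 2 < width and x - d - 2 >= 0.
    ks = np.arange(-2, 3)
    c_grad = np.abs(grad_l[y, x + ks] - grad_r[y, x - d + ks]).sum()
    c_gray = np.abs(img_l[y, x + ks].astype(np.float32)
                    - img_r[y, x - d + ks].astype(np.float32)).sum()
    return min(c_grad, tau1) + min(c_gray, tau2)

def best_disparity(grad_l, grad_r, img_l, img_r, y, x, d_max):
    # Winner-takes-all: the disparity with the minimum cost is the match.
    costs = [match_cost(grad_l, grad_r, img_l, img_r, y, x, d)
             for d in range(d_max + 1)]
    return int(np.argmin(costs)), costs
```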
2.3 Point Cloud Generating and Refinement
According to the pinhole imaging principle, similar triangles give:

\( Z = \frac{fT}{d} \)

Where \( f \) is the focal length and \( T \) is the baseline distance between the two optical centers, both obtained from camera calibration; \( d \) is the disparity; \( \left( {x,y} \right) \) is the feature point's coordinate in the left camera's image coordinate system; and \( \left( {X,Y,Z} \right) \) is the feature point's three-dimensional coordinate in the camera coordinate system. The actual coordinates therefore follow as:

\( X = \frac{xZ}{f},\quad Y = \frac{yZ}{f},\quad Z = \frac{fT}{d} \)
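A direct implementation of these triangulation formulas might look like the following sketch, where `points` is assumed to hold matched features as (x, y, d) triples with image coordinates measured from the principal point.

```python
def triangulate(points, f, T):
    # points: iterable of (x, y, d); f = focal length in pixels,
    # T = baseline between the optical centers (from calibration).
    cloud = []
    for x, y, d in points:
        if d <= 0:
            continue  # zero disparity would put the point at infinity
        Z = f * T / d  # depth from similar triangles
        cloud.append((x * Z / f, y * Z / f, Z))
    return cloud
```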
A sparse point cloud of the feature points can then be generated (see Fig. 1).
From the point cloud, we find two problems: the point cloud presents a distinct layered structure, and there are many mismatches from the matching process.
The point cloud is layered because the disparity obtained by the above matching algorithm has integer pixel accuracy. To obtain higher, sub-pixel accuracy, the disparity values must be further refined.
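The paper does not specify its refinement scheme; one common choice (an assumption on our part, not the authors' stated method) is parabolic interpolation of the cost curve around the winning integer disparity, sketched below.

```python
def refine_disparity(costs, d):
    # Fit a parabola through the costs at d-1, d, d+1 and return the
    # sub-pixel disparity at the parabola's vertex.
    if d <= 0 or d >= len(costs) - 1:
        return float(d)  # no neighbors available for interpolation
    c0, c1, c2 = costs[d - 1], costs[d], costs[d + 1]
    denom = c0 - 2.0 * c1 + c2
    return float(d) if denom == 0 else d + 0.5 * (c0 - c2) / denom
```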
Mismatches seriously affect target recognition [15], so we remove them in three steps (a sketch of the ordering check follows this list):

1. Filtering: the point cloud of feature points on a target is usually dense, while mismatched feature points are scattered into empty space. We can therefore remove sparse feature point sets based on point cloud density.

2. Feature point enhancement: when a continuous feature point set is too dense, it affects the subsequent matching results to some degree. We therefore merge a dense feature point set into a single feature point.

3. Sorting feature points: feature point matching searches for the match of each left-image feature point along the epipolar direction, and the feature points on the left image are ordered by coordinate. The coordinates of the matching points found in the right image should therefore also be ordered, so we can remove matching points that violate this ordering.
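As an illustration of step 3, the following sketch drops matches that violate the left-to-right ordering constraint on one image row; the greedy keep-or-drop strategy is our assumption, not a detail given in the paper.

```python
def enforce_ordering(matches):
    # matches: (x_left, x_right) pairs on one image row. A match whose
    # right-image coordinate steps backwards relative to the previous
    # kept match violates the ordering constraint and is dropped.
    kept, last_xr = [], float("-inf")
    for xl, xr in sorted(matches):
        if xr >= last_xr:
            kept.append((xl, xr))
            last_xr = xr
    return kept
```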
2.4 Clustering
After removing the mismatched points, we obtain a relatively accurate sparse point cloud. The next step is to segment the individual targets in the point cloud. As can be seen from the point cloud above, the points belonging to a target are usually concentrated, so a clustering algorithm can be used to complete target detection and segmentation.
Commonly used clustering algorithms include the K-means algorithm [18] and the DBSCAN algorithm [17]. K-means requires knowing the number of classes in the data in advance, which obviously cannot be applied to our scenario. DBSCAN is a density-based clustering algorithm that fits the characteristics of our point cloud data well and does not require the number of classes in advance. We therefore use the DBSCAN clustering algorithm; Fig. 2 shows the clustering result.
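A minimal sketch of this step with scikit-learn's DBSCAN follows; the `eps` and `min_samples` values are placeholders that would have to be tuned to the actual point density.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def segment_targets(cloud, eps=0.5, min_samples=10):
    # cloud: (N, 3) array of 3-D feature points after mismatch removal.
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(cloud)
    # Label -1 marks noise; every other label is one segmented target,
    # e.g. one vehicle in the front scene.
    return [cloud[labels == k] for k in np.unique(labels) if k != -1]
```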
2.5 Timestamp Synchronization
In the experiment, we use distributed acquisition with timestamp synchronization, which allows flexible configuration and avoids a system crash caused by a single device failure. After clock synchronization, the timestamps of the individual data files are matched and the files are combined into one. Based on a reasonable preset frequency \( f_{p} \), data sampled at a frequency higher than \( f_{p} \) are downsampled, and data sampled at a frequency lower than \( f_{p} \) are interpolated.
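As an illustration, the following sketch aligns signals to a common timeline by linear interpolation, which covers both the downsampling and the interpolation cases; the 10 Hz preset frequency and the variable names (hr_t, hr, gps_t, speed) are hypothetical.

```python
import numpy as np

def resample_to(t, values, t_common):
    # Linear interpolation onto the common timeline: effectively
    # downsampling for signals sampled faster than f_p and
    # interpolation for slower ones. Timestamps are assumed to be
    # synchronized seconds since a shared epoch.
    return np.interp(t_common, t, values)

# Hypothetical usage: align heart rate and GPS speed to f_p = 10 Hz.
f_p = 10.0
t_common = np.arange(0.0, 60.0, 1.0 / f_p)  # one minute of data
# hr_t, hr, gps_t, speed would come from the individual data files:
# hr_aligned = resample_to(hr_t, hr, t_common)
# speed_aligned = resample_to(gps_t, speed, t_common)
```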
3 Experiment
3.1 Experiment Design
In our comparative experiments, the driver accomplishes tasks in different experimental scenarios designed to induce different behaviors and states. We divide the experimental tasks into primary tasks and secondary tasks. There are three sets of experiments. The primary task of experiment one is maintaining a constant distance from the car in front (Distance Control); of experiment two, maintaining a constant speed on a straight road (Speed Control); and of experiment three, maintaining a constant speed on a winding road (Speed and Direction Control). All three experiments share the same secondary tasks: none, manual control of the air-conditioning temperature, and voice control of the air-conditioning temperature.
3.2 Participants and Apparatus
In this experiment, the main equipment includes one binocular camera for image collection, several portable computers, heart rate measuring equipment, head-mounted eye tracking equipment, and a car-mounted GPS (see Fig. 3).
We invited three drivers to participate in this experiment. All of them study or work at Shanghai Jiao Tong University, have good vision, and hold driver's licenses.
3.3 Procedure and Data Collection
The whole experiment procedure is as follows:

1. Equip all devices: install the binocular camera, GPS module, and driving recorder; have the driver wear the heart rate watch and eye tracker.

2. Synchronize time: synchronize the system time of all computers to the time server via the local area network.

3. Start the data acquisition programs: after the driver is informed of the experimental process and is ready, start the acquisition programs of all devices.

4. Start experiment one: the primary task is maintaining the distance from the car in front. Meanwhile, perform the following secondary tasks in sequence:

   a. None.

   b. Manually adjust the temperature of the air conditioner several times.

   c. Adjust the temperature of the air conditioner several times by voice.

5. Start experiment two: the primary task is maintaining a speed of 20 km/h on a straight road. Meanwhile, perform the following secondary tasks in sequence:

   a. None.

   b. Manually adjust the temperature of the air conditioner several times.

   c. Adjust the temperature of the air conditioner several times by voice.

6. Start experiment three: the primary task is maintaining a speed of 20 km/h on a winding road. Meanwhile, perform the following secondary tasks in sequence:

   a. None.

   b. Manually adjust the temperature of the air conditioner several times.

   c. Adjust the temperature of the air conditioner several times by voice.

7. Terminate the data acquisition programs: stop all acquisition programs and place all data in a folder named after the driver and the date.

8. Change driver: switch to the next driver and repeat the above steps.
4 Result
After processing the data according to the above method, in each group of experiments we obtain driving performance via image processing and the driving recorder. Plotting the heart rate and respiration rate curves over time helps us analyze the driver's psychological state. The eye movement tracker records the characteristics of the driver's eye movements while processing visual information. After processing these data, we obtain the following results:
Figure 4 shows a comparison of driving performance under different primary task conditions.
Figure 5 shows a comparison of the drivers' psychological state under different primary task conditions.
Figure 6 and Fig. 7 show a comparison of driving performance under different secondary task conditions.
Figure 8 and Fig. 9 show how driving performance changes in experiment one under the secondary task conditions.
Figure 10 shows how the car speed and the driver's gaze area change over time in experiment two during the manual control task.
5 Discussion
As can be seen from Fig. 4, the car speed in experiment 2 is closer to 20 km/h and less dispersed. The average speed in experiment 2 is 19.4 km/h with a standard deviation of 0.878, while the average speed in experiment 3 is 17.8 km/h with a standard deviation of 1.527. These results confirm that driving performance in experiment 2 is better; in other words, driving performance is better at lower task difficulty, which conforms to common sense.
As can be seen roughly from Fig. 5, the driver's heart rate in experiment 2 is lower and smoother. The average heart rate in experiment 2 is 83.9 bpm with a standard deviation of 4.68, while the average heart rate in experiment 3 is 86.8 bpm with a standard deviation of 9.02. These results confirm that the heart rate in experiment 2 is indeed lower and smoother; in other words, the driver is more relaxed at lower task difficulty, which also conforms to common sense.
In Fig. 6 and Fig. 7, the secondary tasks of both experiment 2 and experiment 3 are none, manual control of the air-conditioning temperature, and voice control of the air-conditioning temperature. Obviously, the driver needs to allocate extra attention to complete the secondary tasks, so those conditions of experiments 2 and 3 can be regarded as interrupted by secondary tasks. As can be seen from Fig. 6 and Fig. 7, the car speed in experiment 2(a), with no secondary task, is closer to 20 km/h and less dispersed. The calculation results are shown in Table 1.
The calculation results confirm that driving performance in condition (a) is better than in conditions (b) and (c), and the difference is more obvious in experiment 2 than in experiment 3. We can therefore conclude that driving performance is better with no secondary task, but the difference becomes less obvious when the primary task is more complicated. This result also conforms to common sense.
In Fig. 8 and Fig. 9, the secondary tasks in experiments 1(b) and 1(c) are manual control and voice control of the air-conditioning temperature, and the car's distance from the target vehicle represents driving performance. In Fig. 8 the secondary task is manual control. During 0–7 s, while the driver performs the manual control, the distance increases sharply from 20 m to 32.6 m; during 7–13 s, with no secondary task, the distance decreases gradually; during 13–20 s, with another manual control task, the distance keeps decreasing at the former period's rate. This shows that while performing manual control, most of the driver's attention is devoted to completing the secondary task, so driving performance becomes extremely poor. In Fig. 9 the secondary task is voice control, and during its execution the distance is adjusted continuously: with voice control, the driver does not need to look at the touchpad and can devote more attention to the primary task than with manual control. This comparison shows that voice control is safer and more effective than manual control.
In Fig. 10, the secondary task is manual control of the air-conditioning temperature, performed during 6–12 s and 22–28 s. In both periods, when the task begins, the driver's eyes gaze at the center touch screen to complete the manual control; meanwhile, the car's speed decreases rapidly because most of the driver's attention is devoted to the secondary task. After the manual control is finished, the car's speed is corrected again.
Another experimental result is that the average time taken by manual control tasks is 14.91 s, while the average time taken by voice control tasks is 7.83 s. Voice control obviously takes less time.
6 Conclusion
From all the experimental results, we can draw the following conclusions:

1. The higher the difficulty of the primary task, the lower the driving performance, and the more intense the driver's emotions.

2. Interference from secondary tasks also lowers driving performance.

3. Manual control tasks are more disruptive to driving performance than voice control tasks.

4. Manual control tasks take more time than voice control tasks.

5. While performing secondary tasks, the driver's attention to the primary task is reduced, so driving performance decreases.
References
Patel, B.N., Rosenberg, L., Willcox, G., et al.: Human–machine partnership with artificial intelligence for chest radiograph diagnosis. NPJ Digit. Med. 2(1), 1–10 (2019)
Zhang, S., Lu, Y., Fu, S.: Recognition of the cognitive state in the visual search task. In: Ayaz, H. (ed.) AHFE 2019. AISC, vol. 953, pp. 363–372. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-20473-0_35
Scharstein, D., Szeliski, R.: A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. Int. J. Comput. Vis. 47, 7–42 (2002). https://doi.org/10.1023/A:1014573219977
Zhang, K., Fang, Y., Min, D., et al.: Cross-scale cost aggregation for stereo matching. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1590–1597. IEEE (2014)
Liu, C., Yuen, J., Torralba, A.: SIFT flow: dense correspondence across scenes and its applications. TPAMI 33(5), 978–994 (2011)
Mei, X., Sun, X., Dong, W., Wang, H., Zhang, X.: Segment-tree based cost aggregation for stereo matching. In: CVPR, pp. 313–320. IEEE (2013)
Rhemann, C., Hosni, A., Bleyer, M., Rother, C., Gelautz, M.: Fast cost-volume filtering for visual correspondence and beyond. In: CVPR, pp. 504–511. IEEE (2011)
Wang, Z.-F., Zheng, Z.-G.: A region based stereo matching algorithm using cooperative optimization. In: CVPR, pp. 1–8. IEEE (2008)
Yang, Q., Wang, L., Yang, R., Stewénius, H., Nistér, D.: Stereo matching with color-weighted correlation, hierarchical belief propagation, and occlusion handling. TPAMI 31(3), 492–504 (2008)
Zhang, Z.: A flexible new technique for camera calibration. TPAMI 22(11), 1330–1334 (2000)
Ma, L., Li, J., Ma, J., et al.: A modified census transform based on the neighborhood information for stereo matching algorithm. In: 2013 Seventh International Conference on Image and Graphics, pp. 533–538. IEEE (2013)
Calonder, M., Lepetit, V., Strecha, C., Fua, P.: BRIEF: binary robust independent elementary features. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010. LNCS, vol. 6314, pp. 778–792. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15561-1_56
Hirschmuller, H., Scharstein, D.: Evaluation of cost functions for stereo matching. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8. IEEE (2007)
Ok, S.-H., Shim, J.H., Moon, B.: Modified adaptive support weight and disparity search range estimation schemes for stereo matching processors. J. Supercomput. 74(12), 6665–6690 (2017). https://doi.org/10.1007/s11227-017-2058-y
Choi, N., Jang, J., Paik, J.: Illuminant-invariant stereo matching using cost volume and confidence-based disparity refinement. JOSA A 36(10), 1768–1776 (2019)
Kumar, S., Micheloni, C., Piciarelli, C., et al.: Stereo rectification of uncalibrated and heterogeneous images. Pattern Recogn. Lett. 31(11), 1445–1452 (2010)
Tran, T.N., Drab, K., Daszykowski, M.: Revised DBSCAN algorithm to cluster data with dense adjacent clusters. Chemometr. Intell. Lab. Syst. 120, 92–96 (2013)
Arunkumar, N., et al.: K-Means clustering and neural network for object detecting and identifying abnormality of brain tumor. Soft Comput. 23(19), 9083–9096 (2018). https://doi.org/10.1007/s00500-018-3618-7