1 Introduction

1.1 Current practice

Traffic signs are the language of roads and play an important role in people’s daily travel. A graphic symbol can be easily understood and remembered by drivers (Song 2010). Therefore, graphic symbols are widely used on traffic signs. Diagrammatic guide signs (DGSs), which are used throughout the world (see Fig. 1), can be particularly useful for describing various ramp configurations and informing drivers of unexpected ramp directions at expressway interchanges. As expressway systems become increasingly varied and complex, DGSs are also becoming increasingly complex. Consequently, it is becoming more difficult for drivers to anticipate the geometry of upcoming interchanges (Fitzpatrick et al. 2013).

Fig. 1 DGSs in different countries

With the development of urban road networks, interchanges are becoming increasingly dense. For example, in Beijing, there are 245 interchanges and more than 37 types of DGSs within the Fifth Ring Road. Many DGSs in China have become much more complex and more difficult to read than those in other countries, as shown in Fig. 1d–f. Recent survey results have shown that only 35% of drivers in Beijing fully understood DGSs (Zhao 2016). Difficulty in understanding DGSs can easily lead to driving errors during the search for exits (Mast and Kolsrud 1972).

In China, the national standard (General Administration of Quality Supervision, Inspection and Quarantine of the People’s Republic of China and China National Standardization Management Committee 2009) only stipulates that “DGSs might be installed near interchanges. At complex or closely spaced interchanges, sign information can be decomposed and used to guide drivers gradually”. The Beijing standard (Beijing Bureau of Quality and Technical Supervision 2007) states that DGSs should be installed 500 m prior to interchange exits. The Beijing guideline (Beijing Municipal Commission of Transport and Beijing Technology Commission of Traffic Standardization 2016) references more specific standards and indicates that DGSs must be specially designed for particularly complex interchanges. However, no clear definition of complex interchanges nor any design guidelines are provided. Overall, there are no regulations regarding the limitation or optimization of complex DGSs in China, and problems related to complex DGSs have not been addressed in the relevant standards.

The diagrams of DGSs are very simple in European countries. In the US, the Manual on Uniform Traffic Control Devices (Federal Highway Administration 2009) offers specific design criteria and guidance for application conditions and placement methods for DGSs in Section 2E.22 of Chapter 2E. Notably, the construction and traffic facilities of high-level roads began earlier in European countries than in other countries, and the design concepts and application conditions for expressway guide sign systems in many countries are quite different (see Fig. 2). Consequently, related standards for DGSs in European countries are not fully applicable to other countries. Experience demonstrates that research on DGSs needs to be carried out in accordance with the actual situations in different countries.

Fig. 2 Expressway guide sign systems, including DGSs, in different countries

1.2 Previous research

In European countries, many studies have evaluated the effectiveness of DGSs through driving simulator experiments and field experiments. Some studies have suggested that DGSs have a beneficial influence on driver performance at interchanges and can provide navigational information for drivers unfamiliar with the area (Hanscom 1971; Zwahlen et al. 2003). In consideration of the high economic costs of overhead span-type sign bridges and the high desirability of advance guidance information, it has been recommended that DGSs be added to preexisting guide signs (Zwahlen and Schnell 2000; Zwahlen et al. 2003). However, it has also been found that DGSs require drivers to spend more time reading and interpreting the information they present than conventional signs do. DGSs are particularly beneficial to driver performance at interchanges where traffic must exit on the left side of the through route. Suitable application conditions for DGSs have been studied and proposed (Mast and Kolsrud 1972). Chrysler et al. (2007) conducted a four-phase human factors study using PowerPoint slides and a driving simulator to test driver comprehension of diagrammatic freeway guide signs. The results showed that the performance (lane change distance and number of unnecessary lane changes) achieved with some standard signs was equal to or better than that achieved with DGSs in the majority of cases. Fitzpatrick et al. (2013) investigated the optimal installation of interchange guide signs via a desktop simulator experiment; the results showed that arrow-per-lane signs exert a positive effect on drivers, especially in helping them to make correct lane change decisions. These study results and the related limitations of DGSs have supported the improvement of relevant standards and engineering practice.

At present, many complex DGSs still exist. However, their negative effects have not received sufficient attention in some countries, especially in China. The majority of related studies have focused on advance guide signs and exit signs at interchanges (Li 2011; Shu 2004; Wang et al. 2017; Zhao et al. 2018). Only a small number of studies have focused on DGSs. Lin et al. (2013) summarized standards for DGSs by referring to the standards of America, England, Germany, Korea, Japan and Taiwan and found that the lane-guiding function of DGSs was not explicit in China. Li et al. (2015) compared the interchange guide sign standards of China with those of other countries and identified existing deficiencies and future research directions regarding DGSs in China. Zhao et al. (2015) conducted a driving simulator experiment to analyze the effectiveness of one typical type of DGS on an urban expressway. The study found that the DGS had a significant effect on the drivers’ speed, standard deviation of speed, acceleration and standard deviation of acceleration. Li et al. (2018) conducted a visual cognition experiment in Beijing, in which thirty-seven types of DGSs were classified into three complexity categories based on an evaluation of drivers’ visual recognition performance and cognitive complexity. Clearly, researchers in China have begun to take notice of the problems associated with DGSs. However, these studies have only pointed out the importance and necessity of studying complex DGSs. There has been no systematic study involving the effectiveness evaluation and optimization of complex DGSs. Complex DGSs have been found to result in reading difficulty for drivers (Mast and Kolsrud 1972; Zhao et al. 2015; Li et al. 2018). The essential problem in enhancing the effectiveness of complex DGSs is in improving their influence on drivers’ cognition.

Drivers’ cognitive processes with regard to traffic signs can be separated into four stages: perception, recognition, decision-making and behavior (Wickens 1987). Traffic signs that are complex present drivers with a large amount of information to be processed, which can result in excessive visual effort for drivers (Lyu et al. 2017). Expressways always have large traffic flows traveling at high speeds. Drivers may experience difficulty in noticing and comprehending complex DGSs within a short period of time. This difficulty can cause confusion among drivers and cause them to miss exits (Chrysler et al. 2007). In fact, driving behavior is the direct manifestation of drivers’ performance in road traffic and is closely related to driving safety and traffic operations (Cafiso and La Cava 2009). Many studies have indicated that reasonable traffic signs can facilitate driver cognition and improve driving behavior (Jamson et al. 2005; Jiang et al. 2010; Lin et al. 2013; Rahman et al. 2017; Zahabi et al. 2017). Nevertheless, the influence of DGSs with different complexities on drivers is not clear. This situation may cause the negative influence of complex DGSs to be neglected, leading to difficulties in promoting the optimal design of complex DGSs.

DGSs are becoming increasingly complex in countries such as China. However, complex DGSs influence driving behavior differently than simple DGSs do, which may lead to driving mistakes and traffic safety problems. Currently, little attention has been paid to the effectiveness of DGSs with different levels of complexity. To address this gap, five types of DGSs with different levels of complexity were selected in this study with the aim of determining their different levels of effectiveness. Five experimental scenarios involving five types of interchanges and DGSs were designed, and a driving simulator experiment was conducted. Eight behavioral indicators were obtained and used to evaluate the comprehensive guidance effectiveness of the five DGSs. The hypothesis of this study was that the five different types of DGSs would exert different influences on driving behavior. Specifically, higher complexity was expected to lead to lower effectiveness, that is, a negative correlation was expected between DGS complexity and comprehensive guidance effectiveness. This experiment was an exploratory study that attempted to explore the relationship between the complexity of DGSs and the corresponding driving behavior and guidance effectiveness. The study also aimed to provide a scientific basis for the future design and setting of DGSs.

2 Methods

2.1 Participants

Thirty participants were recruited to participate in this study. Two participants were excluded from participating in the experiment because of their incompatibility with the driving simulation environment. In total, twenty-one male and seven female participants completed the experiment. These participants ranged in age from 22 to 55, with an average age of 33.18, and had an average driving experience of 8.72 years; all participants were recruited via advertisement. None of the participants had color vision deficiencies, and all were reported to have normal or corrected-to-normal vision (participants with myopia were required to wear contact lenses to ensure that they had corrected-to-normal vision to participate in this experiment). Moreover, the participants were required to have a regular circadian rhythm, without sleep disorders or simulator sickness, before the experiments. All of the drivers agreed to and signed an informed consent form before participating in the study and the participants were compensated for their participation.

In driving behavior research, according to the central limit theorem, a sample size of 30 or more participants is desirable (Li and Pan 2010). Nevertheless, researchers usually obtain a smaller sample size because of high resource demands. For example, in previous studies, 13 participants participated in an experiment focusing on ground-mounted diagrammatic signs (Zwahlen et al. 2003), 24 participants were recruited for research on the nighttime legibility of traffic signs (Susan, Chrysler, and Hawkins 2003), and 24 participants were recruited to take part in a driving simulator study investigating guide signs for exits along highways (Qiao et al. 2007). Therefore, the sample size of slightly fewer than 30 participants is acceptable in this research.

2.2 Five selected types of DGS diagrams

There are 245 interchanges and more than 37 types of DGSs within the Fifth Ring Road in Beijing. In a previous study (Li et al. 2018), a visual cognition experiment was conducted to test the complexity of 37 types of diagrams. Factor analysis and cluster analysis were used for quantitative evaluation and classification. As a result, the 37 types of diagrams were separated into three categories, namely, low complexity, medium complexity, and high complexity, as shown in Fig. 3a.

Fig. 3 Three levels of diagram complexity and the five selected diagrams

Five typical diagrams were selected to be presented in the driving simulator to study their effectiveness, as shown in Fig. 3b–f. These five types of diagrams are widely and frequently applied. Diagram 1 was chosen from the low-complexity category, and Diagram 2 was chosen from the medium-complexity category. Diagrams 3, 4 and 5 were chosen from the high-complexity category. The high-complexity diagrams will be the main focus of the present work, as well as subsequent optimization research. Therefore, more signs were chosen from the high-complexity category than from the other categories.

2.3 Apparatus

A fixed-base driving simulator at the Beijing University of Technology was used in this experiment. Real-time data were collected, including operating performance data (e.g., accelerating, decelerating and steering) and drivers’ maneuvering behavior data (e.g., accelerator, brake, and clutch usage). The data acquisition frequency was 30 Hz, and the virtual scenario was projected onto three large screens, providing a 130° field of view. The driving simulator is able to generate various sensory effects for participants, such as visual, auditory and tactile effects.

2.4 Design of the scenarios

Five DGSs, labeled as DGSs 1–5, were designed based on the five selected diagrams shown in Fig. 3. Each DGS corresponded to one type of interchange in one experimental scenario. These signs were the only control factor in this experiment. In Beijing, DGSs with low complexity are commonly used, and there are few cases in which there is no DGS installed prior to an interchange. Therefore, the low-complexity DGS1 was treated as the experimental control in this study.

In this experiment, the scenarios were designed and built in software according to the typical characteristics of Beijing expressways. They do not correspond to actual roads in Beijing or elsewhere. Each experimental scenario was designed based on one route and one destination. Thus, five experimental scenarios were developed (Fig. 4).

Fig. 4 Five scenario routes corresponding to the five types of DGSs

  • As shown in Fig. 4a, route A–B–C–D–E corresponding to DGS1 leads to destination 1 (Lishui Bridge), which is circled in red.

  • As shown in Fig. 4b, route A–B–C–D–E corresponding to DGS2 leads to destination 2 (Guokang Road), which is circled in red.

  • As shown in Fig. 4c, route A–B–C–D–E–F–G corresponding to DGS3 leads to destination 3 (Lanxi Bridge), which is circled in red.

  • As shown in Fig. 4d, route A–B–C–D–E–F–G corresponding to DGS4 leads to destination 4 (Jianghai Bridge), which is circled in red.

  • As shown in Fig. 4e, route A–B–C–D–F corresponding to DGS5 leads to destination 5 (Xiaobao Road), which is circled in red.

To ensure that the only differences were those related to the control factor, the road sections from point A (starting point) to point D were the same on all five routes:

  1. A 0.5 km-long straight segment (A–B, a two-way, four-lane urban road with a speed limit of 40 km/h).

  2. A 1.0 km-long curve (B–C, a two-way, two-lane curved road with a speed limit of 30 km/h).

  3. A 4.38 km-long straight segment (C–D, a two-way, six-lane urban expressway with a speed limit of 80 km/h).

After point D, the ramps for each of the five types of interchanges were all one-way, one-lane roads with a speed limit of 30 km/h. Compared with drivers taking the other three routes, drivers taking the routes corresponding to DGS3 and DGS4 had to travel an additional 0.35 and 0.47 km (the length of D–F), respectively, to arrive at the 3rd exit to reach experimental destinations 3 and 4.

In a previous study, the destinations of left exits were found to take more visual time than those of right exits and straight directions in DGSs (Li et al. 2018). Thus, the left exits of interchanges were tested with priority in this study (see Fig. 4). The right ramps and straight directions will be tested in the next study. On the routes of all five experimental scenarios, there was low traffic volume; thus, traffic did not affect the experimental vehicle during the formal experiment. Each simulated route was approximately 6–8 km long and took approximately 9 min to complete for every participant.

2.5 Design of the guide signs

Each interchange guide sign system used in the experiment included several advance guide signs, a DGS and several exit signs. In this study, each guide sign system included three advance guide signs (a, b and c) and one DGS (d), which was different in each system. Moreover, different exit signs (e, f and g) were installed before the interchange exits. Finally, five types of guide sign systems were installed on the five scenario routes (see Fig. 5). Then, these routes were loaded into the driving simulator system to establish the five experimental scenarios in which the participants were asked to drive.

Fig. 5 Installed guide sign systems with the five types of DGSs

2.6 Procedures

To avoid the influence of familiarity effects, the five experimental scenarios were presented in a random order. Each participant was required to complete the following five steps to finish the experiment.

  1. Demographic questionnaire

    Before beginning the experiment, each participant completed a questionnaire that covered basic information and other conditions in the pretesting stage. The basic information mainly included age, gender, and driving experience. Moreover, whether the participant consumed drugs, tobacco, alcohol, tea or other caffeinated drinks was also recorded. The participants were banned from using drugs, tobacco, alcohol, and caffeinated drinks during the experiment. All participants agreed to and signed an informed consent form before taking part in this study.

  2. Test drive

    Each participant was given 5–10 min to practice in a test scenario with the aim of allowing the participant to adapt to the simulator environment and identify his/her potential for motion sickness in the simulator. Participants with motion sickness were not allowed to participate in the experiment.

  3. Driving task instructions

    The participants were given a driving destination and were asked to make their best effort to complete the task in accordance with their usual driving habits on roads. During driving, the participants were told to obey the speed limit. If there was a collision or other accident, they were supposed to follow the instructions of the experimenters. The participants were also told that completing one task successfully would take approximately 9 min and that 3–5 min for resting would be provided after the completion of each task.

  4. Formal test

    During the experiment, driving behavior data were recorded automatically by the driving simulator. Moreover, the experimenters recorded certain events for each participant, such as road crashes and whether he/she completed the task. After completing one task, each participant was given 3–5 min to rest before the next task.

  5. Postscenario questionnaire

    The above two steps were repeated until all five scenarios were completed. Then, each participant completed a questionnaire regarding the effectiveness of the guide signs presented in the experiment.

3 Indicator selection

According to a previous study, drivers typically start to perceive a sign ahead on an urban expressway approximately 200 m prior to the sign (Zhao et al. 2015). Therefore, in this study, we defined the influence area of a DGS as starting 200 m prior to the sign (O) and ending at the location of the sign (O′). To identify key segments that were significantly affected by the DGSs, the influence range of each sign was divided into four 50 m-long segments, as shown in Fig. 6a.
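The segmentation described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `segment_index` and the handling of positions that fall exactly on a segment boundary are assumptions, not part of the study. The segment numbering follows the text, with the first segment lying 150–200 m before the sign and the fourth lying 0–50 m before it.

```python
from typing import Optional

INFLUENCE_LENGTH_M = 200.0  # influence area starts 200 m before the sign (from the text)
SEGMENT_LENGTH_M = 50.0     # four 50 m-long segments (from the text)


def segment_index(distance_to_sign_m: float) -> Optional[int]:
    """Return the segment number (1-4) for a position upstream of the sign.

    Segment 1 is farthest from the sign (150-200 m), segment 4 is nearest (0-50 m).
    Positions outside the influence area return None.
    Boundary handling is an assumption: a position exactly on a boundary
    is assigned to the segment farther from the sign.
    """
    if distance_to_sign_m < 0 or distance_to_sign_m > INFLUENCE_LENGTH_M:
        return None
    n_segments = int(INFLUENCE_LENGTH_M // SEGMENT_LENGTH_M)
    idx = n_segments - int(distance_to_sign_m // SEGMENT_LENGTH_M)
    return max(idx, 1)  # a position exactly 200 m upstream belongs to segment 1
```

Each 30 Hz simulator record could then be grouped by segment number before the indicators are averaged.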

Fig. 6 Influence range and evaluation indicators

According to previous studies, effective guide signs have a positive influence on the perception, maneuvering behavior and decision-making of drivers (Chrysler et al. 2007; Fitzpatrick et al. 2013; Mast and Kolsrud 1972; Zhao 2016; Zwahlen et al. 2003). These influences can be quantified by means of subjective perception indicators, operating status indicators and maneuvering behavior indicators collected through driving simulator experiments. To analyze the influence and effectiveness of the five types of DGSs, eight indicators were selected, and an evaluation indicator set was constructed, as shown in Fig. 6b.

  • Difficulty of finding the destination: After each scenario was completed, each driver was asked to evaluate the difficulty of finding the destination during the experiment by assigning a score in the range of 0–10. A higher score indicated that it was more difficult for the participant to find the destination during the experiment.

  • Speed: Traveling at an inappropriate speed can cause road crashes. Within the speed limit of an expressway, an operating speed closer to the nominal speed in the influence range of a DGS corresponds to better driving performance.

  • Standard deviation of speed: The standard deviation of speed can be used to evaluate the volatility of speed. A larger standard deviation of speed indicates higher volatility in the vehicle’s operating status, corresponding to more unstable driving behavior.

  • Deceleration: Deceleration reflects the speed adjustment ability of a driver. A vehicle’s operating status will be threatened under heavy acceleration or deceleration. In the 200 m influence range of an ideal DGS, drivers should be able to easily comprehend the information presented by the sign and maintain or increase their speed rather than needing to decelerate to understand the sign.

  • Standard deviation of deceleration: The standard deviation of deceleration represents the variability of speed adjustment, i.e., the steadiness of the driving status. When a vehicle is passing through the influence range of a DGS, a high standard deviation of deceleration implies that the vehicle’s speed changes considerably. A higher standard deviation of deceleration also indicates that the driver is adjusting his/her speed more frequently and might feel nervous and uncomfortable, suggesting a low adaptability to the DGS (Ding et al. 2014; Jiang et al. 2010).

  • Driving time: Driving time is the average time required for the participants to drive through the influence range of a DGS. A short driving time implies a short visual cognition time and a short judgment time, which indicate smooth vehicle operation and high-efficiency driving under the influence of the DGS.

  • Percentage of unfound destinations: After each participant had completed one experimental scenario, the experimenters recorded whether he/she had found the correct destination. Effective guide signs will help drivers find the correct exits and destinations more easily. Thus, a lower percentage of participants with unfound destinations indicates a more effective DGS.

  • Gas pedal power: Gas pedal power can be used to evaluate the influence of signs on drivers’ maneuvering behavior (Ding et al. 2014). Therefore, for each driver, the intensity of the pressure exerted on the gas pedal and the duration and frequency of the driver’s use of the gas pedal were measured. A higher gas pedal power within the influence range of a DGS indicates that the DGS is more influential. The gas pedal power was calculated as shown below:

    $$P_{\text{GPP}} = \int \nolimits_{T} f\left(t \right){\text{d}}t$$

    where \(f\left( t \right)\) is the function describing the pressure intensity on the gas pedal over time within a range of 0–1; \(t\) is the travel time in s; and \(P_{\text{GPP}}\) is the gas pedal power.
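As a rough illustration of the gas pedal power integral, the sketch below approximates \(P_{\text{GPP}}\) from discretely sampled pedal readings using the trapezoidal rule. The choice of the trapezoidal rule and the function name `gas_pedal_power` are assumptions made for illustration; the 30 Hz default matches the data acquisition frequency stated in Section 2.3, but the study does not describe how the integral was evaluated numerically.

```python
from typing import Sequence


def gas_pedal_power(pedal: Sequence[float], sample_rate_hz: float = 30.0) -> float:
    """Approximate the time integral of pedal pressure intensity f(t).

    pedal: pressure intensity samples in the range 0-1, equally spaced in time.
    sample_rate_hz: sampling frequency (30 Hz is the rate stated for the simulator).
    Uses the trapezoidal rule, which is an assumption of this sketch.
    """
    dt = 1.0 / sample_rate_hz
    return sum((a + b) / 2.0 * dt for a, b in zip(pedal, pedal[1:]))
```

For example, a pedal held at a constant intensity of 0.5 for one second (31 samples at 30 Hz) yields a value of approximately 0.5.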

The eight indicators above were selected to directly reflect the cognitive processes and driving behaviors of the drivers. These indicators were used to perform an impact analysis through repeated-measures analysis of variance (rANOVA) and an effectiveness evaluation through the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS).

4 Results

4.1 Influence on driving behavior

One-way repeated-measures ANOVA (rANOVA) was performed in SPSS 22.0 to compare the effects of the different DGSs on the eight indicators within the DGS influence range. In previous studies with small sample sizes, the results of rANOVA have been interpreted according to where the p value falls relative to the 0.05 significance level. Here, p < 0.05 indicates a significant difference, and p > 0.05 indicates no significant difference.
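For readers who want to see what the one-way rANOVA computes, the following sketch derives the F statistic and its degrees of freedom from a subjects-by-conditions table. It is a plain-Python illustration, not the SPSS procedure used in the study: the function name `one_way_ranova` is hypothetical, and the sketch applies no sphericity correction and computes no p value. With 28 participants and 5 DGSs, the degrees of freedom come out as (4, 108), which matches the F(4,108) values reported in this section.

```python
from typing import List, Tuple


def one_way_ranova(data: List[List[float]]) -> Tuple[float, int, int]:
    """F statistic and degrees of freedom for a one-way repeated-measures ANOVA.

    data[s][c] is the measurement for subject s under condition c.
    Returns (F, df_conditions, df_error). No sphericity correction is applied.
    """
    n = len(data)      # number of subjects
    k = len(data[0])   # number of conditions
    grand_mean = sum(sum(row) for row in data) / (n * k)

    cond_means = [sum(row[c] for row in data) / n for c in range(k)]
    subj_means = [sum(row) / k for row in data]

    # Partition the total sum of squares into condition, subject and error parts.
    ss_total = sum((x - grand_mean) ** 2 for row in data for x in row)
    ss_cond = n * sum((m - grand_mean) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand_mean) ** 2 for m in subj_means)
    ss_error = ss_total - ss_cond - ss_subj

    df_cond = k - 1
    df_error = (k - 1) * (n - 1)
    f_stat = (ss_cond / df_cond) / (ss_error / df_error)
    return f_stat, df_cond, df_error
```

Removing the between-subject sum of squares from the error term is what distinguishes this design from an ordinary one-way ANOVA, since each participant drove all five scenarios.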

4.1.1 Difficulty of finding the destination

The rANOVA results show that there were no significant differences in the average difficulty of finding the destination under the influence of the five DGSs (p > 0.05). However, the complexity of the DGS showed the same trend as the difficulty of finding the destination. As the complexity of the DGS increased, the difficulty of finding the destination also increased, as shown in Fig. 7. Note that the difficulty score for DGS2 was slightly lower than that for DGS1 and that the difficulty score for DGS5 was slightly lower than that for DGS4. This result implies that DGSs showing intersections in the form of a ring island will lead to less difficulty than will interchanges with several ramps.

Fig. 7 Difficulty of finding the destination

4.1.2 Speed

The average speed over the entire influence range showed a decreasing tendency with increasing DGS complexity, although the differences were not significant (p > 0.05), as shown in Fig. 8a. Compared with the speeds under the influence of the other DGSs, the speed for DGS5 was the lowest, and that for DGS1 was the highest. As shown in Fig. 8b, there was a statistically significant effect on the speed [F(4,108) = 2.943, p = 0.024] in the fourth segment (0–50 m before the DGS). In this segment, the speed was significantly lower for DGS5 than for the other DGSs. This finding indicates that the drivers needed to drop to a lower speed to read DGS5 than they did to read the other signs while driving.

Fig. 8 Speed in the DGS influence range

4.1.3 Standard deviation of speed

The standard deviation of the speed over the entire influence range exhibited an upward trend as the DGS complexity increased [F(4,108) = 4.186, p = 0.011], except in the case of DGS4, as shown in Fig. 9a. Compared with the standard deviations of speed under the influence of the other DGSs, the standard deviation of speed was the highest for DGS5 and the lowest for DGS1. A significant effect was observed in the third segment (50–100 m prior to the DGS) [F(4,108) = 3.904, p = 0.0154], as shown in Fig. 9b. This result indicates that the driving status was more unstable in the 50–100 m prior to a DGS, especially in the case of DGS5.

Fig. 9 Standard deviation of speed in the DGS influence range

4.1.4 Deceleration

There were no significant differences among the decelerations over the entire influence range for the five types of DGSs (p > 0.05). As shown in Fig. 10a, the decelerations under the influence of DGS1 and DGS4 were lower than those for the other DGSs. This result may imply that DGS2, DGS3 and DGS5 will lead to more deceleration behavior in identifying the destination exits than will the others. As shown in Fig. 10b, the deceleration in the first segment (150–200 m prior to the DGS) was significantly affected by the DGSs [F(4,108) = 3.370, p = 0.012].

Fig. 10 Deceleration in the DGS influence range

4.1.5 Standard deviation of deceleration

There was no significant difference among the standard deviations of deceleration over the entire influence range for the five types of DGSs (p > 0.05), as shown in Fig. 11a. Figure 11b shows that the standard deviation of deceleration in each segment was also not significantly affected by the signs (p > 0.05).

Fig. 11 Standard deviation of deceleration in the DGS influence range

4.1.6 Driving time

No significant differences in the driving times over the entire influence range were observed among the five types of DGSs (p > 0.05), as shown in Fig. 12a. The driving time in each segment was also not significantly affected by the signs (p > 0.05), as shown in Fig. 12b. Nevertheless, the driving time in the fourth segment increased with the increasing DGS complexity. From the first to the fourth segment, the driving time showed an increasing trend under the influence of each type of DGS, especially DGS5. This phenomenon may be due to the deceleration in the first three segments, causing more time to be needed in the fourth segment to return to normal speed.

Fig. 12 Driving time in the DGS influence range

4.1.7 Percentage of unfound destinations

Figure 13 shows the differences in the percentages of unfound destinations among the five types of DGSs. The percentage of unfound destinations increased with increasing DGS complexity, except in the case of DGS5. The highest percentages of unfound destinations were observed under the influence of DGS3 and DGS4. However, the percentage of unfound destinations was lower for DGS5 than it was for DGS3 or DGS4, even though the graphic of DGS5 was harder to read. This is probably because it is easier for drivers to find a left exit in a road environment with two exit ramps than in a road environment with three exit ramps.

Fig. 13 Percentage of unfound destinations

4.1.8 Gas pedal power

The rANOVA results indicate significant differences among the total gas pedal power measurements for the five types of DGSs [F(4,108) = 6.241, p = 0.001]. A more complex DGS was associated with a lower gas pedal power, except in the case of DGS4, as shown in Fig. 14a. The gas pedal power was significantly affected by the signs in the first and third segments of the influence range [F(4,108) = 3.215, p = 0.016 and F(4,108) = 8.688, p = 0.007, respectively], as illustrated in Fig. 14b. Moreover, the gas pedal power decreased in the first three segments and increased in the last segment for the DGSs. This behavior may be associated with the drivers’ visual processes. A driver needs to decelerate to read a DGS within 50–200 m before the DGS. After obtaining the desired information, the driver may start to accelerate again to find his/her destination exit within 0–50 m prior to the sign.

Fig. 14 Maneuvering behavior indicator in the DGS influence range

The above rANOVA results were analyzed to determine the averaged relative validity of this study. To statistically address the absolute validity, omega squared (\(\omega^{2}\)) was estimated and used as the effect size. The average \(\omega^{2}\) was 0.359. According to Cohen’s criterion (1988), this \(\omega^{2}\) is larger than 16%, which indicates a strong relationship between the complexities of the five DGSs and the driving behavior indicators.

4.2 Comprehensive guidance effectiveness

Table 1 summarizes the results for the eight indicators and ranks the DGSs in terms of potential safety advantage for each indicator. Ranks 1–5 run from best to worst. For example, the lower the standard deviation of speed, the safer the situation for drivers. Therefore, the rank ordering of the standard deviation of speed for DGSs 1–5 is 1, 3, 4, 2 and 5, as shown in Table 1. In this study, eight indicators were used to analyze the effects of the DGSs, but the proper weight of each indicator was not clear. The rank orderings differed among the eight indicators, so an accurate assessment cannot be performed by relying on only one indicator.
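The per-indicator ranking can be illustrated with a short sketch. The function name `rank_order` and the numeric values in the usage note are hypothetical; they are chosen only so that the result reproduces the 1, 3, 4, 2, 5 ordering quoted above and are not the measured values from Table 1. Ties are not treated specially in this sketch.

```python
from typing import List, Sequence


def rank_order(values: Sequence[float], lower_is_better: bool = True) -> List[int]:
    """Rank alternatives from 1 (best) to len(values) (worst) for one indicator.

    lower_is_better: True for indicators such as the standard deviation of speed.
    Ties are broken by position, which is a simplification of this sketch.
    """
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=not lower_is_better)
    ranks = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks
```

With hypothetical standard deviations of speed `[2.1, 3.0, 3.4, 2.6, 4.2]` for DGSs 1–5, `rank_order` returns `[1, 3, 4, 2, 5]`.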

Table 1 The values and rank orderings of eight indicators for five types of DGSs

An attempt was made to comprehensively evaluate the levels of effectiveness of different complex DGSs. The entropy weight method is an objective method of weight assignment that has been used in many studies (Qin et al. 2019; Xu et al. 2008). TOPSIS is a multicriteria decision analysis method that is flexible, objective, fast and practical; it can help people make better choices and is widely used (Ding et al. 2016a, b; Olson 2004; Qin et al. 2018). Therefore, in this study, the five types of DGSs were comprehensively evaluated with the help of the entropy weight method and the TOPSIS method. The eight selected analysis indicators were used to calculate the comprehensive guidance effectiveness for each DGS. The TOPSIS procedure was carried out as described below.

  1. Construction of a multiple-objective decision matrix

The eight indicators were used as evaluation indices for the five DGSs. Thus, the multiple-objective decision matrix for TOPSIS had the form \(X = \left( X_{ij} \right)_{m \times n} \left( {m = 5, \, n = 8} \right)\), as shown in Table 1, where row \(i\) corresponds to a DGS and column \(j\) to an indicator. The eight indicators of operating status, maneuvering behavior, driving efficiency and task performance are listed in this table.

The TOPSIS method requires that all indicators vary in the same direction, that is, it requires uniform indicator monotonicity. Thus, a consistency transformation was performed to obtain a new multiple-objective decision matrix, denoted by \(X_{ij}^{\#}\). Then, the formula \(X_{ij}^{*} = \frac{X_{ij}^{\#}}{\sqrt{\sum\nolimits_{i = 1}^{m} (X_{ij}^{\#})^{2}}}\) was applied to obtain the normalized matrix, \(X_{ij}^{*}\).
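The two operations in this step can be sketched as follows. The text does not state which consistency transformation was used, so the reciprocal transformation of cost-type (lower-is-better) indicators below is an assumption, as are the function names and the small data set. The normalization divides each entry by the Euclidean norm of its indicator column.

```python
import math

def make_benefit_type(matrix, is_cost):
    """Consistency transformation (assumed form): take the reciprocal of
    cost-type columns so that larger is better for every indicator.
    Requires strictly positive values in cost-type columns."""
    return [[1.0 / x if is_cost[j] else float(x) for j, x in enumerate(row)]
            for row in matrix]

def vector_normalize(matrix):
    """Divide each entry by the Euclidean norm of its column."""
    n_cols = len(matrix[0])
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_cols)]
    return [[row[j] / norms[j] for j in range(n_cols)] for row in matrix]

# Hypothetical data: 3 signs (rows) x 2 indicators (columns).
# Column 0 is treated as benefit-type; column 1 as cost-type
# (e.g. a standard deviation of speed, where lower is safer).
raw = [[60.0, 2.0], [55.0, 4.0], [50.0, 5.0]]
x_sharp = make_benefit_type(raw, is_cost=[False, True])
x_star = vector_normalize(x_sharp)

# After normalization every column has unit Euclidean norm.
column_norms = [math.sqrt(sum(row[j] ** 2 for row in x_star)) for j in range(2)]
```

Other consistency transformations, such as subtracting each value from the column maximum, are also in common use and would give different normalized values.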

  2. Determination of the index weights

The entropy weight method was used to calculate the weight \(W_{j}\) of each index \(j\) on the basis of the previously mentioned matrix \(X_{ij}^{*}\):

$$W_{j} = \left(\begin{array}{*{20}l}0.1215&\quad 0.1332&\quad 0.1221&\quad 0.1269&\quad 0.1239&\quad 0.1215&\quad 0.1216&\quad 0.1305\end{array} \right)$$

The results imply that different indicators have different effects on evaluating the comprehensive effectiveness of a DGS. The ranking of the 8 indicators from largest influence to smallest is as follows: deceleration, percentage of unfound destinations, standard deviation of speed, standard deviation of deceleration, gas pedal power, difficulty of finding the destination, driving time, and speed. It can be seen that the four most influential indicators are related to operating status. Thus, it was determined that the deceleration, the standard deviation of speed, the percentage of unfound destinations and the standard deviation of deceleration all play important roles in evaluating the effectiveness of DGSs.
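The entropy weight computation itself is not spelled out in the text. The sketch below shows one common formulation, given here as an assumption rather than as the exact procedure used in the study: each column is converted to proportions, its normalized Shannon entropy is computed, and the weights are proportional to one minus that entropy. The input matrix is hypothetical.

```python
import math

def entropy_weights(matrix):
    """Entropy weights for the columns of a strictly positive matrix
    (rows: alternatives, columns: indicators)."""
    m = len(matrix)
    n = len(matrix[0])
    divergences = []
    for j in range(n):
        column = [row[j] for row in matrix]
        total = sum(column)
        proportions = [x / total for x in column]
        # Normalized Shannon entropy in [0, 1]; 1 means a perfectly uniform column.
        entropy = -sum(p * math.log(p) for p in proportions if p > 0) / math.log(m)
        divergences.append(1.0 - entropy)
    scale = sum(divergences)
    return [d / scale for d in divergences]

# Hypothetical normalized matrix: column 0 is nearly uniform, column 1 varies more.
x_star = [[0.50, 0.70], [0.51, 0.55], [0.49, 0.30], [0.50, 0.35]]
weights = entropy_weights(x_star)
```

An indicator whose values differ more across the alternatives receives a larger weight, so `weights[1]` exceeds `weights[0]` here.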

  3. Calculation of the weighted value of every indicator for matrix structure standardization

The weighted and normalized decision matrix was obtained according to the formula \(U_{ij} = W_{j} X_{ij}^{*}\). That is,

$$U_{ij} = \left( \begin{array}{*{20}l} 0.06 &\quad 0.07 &\quad 0.06 &\quad 0.07 &\quad 0.06 &\quad 0.05 &\quad 0.06 &\quad 0.08 \\0.06 &\quad 0.05 &\quad 0.06 &\quad 0.05 &\quad 0.06 &\quad 0.05 &\quad 0.06 &\quad 0.06 \\0.06 &\quad 0.05 &\quad 0.05 &\quad 0.05 &\quad 0.06 &\quad 0.05 &\quad 0.05 &\quad 0.04 \\0.05 &\quad 0.08 &\quad 0.06 &\quad 0.07 &\quad 0.06 &\quad 0.05 &\quad 0.05 &\quad 0.04 \\0.05 &\quad 0.03 &\quad 0.05 &\quad 0.04 &\quad 0.04 &\quad 0.06 &\quad 0.05 &\quad 0.06 \\ \end{array} \right)$$

  4. Calculation of Euclidean distances and performance degrees

The maximum and minimum values of each indicator (column) of the matrix \(U_{ij}\) were used to form two vectors, namely, the positive ideal solution (\(U^{ + }\)) and the negative ideal solution (\(U^{ - }\)):

$$\begin{aligned} U^{+} & = \left(\begin{array}{*{20}l}0.06&\quad 0.08&\quad 0.06&\quad 0.07&\quad 0.06&\quad 0.06&\quad 0.06&\quad 0.08 \end{array} \right) \\ U^{-} & = \left(\begin{array}{*{20}l}0.05&\quad 0.03&\quad 0.05&\quad 0.04&\quad 0.04&\quad 0.05&\quad 0.05&\quad 0.04\end{array} \right) \\ \end{aligned}$$

Next, the distances between each case to be evaluated and the positive and negative ideal solutions were calculated as follows:

$$\begin{aligned} D_{i}^{+} & = \left(\begin{array}{*{20}l}0.01&\quad 0.04&\quad 0.05&\quad 0.04&\quad 0.06\end{array} \right) \\ D_{i}^{-} & = \left(\begin{array}{*{20}l}0.07&\quad 0.04&\quad 0.03&\quad 0.06&\quad 0.02\end{array} \right) \\ \end{aligned}$$

Finally, the performance degrees, that is, the comprehensive guidance effectiveness levels, of the five types of DGSs were obtained as follows: \(C_{i}^{ *} = (\begin{array}{*{20}l}0.90&\quad 0.48&\quad 0.33&\quad 0.57&\quad 0.19\end{array})\). The value of the comprehensive guidance effectiveness ranges from 0 to 1, with a value closer to 1 indicating a better effectiveness.
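The distance and performance-degree calculations can be retraced with the standard TOPSIS definitions, assumed here because the formulas are not written out in the text: \(D_{i}^{\pm} = \sqrt{\sum\nolimits_{j} (U_{ij} - U_{j}^{\pm})^{2}}\) and \(C_{i}^{*} = D_{i}^{-} / (D_{i}^{+} + D_{i}^{-})\). The sketch below applies them to the weighted matrix exactly as printed above. Because that matrix is rounded to two decimals, the computed distances and performance degrees deviate somewhat from the reported values, but the rank order of the five DGSs is reproduced.

```python
import math

# Weighted normalized matrix U as printed above (rows: DGS1..DGS5),
# rounded to two decimals.
U = [
    [0.06, 0.07, 0.06, 0.07, 0.06, 0.05, 0.06, 0.08],
    [0.06, 0.05, 0.06, 0.05, 0.06, 0.05, 0.06, 0.06],
    [0.06, 0.05, 0.05, 0.05, 0.06, 0.05, 0.05, 0.04],
    [0.05, 0.08, 0.06, 0.07, 0.06, 0.05, 0.05, 0.04],
    [0.05, 0.03, 0.05, 0.04, 0.04, 0.06, 0.05, 0.06],
]

# Positive and negative ideal solutions: column-wise maximum and minimum.
columns = list(zip(*U))
u_plus = [max(col) for col in columns]
u_minus = [min(col) for col in columns]

def distance(row, ideal):
    """Euclidean distance between one alternative and an ideal solution."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(row, ideal)))

d_plus = [distance(row, u_plus) for row in U]
d_minus = [distance(row, u_minus) for row in U]

# Performance degree: relative closeness to the positive ideal solution.
closeness = [dm / (dp + dm) for dp, dm in zip(d_plus, d_minus)]

# DGS numbers ordered from most to least effective.
ranking = sorted(range(1, 6), key=lambda k: closeness[k - 1], reverse=True)
print(ranking)  # -> [1, 4, 2, 3, 5]
```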

As shown in Fig. 15, the effectiveness of DGS1 is the highest and lies above the third quartile (Q3). The effectiveness levels of DGS2, DGS3 and DGS4 are similar and are all near the second quartile (Q2), i.e., between the first quartile (Q1) and Q3; among them, the effectiveness of DGS4 is higher than those of DGS2 and DGS3. The effectiveness of DGS5 is the lowest, at below Q1. The results indicate that the more complex the diagram on a DGS is, the lower its comprehensive guidance effectiveness, except in the case of DGS4.

Fig. 15 TOPSIS results

In summary, the five tested DGSs influenced driving behavior differently. Overall, the more complex a DGS was, the worse its influence on driving behavior, except in the case of DGS4. The comprehensive guidance effectiveness also clearly varied among the five types of signs. There is a negative correlation between DGS complexity and the comprehensive guidance effectiveness of the DGS, which is consistent with the research hypothesis.

5 Discussion

A driving simulator experiment was performed in this study. The effects of different DGSs on eight analysis indicators were analyzed in detail. TOPSIS was used to synthetically evaluate the effectiveness of the five types of DGSs. The accuracy of DGS classification achieved in a previous study was verified, and corresponding design suggestions were extracted for engineering applications.

In a previous cognition experiment (Li et al. 2018), the ranking (from low to high) of the five DGSs in terms of complexity was DGS1, DGS2, DGS3, DGS4 and DGS5, as shown in Fig. 3a. By comparison, in this driving simulator experiment, the final ranking (from high to low) of the performance degrees was DGS1, DGS4, DGS2, DGS3 and DGS5, as shown in Table 2. A DGS that presents more difficulty in visual cognition can be expected to have a more negative influence on driving behavior, and vice versa. The ranking results of these two experiments were highly similar even though different evaluation indicators were used. The results are also consistent with those of previous studies, which found that a high visual demand can easily cause drivers to reduce speed (Mast and Kolsrud 1972; Engström et al. 2005). However, the ranks of DGS4 in the two experiments are inconsistent. Notably, the results of the former study were based on a comprehensive consideration of the recognition time for all exits, whereas in this study, only the recognition of the left exit was tested for each DGS. Hence, the performance degree of DGS4 was slightly higher than those of DGS2 and DGS3. This can probably be attributed to the fact that the left-exit diagram in DGS4 is more common and is thus easier for drivers to identify than those in DGS2 and DGS3, whose left-exit diagrams are quite uncommon. In the next study, other ramps or exits of the five DGSs will be tested, and a comprehensive consideration of driving behavior under the influence of all exits of the five DGSs will be included in the effectiveness evaluation.

Table 2 The comparison of the evaluation results in the two experiments

DGS2 and DGS5 are very similar. The only difference between them is the position of the second exit. However, this difference results in a considerably different effect on drivers’ visual cognition and driving behavior. To explain the reason for this phenomenon, an event-related potential experiment has been conducted. The results will be analyzed in a future paper to reveal the cognitive mechanisms of drivers presented with DGS2 and DGS5. In addition, only eight indicators that are directly related to the cognitive processes and driving behaviors of drivers were analyzed in this paper. In future studies, additional indicators, such as those related to visual characteristics and brain cognition, may be included to achieve more accurate assessment results.

The findings of this study imply that complex DGSs have a negative influence on driving behavior. Hence, complex DGSs need to be optimized, especially DGSs with particularly high complexities. In countries such as China, a considerable number of complex DGSs are used. However, the related regulations are insufficient to support practical applications. Thus, there is a need for further study of the optimization of complex DGSs. In the present study, we have conducted preliminary research on the optimization of complex DGSs. The experiences of other countries with regard to the design and placement of DGSs were referenced, and the suggestions of experts were adopted while considering the actual road conditions in China. Basic optimization methods for complex DGSs, such as the simplification of graphics, the addition of guide text on the ground near exits, the advance installation of DGSs, and the repeated placement of two identical DGSs, were considered. The optimization of complex DGSs will be further studied in a future paper to improve the relevant standards and provide suggestions for engineering applications.

In addition, the results support the hypothesis of this study and demonstrate the necessity of optimizing complex DGSs. Only five DGSs were evaluated in this paper; however, the study has established an evaluation indicator system and an effective assessment method for evaluating the guidance effectiveness of various DGSs. The methodology used in this study can also be adopted to evaluate the effectiveness of other DGSs in future research.

6 Conclusions

In this study, the following conclusions were drawn based on the results:

  1. Higher complexity of DGSs negatively impacts subjective perception, maneuvering behavior and operating status. In general, the higher the complexity of a DGS is, the more negatively it affects driving behavior.

  2. The characteristics of a DGS significantly influence maneuvering behavior. As the complexity of a DGS increases, its negative influence on maneuvering behavior increases. Moreover, complex DGSs lead to a significant decrease in gas pedal power at distances of 150–200 m and 50–100 m prior to the sign.

  3. The characteristics of a DGS significantly influence operating status. A higher DGS complexity negatively impacts operating status. As the DGS complexity increases, the deceleration rapidly increases 150–200 m prior to the sign, the standard deviation of deceleration increases 100–150 m prior to the sign, the standard deviation of speed increases 50–100 m prior to the sign, and the speed decreases and driving time increases 0–50 m prior to the sign.

  4. On the whole, a negative correlation exists between the complexity and the comprehensive guidance effectiveness of the five types of DGSs. The final ranking (from best to worst) of the five DGSs in terms of their effectiveness is DGS1, DGS4, DGS2, DGS3 and DGS5. DGS1 had a more positive influence than the other signs, whereas DGS5 had the most negative influence. The levels of effectiveness of DGS2, DGS3 and DGS4 in influencing driving behavior were similar.

Overall, this study explored the negative correlation between the complexity of DGSs and the corresponding driving behavior and guidance effectiveness. This relationship should be taken into account when designing new DGSs and optimizing complex DGSs in the future. These findings may encourage traffic officials to recognize the driving issues produced by complex DGSs and to pay more attention to reducing the use of complex DGSs. This study can also serve as a reference and provide solutions for use in evaluating the effectiveness of other complex DGSs. The study is meaningful for improving the effectiveness of DGSs in China and in other countries where the situation is similar to that in China in terms of, e.g., the widespread use of DGSs, new expressways, driving habits and other traffic characteristics.