
1 Introduction

1.1 Training Procedures in Production and Manufacturing

In the course of increasingly digitalized production, the emergence of complex, digitally networked systems poses challenges to their operators [1]. In this context, machine operation and the associated assembly processes require safe and effective interaction between humans and machines [2]. This also changes the corresponding training procedures. Face-to-face or text-based training is increasingly being replaced by more flexible digital methods such as blended learning or virtual learning environments [3]. This becomes particularly relevant if inexperienced or temporary workers have to be trained for new (assembly) processes [4].

By using digital (i.e., virtual) near-the-job training, inexperienced employees can be trained in a standardized way without affecting the cycle time of a machine [5]. This not only improves the efficiency of the training by reducing costs and time, but also increases safety and allows different learning paces and levels of previous knowledge to be taken into account [6, 7]. Such training plays a crucial role in improving the human capital of the workforce [4]. Virtual learning environments are particularly suitable for industrial application, since they allow real-life activities of manufacturing processes to be practiced in a virtual environment [8]. In addition to instructional videos and learning platforms, the use of Mixed Reality (MR) technologies has become a promising way to visualize and illustrate workflows and processes near the job [2, 9].

Although the benefits of MR technologies are empirically proven and widely acknowledged by workers and operators [3, 10, 11], there are currently only a few cautious attempts to apply these technologies in a long-term and targeted manner [7]. There is also a lack of knowledge about which technology should be used for certain tasks or how the features should be specified: “each has unique strengths and limits for aiding learning, so understanding how to choose the right medium for a particular educational situation is an important next step in realizing the potential of immersive media in learning” [12]. Neglecting these different potentials can lead to unexpected effects, e.g., paper-based training in experimental studies leading to better results than MR technologies [13]. Possible reasons for this are features of the technology used (e.g., gesture control or hand recognition) that are missing or not optimally used in the respective task [14]. To determine which technology or which features are particularly suitable for a certain task, the following section presents a definition of MR as well as a brief overview of the features of the technologies and devices used in this context.

1.2 Mixed Reality

The term MR describes a continuum between reality and virtuality, involving the merging of real and virtual worlds [15]. Within the reality-virtuality continuum, a distinction is made between the real environment, Augmented Reality (AR), Augmented Virtuality (AV) and the virtual environment (Fig. 1).

Fig. 1. Reality-virtuality continuum [15]

AR describes the immersion in reality and the handling of or interaction with virtual objects. Thus, an AR system “combines real and virtual objects in a real environment; runs interactively and in real time and registers (aligns) real and virtual objects with each other” [16]. The use of AR in training for the operation of industrial equipment provides guidance on the correct use of the instruments, which in turn helps to avoid errors and production rejects [17]. AR enables the user to manage the equipment from a screen or device by recognizing the current state of the equipment [17].

AV describes the immersion into a virtual world, which is extended by reality, while the user manipulates mainly virtual objects. AV is especially promising in collaborative use cases and settings. In an industrial context, for example, virtual meetings can be held in which engineers jointly manipulate or change 3D models in real time [18]. Compared to AR, immersion plays a more prominent role in AV [19]. Nevertheless, the boundary between AR and AV remains blurred and depends strongly on the application and use [20].

Although not directly mentioned in the definition of Milgram and Colquhoun in 1999, Virtual Reality (VR) is a widely used technology that most closely corresponds to the virtual environment and is thus associated with the right side of the continuum [10]. In this paper, VR is understood as part of Mixed Reality. VR describes complete immersion in a virtual world, while the real environment is hidden [15]. Thus, VR offers a very comprehensive view of the structures and installations of a respective environment [21]. Providing the user with perceptual information about the structure and functioning of a real system is a cost-effective advantage of VR [8]. The advantage of using VR for industrial training is that users are not exposed to the risks of the real process equipment and can at the same time familiarize themselves with the tools and work steps through virtual immersion [22].

Many authors criticize the apparent separation of these terms in the continuum, as the boundaries of the terms become indistinct in most use cases [20]. The capabilities of technologies and devices are becoming more and more extensive and are increasingly overlapping. In the course of the discussion about the classification of Mixed Reality, suggestions came up to make the distinction rather on the basis of the technological specification (technology-centered taxonomies) [23], the preconditions of the user (user-centered taxonomies) [24], the specification of the presented information (information-centered taxonomies) [25] or the interaction between user and object (interaction-centered taxonomies) [26].

However, the applicability and usefulness of these classifications depend strongly on the use case. In the field of industrial assembly or machine operation, for example, the consideration of task characteristics is of enormous importance, yet it can hardly be found in any of the existing taxonomies [27]. The lack of transparency and the overlapping content of existing taxonomies contribute significantly to the fact that MR technologies are not yet part of the everyday tools used in industry [14]. There is little research on which technology (e.g., AR vs. VR) is more suitable in which industrial context [6]. This impedes their implementation. Thus, companies are faced with the challenge of choosing the right methodology and the right technological tool for the respective task.

This paper aims to bridge the gap between theory-based taxonomies and practical use cases. It addresses the question of which MR feature is most suitable for a specific assembly-related task. Section 2 presents an industrial use case for a training process in production and defines the characteristics of the task. Based on this, the required features and specifications for the design of the MR-based teach-in procedure are derived. Section 3 describes the study design in which two different technologies for the training process are compared. The results are presented in Sect. 4 and discussed afterwards.

2 Use Case Description

2.1 Assembly of a Pneumatic Cylinder

The Use Case describes the manual assembly of a pneumatic cylinder before its further processing in a partially automated machine. The whole assembly process consists of eleven steps. For each step it is specified whether the right or left hand shall be used. The steps are described in the following: First, the piston rod should be grasped from the colored left box and inserted into the workpiece fixture. Secondly, the piston buffer must be removed and mounted on the piston rod. Then, a piston has to be grasped from the upper left box and mounted on the piston rod. The fourth step includes taking the rod guide ring from the box. Subsequently, the ring magnet has to be grasped and a new ring magnet should be ordered via button. The ring magnet must be placed on the piston rod with the marked red side up and the rod guide ring must be mounted on the ring magnet. The mounting bush and sealing ring should then be placed on the piston rod. Then, the piston from the right box has to be mounted on the piston rod. The buffer piston must then be screwed onto the piston rod. The assembled piston rod should be inserted into the workpiece carrier and turned until it clicks. At the last step, the cylinder tube should be placed on the workpiece carrier, which will be sent to the respective station by pressing the start button.

The learning process for these steps in a face-to-face training takes 40–70 min, depending on the previous knowledge of the worker. During this time, the cycle time of the machine has to be reduced, often for several hours at a time.

An MR prototype has already been tested in a preliminary study. A 3D visualization of the plant was constructed and the sequence of individual work steps was visualized using Microsoft HoloLens. An overview of the virtual plant and the workspace can be seen in Fig. 2.

Fig. 2. 3D visualization of the plant

The machine was projected into real space in order to train the workers before they work on the real system. The first heuristic usability evaluation of the experts was positive. However, some important remarks concerned missing possibilities for interaction with the individual objects, as well as a possible irritation caused by an overlap of the virtual objects with the real environment in which the training was conducted. Based on the preliminary study, the required functionalities for the present use case are derived in the following section, which will define the test scenarios.

2.2 Specifications of MR Features

Based on theory-based taxonomies [20, 24–26], different characteristics should be taken into account when designing an MR-based training method. In Table 1, the features of the technologies are specified and assigned to different classifications, which are applied to the use case in the following. The first three categories (task, user, information) play an essential role in designing the training environment. The last two categories (interaction, technology) additionally contribute to the choice of the medium.

Table 1. Specification of general MR-features

The task of assembling a pneumatic cylinder requires the acquisition of procedural knowledge in order to complete the work steps in the correct order. Dede and colleagues report that VR, for example, can be very effective for learning procedural tasks in which students learn a sequence of steps to complete a task that requires maneuvers in three-dimensional space [12]. With regard to the user, demographic variables such as age, gender, and educational background could influence the success of the MR-based training procedure. Even more important, however, is the users’ previous knowledge of the task. For example, inexperienced workers may need more or different instructions than experienced workers [28]. The presented use case is primarily aimed at inexperienced temporary workers who have to practice the assembly without any prior knowledge. The didactic structure of the training should take this into account by offering a modular structure in which steps can be repeated and played at different paces [7]. Furthermore, affinity and enthusiasm for technology may have an influence on the assessment of the system [29].

The objects of the virtual assembly should be displayed as realistically as possible to allow realistic orientation in the training environment. Therefore, true-to-scale 3D representations of the objects were used. Important information was additionally visualized in short, clearly arranged text fields.

In a heuristic evaluation with the MR prototype [30], the lack of possibilities for interaction was particularly criticized. Therefore, the comparison of two different media with different possibilities for interaction should help to evaluate the importance of, e.g., haptic control. The first medium, which was also used in the preliminary study, is the Microsoft HoloLens. It supports voice input as well as a clicker for selecting different steps. However, the control is primarily visual: the user directs the gaze at an object or field and then confirms the selection by voice or clicker. Since the scenario is projected into a real environment, it can be classified in the middle range of the reality-virtuality continuum, between AR and AV.

In contrast, the second medium is intended to offer possibilities for haptic interaction. Therefore, the same training application was implemented in a VR environment, which is related to the right end of the continuum. In the VR environment, it is possible to select steps or to move objects yourself with two controllers. Since the use case requires differentiating between using right and left hand, a feedback function for selecting objects with the right or left controller is built in.

The technical classifications are based on the preliminary work of Milgram and Kishino, to whose publication the reader is referred for more detailed information [23]. In the Extent of World Knowledge (EWK) dimension, the first medium (AR/AV) uses a partially modelled world by virtually projecting the entire machine into the real world. The second medium (Acer mixed reality glasses), on the other hand, uses a 360° view of the real environment in which the virtually modelled machine is installed. In terms of Reproduction Fidelity (RF), both media use 3D animations with high fidelity. In the Extent of Presence Metaphor (EPM) dimension, presence is relatively well established through the use of Head Mounted Displays (HMDs), although it can be assumed that the VR environment allows an even higher degree of presence.

According to the feature specification with regard to the use case of the cylinder assembly, it becomes evident that the media particularly differ in the interaction possibilities with the work components. The following section presents the deduced hypotheses, from which the study design will be derived later on.

2.3 Hypothesis

The hypotheses to be examined in this study relate on the one hand to the perception and assessment of the respective medium (AR/AV vs. VR). At this point, the evaluation of usability and ergonomics is differentiated. On the other hand, hypotheses on the perception and evaluation of the training process using MR are formulated and tested in the further course of the study. Since there is no evidence for the suitability of one of the media compared to the other for such an application, the hypotheses are formulated and tested in a non-directional way (Table 2).

Table 2. Hypotheses

3 Method

3.1 Study Design

The study uses an experimental within-subject design with pre- and post-test. This ensures that each participant tests both the AR/AV and the VR training application. First, the participants were asked to fill out a questionnaire before they were assigned to a test condition. The order of the test conditions was randomly assigned. Test condition A included the following procedure: (1) fill out a pre-test questionnaire, (2a) conduct training with AR, (2b) fill out a post-test questionnaire on the use of AR, (3a) conduct training with VR, (3b) fill out a post-test questionnaire on the use of VR, (4) answer the question of which medium would be the best choice. In test condition B the same procedure was followed, but the VR training was performed first and the AR training second.
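The random assignment of the two orders can be illustrated with a short sketch. It assumes a balanced split (half of the participants start with AR/AV, half with VR), which the text does not state; the function name `assign_orders` and the participant IDs are hypothetical.

```python
import random


def assign_orders(participant_ids, seed=None):
    """Randomly assign each participant to test condition A or B.

    Assumption: the split is balanced. The paper only says that the
    order of the test conditions was randomly assigned.
    """
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    orders = {}
    for pid in ids[:half]:
        orders[pid] = ("AR/AV", "VR")  # test condition A
    for pid in ids[half:]:
        orders[pid] = ("VR", "AR/AV")  # test condition B
    return orders


# Hypothetical participant IDs 1..16
orders = assign_orders(range(1, 17), seed=42)
```

Fixing the seed makes the assignment reproducible, which is convenient for documenting the procedure afterwards.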

The same scenario was run through with all participants. The training took place in a room outside the production hall in order to avoid disturbing influences. After a brief introduction to the use of the technology, the participants had time to inspect the virtual assembly cell. After the orientation phase, the participants were able to watch a guided exercise scenario. The participants did not have to become active themselves, but had to use the clicker to proceed with the next step. Afterwards, they were able to view the entire assembly process again in animated mode - this was possible either in real time or in a slower version. The last step included the independent execution of the assembly steps. In the AR/AV version, the components had to be selected via eye control and clicker - the animation of those components was then carried out automatically if the selection was correct. In the VR version, the participants had the task of gripping the components themselves with the controllers (a distinction was made here between gripping with the right and left hand) and placing them in the correct position. The duration of use of the devices varied between 8 and 25 min.

The description of the questionnaires used in pre- and post-tests is presented below. Subsequently, an overview of the participants of the study is given.

3.2 Questionnaires

Pre-test.

In a pre-test, the participants were asked about their demographic data (age, gender, educational level) and their current job position. In order to investigate the participants’ previous knowledge of the task, they were asked whether they had ever carried out pneumatic cylinder assembly on the machine before. In addition, the participants were asked whether they had ever participated in a training procedure using AR/VR.

Subsequently, the participants were asked about their technical affinity with five items (e.g., “My enthusiasm for technology is…”) on a six-level scale from “very low” to “very high”. To complete the data on the participants, they were also asked which media (e.g. laptop, smartphone, tablets) are available to them, how often they use them and how easy-to-use the respective medium is. In addition, we used the “locus of control” questionnaire (KUT) to assess general control beliefs while dealing with technology [31]. With its 8 items (e.g., “I feel so helpless when dealing with technical devices that I do not even touch them”) on a scale of six levels from “not at all” to “absolutely”, the German questionnaire achieves a reliability of Cronbach’s α = .85.

Post-test.

The post-test was divided into two parts for each test condition. The first part evaluated the training medium. The System Usability Scale (SUS; [32]) was used to evaluate the respective technology. A SUS score between 60 and 80 means that the system is marginally acceptable, scores above 80 indicate good to very good system usability, and 100 points indicate an excellent system that fully meets the users’ expectations. In order to keep the answer scale consistent across the entire questionnaire, a six-level scale from “do not agree at all” to “fully agree” was used in this study. Accordingly, an adjusted factor was used in the calculation of the overall score in order to ensure the comparability and meaningfulness of the SUS score. This scale reached a Cronbach’s α = .80 in this study.
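The following is a minimal sketch of how a SUS-style score could be computed on a six-level scale. It rests on two assumptions that the text does not spell out: the ten items alternate between positive and negative wording as in the original SUS, and the adjusted factor rescales the maximum item sum (10 × 5 = 50) to 100, i.e. a factor of 2 instead of the usual 2.5. The function name `sus_score` is hypothetical.

```python
def sus_score(responses, levels=6):
    """SUS-style score (0-100) for ten responses coded 1..levels.

    Assumptions: odd-numbered items are positively worded, even-numbered
    items negatively worded (as in the original SUS), and the scaling
    factor maps the maximum item sum to 100.
    """
    if len(responses) != 10:
        raise ValueError("SUS expects exactly ten item responses")
    top = levels - 1  # highest contribution per item after recoding to 0..top
    total = 0
    for index, answer in enumerate(responses, start=1):
        if not 1 <= answer <= levels:
            raise ValueError(f"answer {answer} is outside 1..{levels}")
        if index % 2 == 1:
            total += answer - 1       # positive item: higher agreement is better
        else:
            total += levels - answer  # negative item: lower agreement is better
    return total * 100 / (10 * top)


# Hypothetical answers of one participant on the six-level scale
example = sus_score([5, 2, 4, 2, 5, 1, 4, 3, 5, 2])
```

With `levels=5` the same function reduces to the original scoring rule (item sum times 2.5).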

In order to be able to cover not only system usability but also ergonomic issues, six self-created items were used (e.g., “The field of vision of the glasses restricted me.”) and assessed on the same scale. This scale reached an α = .68 and thus has to be treated with caution. Furthermore, the questionnaire contained two open questions: “What difficulties did you encounter in the learning process with the training medium?” and “What functionalities (e.g. speech/gestural control) did you find particularly helpful?”
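For readers who want to reproduce a reliability estimate such as the one reported for the ergonomics items, Cronbach's α can be sketched as follows. The data layout (one list of item scores per participant) and the example values are hypothetical; the study itself was analysed in SPSS.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of participants' item-score lists."""
    if len(rows) < 2:
        raise ValueError("at least two participants are required")
    k = len(rows[0])
    if k < 2:
        raise ValueError("at least two items are required")
    # Sample variance of each item across participants
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the participants' sum scores
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical answers of four participants to three items
alpha = cronbach_alpha([[4, 5, 4], [2, 3, 2], [5, 5, 4], [3, 2, 3]])
```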

The second part included the evaluation of the training process. The NASA Task Load Index (NASA-TLX) was used to measure the perceived load during the task. It measures the subjectively perceived demand on a multidimensional scale that distinguishes, for example, between physical and psychological stress [33]. The German summary contains six dimensions, namely mental, physical and temporal demand as well as performance, effort and frustration. The original scale has 20 gradations from “very low” to “very high”. In line with the German version, we used a 10-step scale with the poles “little” and “much”.
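The text does not say how the six dimensions were combined into one value. A common simplification is the unweighted "raw TLX" mean, sketched here under the assumption that answers are coded 1 to 10; the function name `raw_tlx` and the example ratings are hypothetical.

```python
DIMENSIONS = ("mental", "physical", "temporal",
              "performance", "effort", "frustration")


def raw_tlx(ratings, steps=10):
    """Unweighted mean of the six NASA-TLX dimensions (raw TLX).

    Assumption: ratings are coded 1..steps; the paper does not state
    whether a weighted or unweighted aggregate was used.
    """
    for name in DIMENSIONS:
        if name not in ratings:
            raise ValueError(f"missing dimension: {name}")
        if not 1 <= ratings[name] <= steps:
            raise ValueError(f"{name} rating is outside 1..{steps}")
    return sum(ratings[name] for name in DIMENSIONS) / len(DIMENSIONS)


# Hypothetical ratings of one participant after one training run
load = raw_tlx({"mental": 4, "physical": 2, "temporal": 3,
                "performance": 6, "effort": 4, "frustration": 2})
```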

In order to capture the evaluation of the training process in depth, two open questions were asked: “To what extent do you feel prepared for the task by using the system?” and “To what extent would you prefer the system used to other training methods (please explain)?” After both test conditions had been carried out, a final question was asked: “Which of the two tested training media would you prefer? Please give reasons for your answer”.

3.3 Participants

A total of nine men and seven women participated in the study (N = 16). Four participants were under 20, six were between 21 and 25 years old, one was between 26 and 30 years old, four were between 31 and 40 years old and one was between 41 and 45 years old. The sample consisted of six company employees and two temporary workers; the remaining eight were trainees, apprentices or managers. Three participants stated that they had an intermediate level of secondary education, and three had a technical college entrance qualification. Six participants stated that their highest educational level was high school and/or university entrance qualification; the remaining four already had an academic degree (nBachelor = 2, nMaster = 1, nPhD = 1).

Two participants stated that they already had experience with MR-based training methods. One of the participants had already conducted the MR-based training for the respective assembly, three other participants already knew the assembly, but based on manual training.

3.4 Analysis

The analysis of the collected data was conducted using SPSS. Since we are dealing with a small sample (N = 16), the inferential statistical evaluation uses non-parametric methods. For group comparisons, the Wilcoxon test for paired samples is used. In the following, an overview of descriptive statistics is given first. In order to comply with the SUS analysis requirements, the items on the system usability and ergonomics scales were recoded from a range of one to six to a range of zero to five.
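The paired comparison can be sketched without SPSS as a Wilcoxon signed-rank test with the normal approximation and a tie correction. This is an illustrative re-implementation and not SPSS's exact routine; the function names and the example scores are hypothetical.

```python
import math
from collections import Counter


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]


def wilcoxon_signed_rank(first, second):
    """Return (z, two-sided p) for paired samples, normal approximation."""
    # Zero differences are discarded, as in the classical procedure
    diffs = [b - a for a, b in zip(first, second) if b != a]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    abs_diffs = [abs(d) for d in diffs]
    ranks = average_ranks(abs_diffs)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    mean_w = n * (n + 1) / 4
    # Tie correction: subtract (t^3 - t) / 48 for every group of t ties
    tie_term = sum(t ** 3 - t for t in Counter(abs_diffs).values()) / 48
    var_w = n * (n + 1) * (2 * n + 1) / 24 - tie_term
    z = (w - mean_w) / math.sqrt(var_w)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return z, p


# Hypothetical paired sum scores of eight participants (AR/AV, then VR)
z, p = wilcoxon_signed_rank([30, 34, 28, 36, 33, 31, 29, 35],
                            [35, 36, 33, 37, 39, 30, 36, 38])
```

Because the smaller of the two rank sums is used, z is never positive in this sketch; only its magnitude and the p-value are meaningful. For samples this small, an exact test may be preferable to the normal approximation.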

4 Results

An overview of the results of the pre-test is given below. First, descriptive statistics for the respective questionnaires in the pre- and post-test are reported. Subsequently the test results of the hypotheses are presented. In the framework of the pre-test, the use of technology, the handling of technology and the affinity of the sample towards technology are presented.

4.1 Pre-test Results

All study participants reported having access to a laptop/PC and a smartphone for private or professional purposes. Fourteen participants worked with tablets, while only three participants had used Microsoft HoloLens or Oculus Rift/HTC Vive. The ease of use of these technologies was assessed on a scale from one (“very difficult”) to six (“very easy”). The Microsoft HoloLens received the lowest mean value with \( \bar{x} = 4.50 \) (Min = 4, Max = 5, SD = .58, n = 3); the smartphone received the highest mean value with \( \bar{x} = 5.56 \) (Min = 1, Max = 6, SD = 1.3, n = 16).

An overview of the participants’ affinity for technology is given in Table 3. The locus of control while dealing with technology in the sample had a mean value of \( \bar{x} = 4.44 \) (Min = 3.13, Max = 5.38, SD = 0.74, n = 16).

Table 3. Descriptive statistics for technical affinity.

Hypothesis 1: There is a correlation between technology affinity and the assessment of the perceived system usability

A Spearman correlation was computed to determine the relationship between technical affinity values and the SUS index for AR/AV and VR. No significant correlation was found, neither for the AR/AV SUS index (\( r_{s} \) = −.12, p = .67, n = 16) nor for the VR SUS index (\( r_{s} \) = .25, p = .35, n = 16).
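As an illustration, Spearman's rank correlation can be computed as the Pearson correlation of average ranks. The sketch below uses only the Python standard library; the function names and the example data are hypothetical and are not the study data.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]


def pearson(x, y):
    """Pearson product-moment correlation of two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical affinity means and SUS scores of five participants
rho = spearman_rho([4.2, 3.8, 5.0, 4.6, 3.0], [62, 70, 58, 74, 66])
```

Working on ranks makes the coefficient robust to the ordinal, non-normal questionnaire data that motivates the non-parametric analysis in this study.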

4.2 Post-test Results

The results of the post-test include an overview of the descriptive statistics for the evaluation of the media and the respective learning process. In each case, the results of the hypothesis tests are presented afterwards.

Hypothesis 2a: The usability of the two media AR/AV and VR differs significantly

Descriptive statistics (mean value, standard deviation, minimum and maximum) of the System Usability Scale are presented in Table 4. The six-level scale was recoded to range from zero (“do not agree at all”) to five (“fully agree”). The calculation of the SUS score shows that the AR/AV system has an overall score of 63.63. The SUS score for the VR system is 72.75.

Table 4. Descriptive statistics for system usability. (The following items have been translated by the author into English for better comprehensibility.)

The Wilcoxon test for paired samples shows a significant difference, with a median of 35.5 for AR/AV and a median of 37.5 for VR (z = −2.047, p = 0.041, n = 16). Hypothesis 2a can therefore be confirmed.

Hypothesis 2b: The evaluation of ergonomics of the two media AR/AV and VR differs significantly

Descriptive statistics for the six-item ergonomics scale are shown in Table 5. According to the Wilcoxon test for the ergonomics of AR/AV (median = 23.5) and VR (median = 23.5), with z = −.596 and p = .551, the central tendencies of the two test conditions do not differ. Hypothesis 2b must therefore be rejected.

Table 5. Descriptive statistics for ergonomics.

Difficulties Using the Media.

In response to the open questions, the participants reported the following difficulties in the training process with the AR/AV medium: the glasses were very heavy and/or unsuitable for wearers of glasses, limited field of vision (5), lack of integration of the hands (2), not realistic, difficulties with orientation (Where does the next step take place?) (4), and difficulties with clicker operation.

When using VR, the participants reported the following problems: The field of vision flickers/is blurred (2), dizziness (2), spatial boundaries are missing, it is difficult to immerse oneself in the animation, lack of intuitiveness when operating the controllers (e.g. when grasping), sound feedback is sometimes confusing.

Helpful Functionalities.

The following features were noted as helpful for AR/AV: clear audio instructions through the steps as a supplement to the text, pleasant wearing comfort, the real environment is still visible (2), free movement in space is possible, cursor control via gaze, as well as clear color highlighting of the components to be used. The following features were found to be helpful when using VR glasses: controllers (possibility to grasp and move the components) (7), differentiation between right and left hand (2), color highlighting of the components to be used (2), a high level of realism by imitating the real environment, and high wearing comfort.

Hypothesis 3: The used media AR/AV and VR differ significantly in their evaluation of the training process

The evaluation of the NASA Task Load Index shows that the participants’ perception of their load during the task does not differ between the media. Figure 3 compares the mean values of the individual scales for the two media. The Wilcoxon test shows that Hypothesis 3 must be rejected (z = −0.910, p = 0.363).

Fig. 3. NASA task load index for AR/AV and VR (n = 16).

Preparation for the Task by the System.

When using the AR/AV medium, participants noted that they felt well prepared (6) because they knew the process. Some participants noted that they would repeat the exercise several times before the actual assembly (2). Others said that they felt less prepared because of the lack of orientation (3) and poor transferability to the real workplace. With the VR system, the majority of the participants felt well to very well prepared for the upcoming task (4). One participant noted that he felt as safe as if he had received personal instruction on the system. Two participants also indicated that they would repeat the exercise in advance. One participant stated that he felt relatively safe, but that he would not be able to answer adequately if he were asked questions.

Comparison of MR to Other Training Methods.

Some participants indicated that the AR/AV system is preferable to text-based process instructions (3). They especially emphasized that the method is suitable for an initial familiarization with the process. Personal instruction by another employee is still preferred by two participants. Due to the lack of haptics, the system is merely a supplement to, but not a replacement for, instructions on the physical machine. Participants positively noted that each employee can learn at his/her own pace, that the training is carried out in a very standardized manner and that no time or money is lost due to cycle time overruns. One participant stated that he prefers a mixed form of both systems.

The VR system is described by the participants as very realistic. The participants state that it is particularly suitable for obtaining an overview and orientation of the workplace. It is preferred over text forms, but the advantage of VR over simple videos is questioned. However, when problems arise, the participants see difficulties.

Choice of Medium.

The subjects rated the VR system significantly better and would choose to use it (14 for VR, 2 not quite clear, 0 for AR). The mentioned advantages of VR in this question refer to the fact that orientation is easier, the use of the hands helps to memorize procedures, the field of vision is larger and there is a higher wearing comfort. The test persons describe the experience as closer to reality.

A correlation matrix was created for exploratory purposes in order to identify the extent to which the decision for VR correlates with the surveyed parameters system usability, ergonomics, and task load. Table 6 shows that there are significant point-biserial correlations between the decision for VR and system usability (\( r_{pb} \) = .63, n = 16, p < 0.01) as well as between the decision for VR and the evaluation of ergonomics (\( r_{pb} \) = .65, n = 16, p < 0.01). There seems to be no significant correlation between the task load and the decision for VR.
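A point-biserial correlation relates a dichotomous variable to a metric one and is numerically identical to the Pearson correlation with the dichotomous variable coded 0/1. The sketch below assumes that the decision is coded 1 for "chose VR" and 0 otherwise; how the two undecided participants were coded is not stated in the text, and the example data are hypothetical.

```python
import math
from statistics import mean, pstdev


def point_biserial(binary, scores):
    """Point-biserial correlation between a 0/1 variable and scores."""
    group1 = [s for b, s in zip(binary, scores) if b == 1]
    group0 = [s for b, s in zip(binary, scores) if b == 0]
    if not group1 or not group0:
        raise ValueError("both groups must contain at least one value")
    n = len(scores)
    spread = pstdev(scores)  # population standard deviation of all scores
    if spread == 0:
        raise ValueError("correlation is undefined for constant scores")
    weight = math.sqrt(len(group1) * len(group0) / n ** 2)
    return (mean(group1) - mean(group0)) / spread * weight


# Hypothetical coding: 1 = chose VR, 0 = undecided, with usability sums
r_pb = point_biserial([1, 1, 0, 1, 0, 1], [38, 41, 30, 36, 33, 40])
```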

Table 6. Correlation matrix

5 Conclusion

5.1 Limitations

Before discussing the results of the present study, the limitations of the study are presented briefly. An essential aspect is the small size and limited representativeness of the sample. Especially in studies that take place not in the laboratory but directly in industry, it is difficult to recruit an adequate number of participants. The problem of the low validity of data from such a small number of participants was countered by a conservative approach to statistical analysis and by additionally recorded qualitative statements. Nevertheless, any results based on such a small amount of data should be interpreted with caution. Furthermore, the composition of the sample does not fully reflect the characteristics of typical temporary workers. For this reason, we made sure that participants’ previous knowledge of the task was as limited as possible in order to derive statements about inexperienced workers. Another problem is the use of a non-validated scale for the factor ergonomics. This scale should be revised and validated in larger surveys. Furthermore, system usability was measured in this study with a six-level item scale and not, as in the original, with a five-level scale. This could also have an influence on the reliability of the scale.

5.2 Discussion of Results

The results on the tested hypotheses are summarized in Table 7 and are discussed below, taking into account the qualitative results.

Table 7. Decision on hypotheses

The pre-test shows that the participants in the study had little or no overall experience with MR. Even though the sample has a relatively high affinity for technology and indicates that they feel confident in using technology, there seems to be no correlation with the subsequent evaluation of the systems’ usability. The enthusiasm for technology seems to play a subordinate role in the training procedure for the presented use case. This leads to the assumption that people with less technical affinity could also get along well with the system - and vice versa.

The evaluation of the media was divided into the evaluation of system usability and ergonomics. Regarding system usability, both media received a rather low score, which the manual describes as only marginally acceptable. However, this could be due to the fact that the participants in the study were asked to closely examine both systems. The assumption that the system usability scores differ can be confirmed: the data show that the VR system is rated significantly better.

Statistically speaking, there are no differences between the systems on the ergonomics scale. This result is surprising, as the written comments mainly mention ergonomics-related difficulties (e.g. the restricted field of vision). This indicates that the items of the scale should be revised. The written comments on the media show that all participants prefer clear guidance through the process steps (e.g. by highlighting the objects in color or using cue arrows). When carrying out the assembly steps independently, participants found the use of controllers in the VR system helpful. The high degree of realism provided by the 360° photo of the assembly hall was perceived as pleasant. However, some participants stated that they found the simultaneous perception of reality in AR/AV more pleasant and safer.

With regard to the evaluation of the training process through the NASA-TLX, the descriptive statistics show only a small difference in the area of physical demand, which can be explained by the increased movement and interaction in VR during assembly. Overall, the load during the assembly does not seem to differ between the media. At this point it should be questioned whether the scale used is sufficiently valid for the criteria to be measured.

The written comments make evident that the participants feel well prepared for the real assembly by both media. Most of them note that they want to repeat the exercise before the assembly is carried out on the real system. Compared to other (e.g. text-based) instructions, the participants regard the MR procedures as superior. However, a few participants (two) would prefer to be trained by another person or to have MR combined with training in a real setup. Accordingly, it should be ensured that the test persons are provided with a supervisor who can answer questions if needed. The high degree of standardization and the possibility of an individual learning speed were perceived positively.

The final evaluation of the media clearly shows that the participants prefer the VR system. The main reasons are the good orientation in space, the natural field of view and the use of the controllers to interact with the work pieces. The explorative correlation matrix shows that the decision for VR correlates with the system usability rating of VR and with its ergonomics rating. Although this does not allow any conclusions to be drawn about a cause-and-effect relationship, it nevertheless confirms the importance of placing a strong focus on usability in the development of such systems and of involving users early in the development process.
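How a binary preference can be related to a usability score may be sketched as follows. The coefficient used for the explorative correlation matrix is not specified here. The sketch therefore assumes a point-biserial correlation, which is simply Pearson's r computed with the preference coded as 0 or 1. The function name `pearson_r` and the numbers in the usage example are hypothetical and are not study data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation of two equally long numeric sequences.

    If one sequence is coded 0/1 (e.g. 1 = prefers VR), the result is
    the point-biserial correlation.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical illustration only: preference (1 = VR) and a usability rating.
prefers_vr = [0, 0, 1, 1]
usability = [1, 2, 3, 4]
print(round(pearson_r(prefers_vr, usability), 3))
```

With a sample as small as the one described, such a coefficient is purely descriptive and says nothing about the direction of the effect.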

5.3 Conclusion and Outlook

The findings allow conclusions to be drawn about the specification and importance of different MR technology features. The underlying features seem to be more important than a clear separation of the devices. Thus, for a task that requires the user to interact with a real or virtual object, the system should offer appropriate possibilities for interaction. This can be realized in AR/AV technologies as well as in VR. The following table (Table 8) summarizes the collected findings in preliminary guidelines for classifications, which should be reviewed and extended in further studies.

Table 8. Guideline for developing MR training systems

The listed points result mainly from the qualitative results of the participants’ written comments. Thus, it is evident that the collection of qualitative data is essential when evaluating new systems. In order to integrate the empirical results of the quantitative survey, a further study should be conducted with a larger sample (if necessary also in a laboratory setting) and compared with the available findings. Furthermore, potentially relevant user factors (such as openness to new experiences or the current mood) should be surveyed and examined for correlations with the evaluation of the systems.