1 Introduction

The high image quality of mobile phone cameras and the boom in mobile live broadcasting and short video have raised people’s expectations for the quality of mobile video shooting, especially in terms of stability and the range of shooting modes.

Everyday products that are not usable and do not meet users’ requirements cause frustration, and users usually no longer want to use them [1].

According to ISO 9241-11, usability refers to the effectiveness, efficiency and subjective satisfaction with which specific users can use a product in a specific environment [2]. Before embarking on the development of a new product, it may be beneficial to evaluate existing products in order to understand the usability of what is already available. This can form a baseline against which the usability of the new product can be judged [2]. Therefore, we chose two mainstream mobile phone gimbals on the market as research samples.

The purpose of this paper is to compare two mobile phone gimbals from DJI and MOZA in order to investigate usability issues, assess which product has better usability, and provide iterative recommendations for mobile phone gimbals. Because our usability scenario is a comparison of two products [3], we chose task success, efficiency (time and steps) and self-reported satisfaction as metrics. Being able to complete a task correctly is essential for most products. Efficiency gives a good sense of how much effort is required to use the product. Self-reported satisfaction metrics capture the overall user experience with the product, which is especially useful when comparing multiple products.

A mobile phone gimbal comprises a handle body with a stabilizer, and a fixing device for the mobile phone on the stabilizer (see Fig. 1). The stabilizer includes three motors, arranged at the top of the handheld part and mutually orthogonal in space: an X-axis motor, a Y-axis motor and a Z-axis motor [4]. Together they keep the mobile phone camera stable in all directions.

Fig. 1. The structure of a mobile phone gimbal (e.g. DJI OSMO MOBILE 2)

2 Materials and Methods

2.1 Materials

Two mobile phone gimbals from the Chinese brands DJI and MOZA were chosen (see Fig. 2). In appearance, they are very similar in size, shape and material. Relevant product data can be seen in Table 1. The significant difference between them is the button type on the handheld body. During the test, all subjects performed several tasks with both devices.

Fig. 2. DJI OSMO MOBILE 2 (left) and MOZA Mini MI (right)

Table 1. Product data

2.2 Subjects

Seven subjects were recruited via an online questionnaire. All subjects were between 20 and 24 years old and had basic video knowledge, such as shutter, focus and slow motion. They were divided into three groups (3 novice users, 2 intermediate users and 2 professional users) on the basis of their familiarity and proficiency with mobile phone gimbals. We defined the three groups as users who had never used a mobile phone gimbal, had used one several times, and had used one more than 10 times, respectively (see Table 2). There was no prior training, and all subjects were only shown the product introduction (basic functions and modes) and the notes on the products’ homepages.

Table 2. User grouping

2.3 Task Design

Each session used the same evaluation protocol. First, subjects received a short questionnaire (gender, age and field). Then, a concurrent think-aloud protocol was administered while they completed five tasks with each of the two products. One researcher observed and recorded their emotions as they spoke. At the same time, usability performance metrics (task success, time, satisfaction, steps and errors) were also assessed.

Subjects had five minutes to complete each task. If they could not complete a task within the time limit or did not want to proceed, they continued to the next task. The first task was to install the mobile phone on the gimbal and balance it (task 1). The second was to connect via Bluetooth and open the app for each product (task 2). The third was to take a 2–5 s video panning from left to right while zooming in to the maximum (task 3). They were then asked to shoot a 2–5 s video in slow-motion mode (task 4). The final task was to switch to self-timer mode in vertical (portrait) orientation and take a photo (task 5). After each task, the subjects were given a short interview to measure task satisfaction. After carrying out all tasks, they filled out a questionnaire of 16 questions.

Each usability test lasted around 60 min. The tests were conducted in a cafe, each in a closed room to minimize distraction. Audio and screen-capture recordings were made during the tests.

2.4 Task Analysis

To collect and measure metrics such as efficiency more conveniently, we identified the actions of each task that we wanted to capture. According to the common functions of smartphone gimbals, the tasks cover five aspects: mounting and balancing, connecting, parameter adjustment, mode switching and working mode switching (landscape/portrait). The optimal steps for each task were defined according to the actual operations; they are the same for both products in tasks 1 and 2. In tasks 3–5, where the app menus are involved, the operations of MOZA differ from those of OSMO, and the MOZA operations are indicated by dashed lines. The task descriptions were displayed to users at the start of each task.

3 Results

3.1 Effectiveness

Effectiveness refers to the degree of accuracy and completeness [1] with which users complete specific tasks and achieve specific goals. Task success [3] is the most widely used metric of effectiveness. Before starting each task, all subjects were shown a clear text description of its end state, and they were told to report orally when they thought they had finished the task or wanted to give it up. We divided task success into three levels [3] (see Fig. 3).

Fig. 3. Number of users with different levels of task success (No problem = Complete the task accurately without help; Have problems = Complete the task with some problems; Fail = Task failed or abandoned)
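The three-level coding in Fig. 3 can be tallied per task with a few lines of code. The following is a minimal sketch, not the procedure used in the study: the function name `tally_success`, the participant IDs and the example outcomes are all hypothetical and serve only to illustrate the counting.

```python
from collections import Counter

# The three success levels defined in the caption of Fig. 3.
LEVELS = ("no problem", "have problems", "fail")


def tally_success(outcomes):
    """Count how many users reached each success level in one task.

    `outcomes` maps a participant ID to one of LEVELS.
    """
    unknown = set(outcomes.values()) - set(LEVELS)
    if unknown:
        raise ValueError(f"unknown success level(s): {sorted(unknown)}")
    counts = Counter(outcomes.values())
    # Report every level, including levels that nobody reached.
    return {level: counts.get(level, 0) for level in LEVELS}


# Hypothetical outcomes for one task; these are NOT the study's data.
example_task = {
    "U1": "fail",
    "U2": "have problems",
    "U3": "no problem",
    "U4": "no problem",
}
example_counts = tally_success(example_task)
print(example_counts)
```

Running the tally once per task and per product would give the kind of stacked counts that a chart such as Fig. 3 displays.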

In general, the level of task success varied with the type of user and the task. With OSMO, all three novice users failed task 1 and only one expert user (P7) completed it accurately; the other users had problems balancing the gimbal and repeatedly rotated the two knobs on the mobile phone holder and shaft. Similarly, two novice users (P1 and P3) failed task 5, which also involved mounting and balancing. In task 4, two novice users (P1 and P2) had some problems switching to slow-motion mode. However, no clear differences between types of users were seen in tasks 2 and 3, which involved connecting and adjusting parameters.

For MOZA, the degree of task completion was generally better than for OSMO, except in task 4, where one novice user (P1) ran out of time before finding the slow-motion mode and five users completed the task with some other problems.

3.2 Efficiency

Efficiency refers to the amount of effort required to complete the task [1]. Task time [3] is an excellent way to measure efficiency in a usability test: the time it takes a subject to perform a task says a lot about the usability of the product. Another way is to measure the number of actions or steps that subjects take in performing each task. Because subjects’ operational and cognitive errors affect task performance, the number of erroneous steps made during the interaction is also very revealing.

Task Time.

Tasks 1 and 5, which involve mounting and balancing, took longer than the other tasks (see Fig. 4), and novices needed two to three times as long as experts (see Table 3). Subjects took longer to complete these tasks with OSMO (more than two and three minutes, respectively) than with MOZA, which offers a better balancing experience. The larger size of MOZA gives a wider tolerance range for balancing and therefore saves time.

Fig. 4. Average time for each task (in seconds)

Table 3. Task time (in seconds); “/” means the user failed the task because of a timeout (more than 300 s), in which case 300 s is used to calculate the mean
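The caption of Table 3 indicates that a timed-out attempt enters the mean as 300 s. A minimal sketch of that calculation follows, assuming this reading of the caption is correct; the name `mean_task_time`, the use of `None` for a “/” entry and the example times are hypothetical.

```python
TIME_LIMIT_S = 300  # five-minute limit per task, as stated in Sect. 2.3


def mean_task_time(times, limit=TIME_LIMIT_S):
    """Mean task time in seconds.

    `times` holds one entry per user; None marks a failed attempt
    (the "/" entries of Table 3) and is counted as `limit` seconds.
    """
    if not times:
        raise ValueError("no observations")
    filled = [limit if t is None else t for t in times]
    return sum(filled) / len(filled)


# Hypothetical times for one task; these are NOT the study's data.
example_times = [120, None, 90, 210]
example_mean = mean_task_time(example_times)
print(example_mean)  # (120 + 300 + 90 + 210) / 4 = 180.0
```

Substituting the time limit keeps failed attempts in the average instead of dropping them, so a product on which more users time out is not made to look faster. The substituted value is a lower bound on the real time, so means that include it are conservative.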

There was no clear difference between types of users in task 2, while a marked difference can be seen in tasks 3 and 4. Two novice users (P1 and P2) spent more than two minutes completing task 4 with OSMO, whereas the intermediate and expert users took less than half a minute. The two products also differ in how users zoom in and out: with OSMO, users push up the zoom slider, and with MOZA, they rotate the dial wheel. The latter gives no obvious sign that the dial wheel is used for zooming, and its finer scale and slower zoom speed lower efficiency.

Comparing the two products, completing the operations that involve buttons and the app interface (tasks 3 and 4) is less efficient with MOZA than with OSMO. Novice users in particular spent a lot of time searching for where the mode is displayed. However, for the Bluetooth connection (task 2), OSMO gives more confusing feedback, so it takes more time than MOZA.

Task Steps and Error.

The optimal steps (see Fig. 5) show that, when choosing a special shooting mode such as slow-motion or selfie mode, MOZA requires two more steps than OSMO because of its functional structure, which affects the user’s efficiency. Users tended to make errors in tasks 4 and 5 with MOZA: all six subjects other than P7 tried every icon on the first menu level in search of the slow-motion mode (see Table 3), as they assumed that basic mode switching would be on the first level of the functional structure, as it is on OSMO. MOZA allows switching modes with the button on the handheld body, but one user (P5) mistook the record button, which is set in the middle of the dial wheel, for the menu button.

Fig. 5. Average task steps
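One simple way to relate the steps a user actually took to the optimal steps is to count the extra steps and to express the optimal path as a fraction of the observed one. The sketch below is an illustration under assumptions, not a formula taken from the study: the function names and the example numbers are hypothetical.

```python
def extra_steps(observed, optimal):
    """Steps taken beyond the optimal path (0 if the user matched it)."""
    if optimal <= 0 or observed < 0:
        raise ValueError("optimal must be positive and observed non-negative")
    return max(observed - optimal, 0)


def step_efficiency(observed, optimal):
    """Share of the observed steps that the optimal path accounts for.

    1.0 means the user needed no more than the optimal number of steps;
    smaller values mean more detours.
    """
    if optimal <= 0 or observed < 0:
        raise ValueError("optimal must be positive and observed non-negative")
    return optimal / max(observed, optimal)


# Hypothetical example: a task with 6 optimal steps completed in 9 steps.
example_extra = extra_steps(observed=9, optimal=6)
example_ratio = step_efficiency(observed=9, optimal=6)
print(example_extra, round(example_ratio, 2))
```

Extra steps are easier to read for a single task, while the ratio allows tasks with different optimal step counts to be compared on one scale.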

Subjects were also prone to errors when mounting and balancing the phone, especially novice and intermediate users, because they were not familiar with the balancing principles and structure of the gimbal. Five subjects thought they could adjust the balance by rotating the knobs (which are actually used to unlock the arm shaft and the mobile phone holder), so they spent more time trying to rotate each knob.

Another, more serious error occurred in task 5 because of the smaller size of OSMO. There was no room for the mounted phone to rotate, so subjects had to remove the phone before rotating the holder to the selfie mode. Four subjects expressed puzzlement, and they did not want to re-mount and balance the phone again when they found it impossible to rotate the holder with the phone attached.

3.3 Satisfaction

Satisfaction refers to the degree of subjective satisfaction and acceptance that subjects feel in the process of using products [1]. After carrying out all tasks, the users were asked to fill out the questionnaire, which covers usefulness, learnability, satisfaction and ease of use [5] (see Table 4), and were given a short interview about their scores.

Table 4. Satisfaction questionnaire (7-point semantic differential scale, −3 = Totally disagree, 3 = Totally agree). Q1–Q3, Q4–Q7, Q8–Q10 and Q11–Q13 relate to usefulness, learnability, satisfaction and fault tolerance, respectively.

The average usefulness score is 1.89 for OSMO and 2.16 for MOZA. The difference shows mainly in Q2, as the novice and intermediate users thought they spent too much time on mounting and balancing.
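Category averages such as these can be obtained by grouping the questions as listed in the caption of Table 4 and averaging all answers in each group. The following is a minimal sketch under that assumption; the function name `category_means` and the two example response sets are hypothetical and are not the study's data.

```python
from statistics import mean

# Question grouping taken from the caption of Table 4.
CATEGORIES = {
    "usefulness": range(1, 4),        # Q1-Q3
    "learnability": range(4, 8),      # Q4-Q7
    "satisfaction": range(8, 11),     # Q8-Q10
    "fault tolerance": range(11, 14),  # Q11-Q13
}


def category_means(responses):
    """Average score per category over all participants.

    `responses` is a list with one dict per participant, mapping the
    question number to a score between -3 and 3.
    """
    result = {}
    for name, questions in CATEGORIES.items():
        scores = [r[q] for r in responses for q in questions]
        if any(not -3 <= s <= 3 for s in scores):
            raise ValueError(f"score out of range in category {name!r}")
        result[name] = mean(scores)
    return result


# Two hypothetical participants; these are NOT the study's data.
user_a = {q: 2 for q in range(1, 14)}
user_b = {q: 1 for q in range(1, 14)}
user_b[11] = -1  # one low fault-tolerance answer
example_means = category_means([user_a, user_b])
print(example_means)
```

Computing the same averages separately for each product and each user group would support the comparisons made in this section.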

Experienced users operate a mobile phone gimbal with ease, but when they use an unfamiliar brand, their existing experience is likely to get in the way. For example, an intermediate user (P4) who had only used OSMO several times felt confused and needed time to get used to choosing modes with the dial wheel. The expert user (P6), who was used to DSLR (Digital Single-Lens Reflex) cameras, suggested some possible improvements to the product and commented, “I think the wheel dial is more accurate (than the zoom slider).” In contrast, a few users (P4 and P5) thought, “The zoom slider is better, as the zoom speed of dial wheel is too slow.”

The average satisfaction of novice users is lower than that of the other types of users. Compared with the others, novice users clearly showed confusion and uncertainty about the next step, saying for example “Is this balance?”, “Should I press this now?” (P1) and “Now…is it shooting?”

The fault-tolerance scores are as low as 0.77 and 1.48, indicating that with both gimbals users cannot easily detect and correct their mistakes. The user behavior analysis (including semantic analysis and operational process analysis) further suggests that the poor fault-tolerance scores stem more from users not knowing how to correct their errors than from failing to notice them.

4 Discussion

In general, tasks 1 and 5, which involve mounting and balancing the mobile phone, took longer and resulted in more errors on both devices. The cause is the low compatibility and consistency [3] of the design with users’ expectations, which are based on their knowledge and on other similar products. The issue lies in the two rotating knobs, which unlock the adjustable arm so that it can move horizontally and the mobile phone holder so that it can rotate, respectively (see Fig. 6). In our tests, five users were confused about the two knobs and their purposes: they assumed that turning the knobs adjusts the balance, because there is no obvious sign on the arm that it can be retracted. Therefore, as an improvement for OSMO MOBILE 2, we keep only one knob behind the mobile phone holder, which can be loosened to move the holder horizontally and rotate it to the self-timer mode (see Fig. 7).

Fig. 6. OSMO MOBILE 2

Fig. 7. Improvement for OSMO MOBILE 2

We finished our test and developed our improvement for OSMO MOBILE 2 on June 10, 2019. It is worth noting that DJI Technology Co., Ltd. launched its new-generation mobile phone gimbal, OSMO MOBILE 3, on August 13, 2019, which removed the two knobs and the adjustable arm. It is balanced by adjusting the position of the phone in the holder, and the phone is kept fixed by pressure and a gear clasp [6], which is consistent with the structure of a selfie stick. This similar approach of keeping only the structure close to the phone adjustable supports our finding that the previous adjustable-arm structure added to users’ cognitive complexity.

Additionally, we found in our test that users need both hands to operate the mobile phone holder and the adjustment knob when installing the phone on the gimbal, so a steady fulcrum is necessary. Either a table or the user’s body can serve as the fulcrum, depending on the usage scenario. We therefore recommend adding a layer of anti-slip silicone material to the bottom of the handheld body.

The main features of a mobile phone gimbal are shooting modes such as slow motion, time-lapse and panorama, so we covered them in task 4. The two apps for OSMO and MOZA differ in interaction mode and functional framework. With OSMO, the mode can only be switched by tapping the phone screen, where the mode switch sits on the first menu level (see Fig. 8). Because this is similar to the camera interface on a mobile phone, all users except P1 could switch the shooting mode quickly and clearly within two steps. In contrast, MOZA can be operated both by tapping the phone screen and with the buttons on the gimbal: press UP on the dial wheel to enter the primary settings menu, choose “camera mode” to reach the secondary menu (see Fig. 9), and then rotate the wheel to find slow motion on the third menu level alongside “photo” and “video” (see Fig. 10). Because MOZA puts the mode switch on the third menu level under “setting”, which users associate with parameter settings, and on the same level as “photo”, users overlook it when recording video. In fact, very few users tapped into the secondary menu in our test. Remarkably, three intermediate and expert users (P4, P5 and P6) who were used to DSLR (Digital Single-Lens Reflex) cameras indicated a preference for the buttons and dial wheel of MOZA, which can also be used to zoom in and out. Compared with the zoom slider of OSMO, the dial wheel is more accurate because of its larger operating range and scale. They also believed that operating the handle could prevent the motor damage caused by the pressure of tapping on the screen. However, novices still faced a steep learning curve in getting used to the complicated menu of MOZA. Future research could take account of the ease of learning.

In response to this problem, we add a dial wheel around the original M button on OSMO, allowing users to press the UP button on the dial wheel to enter the first menu and select by rotating the wheel (see Fig. 11). This is easy to use with just one hand and provides more versatility and freedom of movement.

Fig. 8. Slow-motion on DJI GO app

Fig. 9. Slow-motion on MOZA Genie

Fig. 10. Functional framework of MOZA Genie app

Fig. 11. Improvement for buttons

In addition to the basic functions involved in our tasks, users can quickly switch advanced functions with the M button on OSMO. These functions were not considered in our test because of their complexity and low frequency of use. In DJI’s newly released OSMO MOBILE 3, the implementation of these features has shifted to a trigger button behind the handle body [6].

5 Conclusion

This paper presented the results of a usability study of two mobile phone gimbals from DJI and MOZA, with the aim of providing iterative recommendations for mobile phone gimbals. To measure effectiveness, efficiency and subjective satisfaction, we chose four metrics: task success, time-on-task, errors and self-report. We recruited seven users to complete five tasks with the two gimbals following a think-aloud protocol, and gave them a short interview.

The results revealed that installing and balancing is the main usability issue for novice users with both products. OSMO has simple buttons, which are easier for novices, but users have to tap the phone screen to complete some tasks. In contrast, MOZA lets users operate it with a single hand at all times thanks to the additional buttons on the product, but it is too complicated for novices to learn. Therefore, we iterated on the buttons and interaction mode of the gimbal products.

We designed a new gimbal based on OSMO MOBILE 2. In terms of the controls, we keep only one knob, which can be loosened to move the mobile phone holder horizontally and rotate it to the self-timer mode. When users adjust the mobile phone on the gimbal, they need a steady fulcrum to support single-hand operation, so we add a layer of anti-slip silicone material to the bottom of the handheld body. Additionally, a wheel dial around the M button and a trigger on the back of the product are added to provide a more efficient and engaging interaction between the device and the application.

As technology improves, short video has become increasingly accessible, and more people take part in creating video. To improve shooting quality, many consumer products like OSMO MOBILE 2 are entering everyday life. As an innovative product in an emerging market, the mobile phone gimbal still needs continual usability testing to find problems and provide a better experience, which is also the purpose of our research. Both the ease of operating the device and the efficiency of its interaction with the application deserve attention.