1 Introduction

Due to size, weight, portability, connectivity, built-in sensors, and diversity of applications, mobile devices are now ubiquitous. In addition to tablets, smartphones in particular have achieved a high level of market penetration, and sales continue to grow [1]. However, the broad ubiquity of mobile devices is not limited to the consumer market. Businesses have also recognized the potential of mobile devices for supporting the engagement of business customers, suppliers and employees [2]. For employees, mobile devices provide the opportunity for new mobile work styles that shift work to the most convenient location and time [3]. By supporting these state-of-the-art styles, organizations can increase flexibility, optimize operational performance, enhance employee productivity, improve customer service, and attract and retain employees [3,4,5,6,7]. Benefits for mobile workers include better utilization of downtime, reduced travel time, access to data while on the move, enhanced situational and activity awareness, improved knowledge sharing tools, and support for multiple communication channels [3, 8,9,10].

As a result of these benefits, the percentage of “anytime, anywhere information workers” in the USA and Europe has risen and continues to do so. Overall, about 48% of employees use smartphones on a regular basis in a work context [11]. In many cases companies do not provide smartphones and therefore allow employees to use their personal devices (PD) due to the expected mutually beneficial effects. This concept of using a private device at work to undertake business tasks is called bring your own device (BYOD) [12].

BYOD was initiated by employees. The “bring your own” phenomenon has risen significantly among employees with direct client-contact [13]. Since users are typically very familiar with their PD, 61% of smartphone users want to use just one device for both work and personal activities [11]. Gartner estimated in 2013 that more than one third of employers worldwide no longer provide employees with mobile devices and that by 2018, 70% of mobile users will manage significant parts of their workload on personal smart devices [14]. In fact, 59% of organizations allowed employees to use their own devices for work purposes in 2017. Another 13% had planned to allow use within a year [15].

In this situation, organizations have to consider that when restrictions are imposed on personal activities, employees may try to avoid such employers [16], or the restrictions may lead to work-life blurring [17, 18]. Removing such constraints by providing BYOD services therefore generally improves employee satisfaction; moreover, it can serve as a cost reduction and cost avoidance mechanism for companies [14]. However, in bypassing the IT department, employees seize the power to decide which IT tool fits their job needs best. Junglas [19] conducted a study that defined and explicated the concept of IT empowerment, i.e., the level of authority an employee assumes in utilizing IT to control or improve aspects of his/her job, and tested this concept in the context of IT consumerization. Overall, the results demonstrated that employees can no longer be viewed only as passive consumers of technology, as they now voluntarily accept responsibility for deciding which tech tool best fits their job needs and thus shift some of the fundamental tenets of IT governance.

Consequently, organizations have to scrutinize BYOD concepts and manage the associated risks in order to enable this business transformation by protecting and securing networks and data regardless of how workers access them [20]. To manage these risks, possible security issues and challenges have been identified [21, 22] and basic mitigation strategies have been assigned [23,24,25,26]. Additionally, technical security frameworks have been developed [27], and wider organizational security frameworks (including people, policy management, and technology) have been established [28]. Accordingly, successful BYOD deployment requires a comprehensive approach [29].

When facing these challenges, many organizations reject BYOD because they do not want to open up internal networks or provide access to information systems using unknown devices due to security and privacy concerns [30, 31]. Instead, these organizations provide employees with a specific mobile device integrated into the organization’s IT infrastructure (here is your own device - HYOD) or even let them choose from a set of devices (choose your own device - CYOD) [24].

These business devices can differ noticeably from the employee’s PD, for example in the operating system, software, screen size, or other hardware specifications such as the processor or memory. Currently, four main mobile operating systems exist on the market, while two in particular were dominant during the period of this study (2015/2016: Android 82.8%, iOS 13.9%, Windows Phone 2.6%, BlackBerry OS 0.3%) (IDC, 2017). While iOS and Android dominate the consumer market, many organizations provided their employees with devices running the Windows Phone mobile operating system since they wanted to keep the number of supported operating systems low and avoid BYOD due to security concerns, legal issues, technical complexity, administrative overheads, and high costs [7, 32]. In recent years, the overall share of new Windows Phone devices on the market has decreased, and therefore most users are unfamiliar with the platform. For this reason, we chose Windows Phone devices as company provided devices (CPD) for this study.

Consequently, if employees are required to use a certain device for business activities, they may be confronted with an operating system and applications with which they are not familiar. Since the main benefits of BYOD are greater employee productivity and satisfaction, revoking BYOD may lead to lower results in these categories (PAC, 2013). Therefore, we conducted a usability study that assessed these constructs after users performed typical business tasks on their PD and on a device they were unfamiliar with (CPD - company provided device).

The remainder of this paper is structured as follows: in Sect. 2 we present related work; the methodology is described in Sect. 3; and Sect. 4 presents all results based on the three metrics (effectiveness, efficiency, and satisfaction) and discusses the limitations of this study. Finally, we provide a discussion of the results and conclusions in Sect. 5.

2 Related Work

To increase work progress, users need to learn how to access systems and their data and how to leverage the information they provide to perform their daily tasks as effectively as possible [33]. To assess the work progress, multiple qualitative and quantitative methods like interviews, surveys, or tests can be used. When asked, employees indicated they would perform better with their PD. De Kok et al. found that in companies that implement a CYOD program, and despite the fact that 70% of employees stated they can perform their work well on the device provided, a majority (52%) believed their performance would improve further if they could choose a device themselves [34]. Similarly, a study by Harris revealed employees think that using their PD would enable them to complete more tasks on time (49%), be more innovative (50%), and be happier (53%) [25]. Additionally, in a meta-analysis, Niehaves et al. [35] and Pillet et al. [36] established that, of all positive aspects of IT consumerization, employee satisfaction was mentioned the most.

Based on such surveys, various theoretical models and frameworks have been developed that suggest correlations between BYOD and employee performance and satisfaction. Niehaves et al. [37] expect the use of consumer IT devices for business purposes to contribute to work performance of employees via more highly perceived autonomy and competence. Additionally, Giddens and Tripp [63] theorize that the increased perceived competence is influenced by device self-efficacy and device innovativeness, and that BYOD may increase job satisfaction due to higher job autonomy and job performance. Köffer et al. [38] suggest that part of the increase in job performance is due to increased work satisfaction when using PDs. Additionally, Ostermann et al. detail decision factors such as the inconvenience of handling two devices, work life conflict concerns, perceived privacy risk and perceived financial risk in a multi-item scale to measure the influence of business use and private use of company owned devices in comparison to private devices (BYOD) [39].

Consequently, several contributions suggest that BYOD is generally associated with higher productivity and higher job satisfaction. However, these contributions are based on self-reported data in surveys (e.g. [7, 25, 32, 34]). In a literature review, no study could be identified that uses an experimental usability evaluation setting, measuring performance by efficiency, effectiveness, and satisfaction, as a methodological approach to investigate these effects; yet, there are usability studies that investigate related problems.

In usability testing of mobile devices, perceived usability can be influenced by the hardware, the operating system, and the application used. Therefore, evaluating the usability of devices may include one or more of these layers. Some studies investigate hardware issues such as the battery life [40] or the impact of mobile phone screen size on user comprehension [41, 42]. However, general conclusions for the leading platforms (iOS, Android and Windows Phone) are not possible due to the variety of different devices with varying hardware configurations.

Several empirical studies compare the usability of operating systems by focusing on specific features such as virtual keyboards of different mobile platforms [43], tactile feedback from touch screens while entering text [44], text input methods on smartphones (ITU-T, Swype, Swiftkey, Thick Buttons, Keypurr) [45], or the position of virtual keyboards [46].

Moreover, some studies have compared the usability of different smartphones like Nokia (Symbian), HTC (Windows Phone), and Palm Treo (Palm OS). Results showed significant differences in the usability for some functions (e.g. searching for information on a website using an internet browser, making a call); however, these results are based on outdated hardware and software [47].

A more recent study compared the usability of the different device types, e.g. tablets and smartphones (iPad 2, iPhone 4S, Huawei Impulse 4G, HTC status), for accessing health-related information via different applications (web browser, mobile apps). It found significantly different results in task completion time for devices and applications [48]. One important difference, though, is that this study used only one metric and did not consider possible learning effects based on previous experience with devices or operating systems.

A study in 2014 identified usability issues by applying a qualitative approach, using a task-based test setting, observation, and thinking aloud to compare three smartphones (Apple iPhone 4S - iOS 5, Samsung Galaxy Nexus - Android 4.0, Nokia Lumia 800 - Windows Phone 7.5). It concluded that for efficient and effective device operation, the user experience (UX) must consider three different layers: hardware, operating system, and application [49].

Additionally, coherence between OS characteristics and the UX assessments of devices has been investigated. It was found that certain characteristics of operating systems (Windows Phone 8, iOS 6, Android 4.2) lead to satisfactory or unsatisfactory UX assessments. For example, iOS 6 and Android 4.2 provide a satisfactory support architecture, whereas Windows Phone 8 devices were deemed to be difficult to use, saddled with inadequate graphical user interface (GUI) support, and complicated to learn [50].

Galetta [51] illustrates a methodology for a task set to evaluate smartphone platforms with regard to ease of learning and ease of use. Results provide preliminary evidence of significant differences concerning these two constructs among the leading smartphone platforms (Android, iOS, Windows and BlackBerry), especially between novice and expert users. Furthermore, the adaptation to new interaction styles has been investigated. It was found that users do not have significant difficulties when transferring to an unfamiliar mobile phone model [52].

Thus, to the best of our knowledge, to date there is no study that investigates users’ adaptation process to an unfamiliar smartphone and assesses it through usability tests that measure the performance and satisfaction of users. Therefore, we conducted a usability evaluation comparing how iOS and Android users rate their PD (used for BYOD) in comparison to a Windows Phone device (CPD) when performing typical business tasks.

3 Methodology

The study was based on a within-subjects design. At multiple measurement points a triangulation of complementary usability methods was used. Every measurement point included a usability test, a standardized usability questionnaire, and a qualitative survey. The usability tests assessed the usability of the devices using the ISO 9241-11 definition. It defines usability as “the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use” [53]. Effectiveness and efficiency were measured by a usability test and satisfaction was determined by the standardized questionnaire PSSUQ (Post-Study System Usability Questionnaire) [54]. Additionally, qualitative questions were asked to gather information on potential reasons for the evaluation results. The study comprised twenty participants and included three iterated evaluation points to measure effects on the performance and to assess the potential change in satisfaction over time. Further details on tasks, metrics, participants, procedure, and used devices are presented in the following sections.

3.1 Tasks

We wanted to test tasks that cover typical business activities. Because we tested smartphones, we selected the four business-related tasks most executed on smartphones based on the Forrester research [29]. The results of this study indicate that (i) reading or viewing documents, spreadsheets, or presentations; (ii) accessing employee intranet/portal; (iii) accessing email and/or calendar applications; and (iv) taking work-related photos and/or videos are the most relevant tasks carried out with smart devices to be considered within our research.

Based on this selection we identified suitable actions involving these tasks. In some cases, the complexity of the tasks was increased by further activities to avoid a trivial level (e.g. accessing and reading a document) and to ensure that the data would express measurable differences between the analyzed devices. This approach led to the selection of the following tasks:

  • Task 1 - Taking a Photo: Participants had to take an arbitrary photo, create a new folder called work, move the picture into this folder, and check whether it was successfully moved in the respective default media database (iOS: Photos; Android: Gallery; Windows Phone: Photos).

  • Task 2 - Writing an Email: This task was not limited to accessing emails but also included writing an email. Participants had to open the standard email client, create a new email, insert a specific subject, address it to a pre-defined email address, type a specific text, assign the priority high, and include the photo from Task 1.

  • Task 3 - Creating a Calendar Entry: This task was not limited to accessing a calendar but also included creating a new calendar entry. Participants had to open the default calendar app, create a new appointment, and enter a specific name, date, time, location, and duration, as well as a reminder. After saving the appointment, participants had to change to the monthly view. Finally, participants had to move the appointment to another day and add a specific text note.

  • Task 4 - Searching for a Document: This task was supposed to mimic the search for documents in an intranet portal. It combined activities for searching, accessing, reading, and saving a document. Consequently, participants had to open the standard web browser, type in a specific search term, open a specific PDF file, save the URL as a bookmark, copy the URL, and save it as a separate note.

3.2 Metrics and Measurement

Since the study was based on the ISO 9241-11 definition of usability [53], we used established metrics for effectiveness, efficiency, and satisfaction. Because the tasks were somewhat complex, we decided to combine metrics for effectiveness and efficiency in a composite score metric.

Efficiency.

Efficiency is defined as the “resources expended in relation to the accuracy and completeness with which users achieve goals” [53]. It is typically measured by the effort needed to successfully execute a task. In our study, efficiency was measured by the time needed to complete a task measured in seconds (task completion time). Task completion time is widely accepted as a valid metric in usability tests [55].

Effectiveness.

Effectiveness is defined as the “accuracy and completeness with which users achieve specified goals” [53]. Two commonly used ways to measure effectiveness are “completion rate” and “number of errors” [56].

  • Task completion was assessed based on three levels: whether the user (i) finished the task without help, (ii) finished the task with help/support, or (iii) did not finish/gave up. The need for help was an especially important metric for economic reasons: when employees turn to colleagues for help with their own tasks, the colleagues’ productivity drops because they are kept from executing their own tasks [57]. A task was considered successfully completed if all assigned activities were performed.

  • The number of errors was incorporated by counting the attempts needed to finish the task, considering the levels at (i) the first attempt and (ii) multiple attempts.

Additionally, we deemed tasks not to have been successfully completed if users were unable to complete them within a given time frame. To compare the efficiency of task completion, the primary time frame (180 s) was based on the mean time taken to complete each task [58]. We defined time frames as: (i) primary time frame, within the defined time frame of 180 s; (ii) secondary time frame, max. 90 s outside of the primary time frame; and (iii) exceeding the time frame of 270 s in total. Support was only provided when participants actively requested it and was only provided within the primary and secondary time frame. Combining these levels, we defined five task execution success levels, as shown in Table 1.
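The time frames described above can be illustrated with a minimal sketch. It only covers the time dimension (180 s primary, 90 s extension, 270 s in total); it does not reproduce the five success levels of Table 1, which also combine help and attempts. The names `TimeFrame`, `classify_time_frame`, and `support_allowed` are hypothetical, and treating the boundaries (exactly 180 s or 270 s) as inclusive is an assumption.

```python
from enum import Enum

PRIMARY_LIMIT_S = 180        # primary time frame, as stated in the text
SECONDARY_EXTENSION_S = 90   # maximum overrun beyond the primary time frame
TOTAL_LIMIT_S = PRIMARY_LIMIT_S + SECONDARY_EXTENSION_S  # 270 s in total


class TimeFrame(Enum):
    PRIMARY = "primary"
    SECONDARY = "secondary"
    EXCEEDED = "exceeded"


def classify_time_frame(seconds: float) -> TimeFrame:
    """Map a task completion time (in seconds) to one of the three time frames.

    Assumption: the limits are inclusive, i.e. exactly 180 s still counts as
    primary and exactly 270 s still counts as secondary.
    """
    if seconds < 0:
        raise ValueError("completion time must be non-negative")
    if seconds <= PRIMARY_LIMIT_S:
        return TimeFrame.PRIMARY
    if seconds <= TOTAL_LIMIT_S:
        return TimeFrame.SECONDARY
    return TimeFrame.EXCEEDED


def support_allowed(seconds: float) -> bool:
    """Support was only provided within the primary and secondary time frames."""
    return classify_time_frame(seconds) is not TimeFrame.EXCEEDED
```

For example, a task finished after 200 s would fall into the secondary time frame, and a participant still working after 300 s would no longer receive support.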

Table 1. Task execution success levels

Satisfaction.

Satisfaction is the “freedom from discomfort and positive attitudes towards the use of the product” [53]. Among the numerous standardized questionnaires (e.g. QUIS, SUMI, SUS, UMUX), the PSSUQ was chosen [59]. The PSSUQ is a post-study questionnaire for assessing users’ perceived satisfaction with the usability of a system in scenario-based usability evaluations. Studies have demonstrated the reliability and validity of the PSSUQ and provided evidence of significant generalizability for the questionnaire [54]. We chose the PSSUQ because of its brevity and manageability and because it has been shown to be highly reliable (Cronbach’s alpha = .96) [54]. It also includes subscales, which allowed us to compare the smartphones based on system quality, information quality, and interface quality as well.

We used the 19-item version of the PSSUQ. Each item was rated on a 7-point Likert-scale from 1 (totally agree) to 7 (totally disagree). There was also the possibility to not answer the question (no answer). The PSSUQ contains the following scales:

  • OVERALL (items 1–19): The scale OVERALL represents the aggregate satisfaction score of a system. It includes items 1–18, which are also used to calculate the other scales, and the control item 19.

  • SYSUSE (items 1–8): The sub-scale SYSUSE (system usefulness) measures how easy it is to use and learn a system to effectively complete tasks and quickly become productive.

  • INFOQUAL (items 9–15): The sub-scale INFOQUAL (information quality) measures the quality of the system feedback by assessing whether it is easy to understand and effectively helps users. This includes error messages, online help, onscreen messages, and documentation.

  • INTERQUAL (items 16–18): The sub-scale INTERQUAL (interface quality) measures how pleasant the user’s experience is by assessing whether the system provides the expected functionality and capability.

In our study, participants had to complete the PSSUQ after performing all tasks with a certain device. Devices were compared using the arithmetic mean of all Likert-scale ratings.
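The scale definitions above can be turned into a small scoring sketch for a single completed questionnaire. The item ranges come from the text; the function name `pssuq_scores`, the use of `None` for "no answer", and the decision to leave unanswered items out of the mean are assumptions, since the text does not state how missing answers were handled.

```python
from statistics import mean
from typing import Dict, Optional

# Item ranges per scale, as listed in the text (range end is exclusive).
SCALES = {
    "OVERALL": range(1, 20),     # items 1-19
    "SYSUSE": range(1, 9),       # items 1-8
    "INFOQUAL": range(9, 16),    # items 9-15
    "INTERQUAL": range(16, 19),  # items 16-18
}


def pssuq_scores(responses: Dict[int, Optional[int]]) -> Dict[str, Optional[float]]:
    """Return the arithmetic mean rating per PSSUQ scale for one questionnaire.

    `responses` maps item number (1-19) to a rating from 1 (totally agree)
    to 7 (totally disagree), or None for "no answer".
    Assumption: unanswered items are excluded from the mean.
    """
    for item, rating in responses.items():
        if rating is not None and not 1 <= rating <= 7:
            raise ValueError(f"item {item}: rating {rating} is outside 1-7")
    scores: Dict[str, Optional[float]] = {}
    for scale, items in SCALES.items():
        ratings = [responses[i] for i in items if responses.get(i) is not None]
        scores[scale] = mean(ratings) if ratings else None
    return scores
```

Because 1 means "totally agree", lower scores returned by this sketch correspond to higher satisfaction.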

3.3 Survey

Additionally, every participant was asked five questions after performing all tasks of a particular measurement point to obtain qualitative insights into and arguments about the experience they had when executing the tasks on the different devices. These questions were:

  • What are the advantages and disadvantages of using the Nokia Lumia or your own smartphone in performing the tasks?

  • What are the specific differences in using the Nokia Lumia to perform the tasks compared to the last measurement point?

  • Was the task execution easier than at the last measurement point?

  • Would you consider using the Nokia Lumia in a business context?

  • Would you consider using the Nokia Lumia as your personal device?

3.4 Participants

The study involved 20 participants (13 female/7 male) aged 21–51 (7 participants younger than 30 years, 13 participants aged 30 years or older; mean = 29.55, sd = 7.38). Ten participants were Android users and ten were iOS users. The number of participants was chosen based on an argument by Nielsen, who claims that using 20 participants typically offers a reasonably tight confidence interval when collecting quantitative usability metrics [60]. The 20 participants had to meet the following requirements to join the study:

  • They had to have used the respective operating systems (iOS or Android) and devices (versions did not matter) for at least three months.

  • They had to consider themselves as advanced users with good knowledge and skills in handling the device and operating system.

  • They had to be regular users of the standard email client and calendar app on the respective device and operating system.

  • They had to lack any experience in handling smartphones with the Windows Phone operating system, as this was the provided benchmark device for the study.

3.5 Procedure

Every participant was tested three times. At the first measurement point they had to perform the tasks with their PD and with the Windows Phone device. Participants received a short introduction to the Windows Phone device and were allowed to use it for three minutes. To reduce carryover effects at the first measurement point based on order [61] and practice [62] due to carrying out the same tasks twice, half of the participants first used their own device and the other half first used the Windows Phone device.
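The counterbalancing at the first measurement point can be sketched as follows. The text only states that half of the participants started with each device; assigning the halves by a seeded random shuffle, as well as the function name `assign_device_order` and the participant labels, are assumptions made for illustration.

```python
import random
from typing import Dict, Iterable, Tuple


def assign_device_order(participant_ids: Iterable[str],
                        seed: int = 0) -> Dict[str, Tuple[str, str]]:
    """Split participants into two equally sized groups with opposite device order.

    Assumption: group membership is decided by a seeded random shuffle.
    The source only says that each half started with a different device.
    """
    ids = list(participant_ids)
    if len(ids) % 2 != 0:
        raise ValueError("an even number of participants is needed for equal halves")
    rng = random.Random(seed)
    rng.shuffle(ids)
    half = len(ids) // 2
    order: Dict[str, Tuple[str, str]] = {}
    for pid in ids[:half]:
        order[pid] = ("PD", "CPD")   # personal device first
    for pid in ids[half:]:
        order[pid] = ("CPD", "PD")   # company provided device first
    return order


# Hypothetical participant labels P01..P20.
orders = assign_device_order([f"P{i:02d}" for i in range(1, 21)])
```

With 20 participants, this yields ten participants per order, which matches the split described above.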

At the two following measurement points participants used just the Windows Phone device. Using this multi-staged approach we wanted to examine the learning and practice effects as well as the change of effectiveness, efficiency, and satisfaction over time.

Before every task, the participants were provided with written instructions, which included the overall goal to be performed, a list of applications to be used, the state to be achieved, and a time limit. Before each particular task the participants had as much time as they wanted to read the task instructions and were given the opportunity to ask questions to clarify each task prior to attempting its performance.

After completing all four tasks using a particular device, participants had to complete the PSSUQ and were asked qualitative open questions. Table 2 summarizes this multi-stage evaluation procedure.

Table 2. Multi-stage evaluation (PD = personal device, CPD = company provided device)

3.6 Devices

The PDs of participants differed, including various Apple iPhone models, and Android devices from Samsung, HTC, and even Alcatel. The only condition for these devices was that they had the standard email, calendar, and photo gallery apps installed. Other specifications were not relevant for the evaluation settings. The Windows Phone device used was the Nokia Lumia 630 (Windows Phone 8.1, 4.5 inch display, 1.2 GHz quad core processor, 512 MB RAM).

4 Results

4.1 Effectiveness and Efficiency

Effectiveness was measured using the cumulative percentage of participants that reached the task goals in the defined levels of effectiveness (see Table 1). Figure 1 shows the effectiveness of all tasks. It includes the results for the PD (personal smartphone) at the first measurement point and the results for the CPD (Windows Phone Nokia Lumia) at all three measurement points.
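A minimal sketch of the cumulative percentage calculation is given below. It assumes that each participant is assigned exactly one success level per task and that the levels are numbered 1 (best) to 5; the function name `cumulative_percentages` and the example data are hypothetical and do not reproduce the study results.

```python
from collections import Counter
from typing import Dict, Sequence


def cumulative_percentages(levels: Sequence[int], n_levels: int = 5) -> Dict[int, float]:
    """Return, for each level k, the percentage of participants at level k or better.

    Assumption: levels are integers from 1 (best) to n_levels (worst).
    """
    if not levels:
        raise ValueError("at least one participant is required")
    if any(not 1 <= level <= n_levels for level in levels):
        raise ValueError(f"levels must lie between 1 and {n_levels}")
    counts = Counter(levels)
    total = len(levels)
    running = 0
    result: Dict[int, float] = {}
    for level in range(1, n_levels + 1):
        running += counts.get(level, 0)
        result[level] = 100.0 * running / total
    return result


# Hypothetical levels reached by five participants in one task.
example = cumulative_percentages([1, 1, 2, 3, 5])
```

In this illustrative example, 60% of the participants reached level 2 or better.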

Fig. 1. Effectiveness of all tasks

Efficiency was measured by the task completion time in seconds using the geometric mean, standard deviation, and range (see Sect. 3.2). Less time for task completion and, subsequently, a narrower time range indicate more efficient task completion. Table 3 includes these metrics for all four tasks and three evaluation points.
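The three efficiency metrics can be computed as in the following sketch, which relies on Python's standard `statistics` module (`geometric_mean` requires Python 3.8 or later). The function name `efficiency_summary` is hypothetical; using the sample standard deviation and reporting the range as a (minimum, maximum) pair are assumptions, as the text does not specify either.

```python
from statistics import geometric_mean, stdev
from typing import Dict, Sequence, Tuple, Union


def efficiency_summary(
    times_s: Sequence[float],
) -> Dict[str, Union[float, Tuple[float, float]]]:
    """Summarize completion times (in seconds) of successfully finished tasks.

    Assumptions: sample standard deviation (n - 1 in the denominator) and
    range reported as (min, max).
    """
    times = [float(t) for t in times_s]
    if len(times) < 2:
        raise ValueError("at least two completion times are required")
    if any(t <= 0 for t in times):
        raise ValueError("completion times must be positive")
    return {
        "geometric_mean": geometric_mean(times),
        "std_dev": stdev(times),
        "range": (min(times), max(times)),
    }
```

The geometric mean is less sensitive to a few very long completion times than the arithmetic mean, which is one common reason for using it with task time data.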

Table 3. Efficiency metrics - task completion in seconds

Figure 2 illustrates these results in a diagram. The bars represent the mean values, while the whiskers show the 95% confidence interval. This data only includes successfully finished tasks. Since Task 1 on the CPD (Windows Phone Nokia Lumia) led to a high number of failed tasks at the first attempt (60%), the resulting data is skewed. Using data from all participants and calculating the time spent until participants gave up would lead to much higher values (mean: 247; std. dev: 51; max: 370). The next two attempts with the CPD in Task 1 also led to unfinished tasks (Attempt 2: 20%; Attempt 3: 10%). Participants also had problems with Task 4, which 10% failed to finish at Attempt 1 and 5% at Attempt 2.

Fig. 2. Efficiency of all four tasks using the mean time of successfully finished tasks

Results for Single Tasks

  • Task 1 (Taking a Photo). This task was very difficult for the participants with the CPD. 60% were not able to finish at the first attempt, while the rest required assistance. Even at the third measurement point (CPD3), some participants were still not able to finish the task. Post hoc tests using the Bonferroni correction revealed that Attempts 1 and 2 on the CPD led to significantly lower effectiveness levels (p = .000) when compared with the PD. Attempt 3 also led to a lower effectiveness level, but the difference was not significant (p = .547). Consequently, the time spent on the first attempt was very high; however, task completion time decreased significantly at the following two attempts. Post hoc tests revealed that task completion at the first two measurement points on the CPD was significantly slower than on the PD (measurement points 1 and 2: p = .000; measurement point 3: p = .07).

  • Task 2 (Writing an Email). The results for Task 2 show an entirely different picture in comparison to Task 1 (see Fig. 1), which indicates that the task is a relevant moderating variable, as participants already had a similar success rate with the CPD at the first attempt as with their PD. In the course of the evaluation, all participants managed to finish the task without help in the given time frame (levels 1 and 2). Post hoc tests using the Bonferroni correction showed no significant differences in the effectiveness level. These results are supported by the data for efficiency; i.e. participants were slightly faster even on the first attempt with the CPD. This culminated in a significantly faster performance in Attempt 3 on the CPD compared to the PD (p = .006). Answers to the post-study questions showed that participants liked the Outlook style of the CPD email app and were therefore very comfortable using it.

  • Task 3 (Creating a Calendar Entry). Task 3 seemed to be difficult for the participants, though they quickly adapted to the situation. Using the CPD, the participants struggled at the first attempt but improved significantly in Attempt 2, reaching a similar level as with their PD. Interestingly, there was no significant improvement at Measurement Point 3. Post hoc tests using the Bonferroni correction revealed that only Attempt 1 on the CPD led to a significantly lower effectiveness level (p = .014). Concerning efficiency, Task 3 was performed significantly more slowly with the CPD at Measurement Point 1 (p = .008), though performance improved so strongly at Measurement Points 2 and 3 that no significant difference could be observed. This led to a similar task completion time for the PD and the CPD at Measurement Point 3.

  • Task 4 (Searching for an Online Document). The task again shows very interesting results for the PD and the CPD. At the first measurement point, over 50% of the participants needed assistance or failed completely. However, they evidently learned very quickly, and effectiveness improved considerably for the next two attempts, leading to a better result than with their own device at Measurement Point 3. Post hoc tests using the Bonferroni correction revealed that only Attempt 1 on the CPD led to a significantly lower effectiveness level (p = .000). In terms of efficiency, Task 4 was performed significantly more slowly with the CPD only at Measurement Point 1 (p = .002). However, the participants’ efficiency improved progressively, leading to an even faster task completion at Measurement Point 3 in comparison with the PD.
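The post hoc comparisons above report Bonferroni-corrected p-values. As a minimal sketch of the correction itself: each raw p-value is multiplied by the number of comparisons and capped at 1. The function name `bonferroni_adjust` and the example values are hypothetical; the underlying test and the number of comparisons used in the study are not stated here and are not assumed by the sketch.

```python
from typing import List, Sequence


def bonferroni_adjust(p_values: Sequence[float]) -> List[float]:
    """Return Bonferroni-adjusted p-values: min(1, p * m) for m comparisons."""
    if any(not 0.0 <= p <= 1.0 for p in p_values):
        raise ValueError("p-values must lie between 0 and 1")
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values for three pairwise comparisons.
adjusted = bonferroni_adjust([0.01, 0.02, 0.5])
```

The cap at 1 explains why an adjusted value can be reported as exactly 1.000.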

4.2 Satisfaction

Satisfaction was measured using the PSSUQ [59], which the participants had to fill out after completing the four tasks with a particular device. The 19 items were rated on a 7-point Likert-scale from 1 (totally agree) to 7 (totally disagree). Based on the order of the response categories, low values correspond to high satisfaction ratings. Responses were analyzed using the arithmetic mean for the four PSSUQ scales OVERALL, SYSUSE, INFOQUAL, and INTERQUAL.

Figure 3 illustrates the means of all four scales and measurement points. The whiskers represent the 95% confidence interval. Results show that participants rated their PD (iOS or Android devices) better than the CPD (Nokia Lumia Windows Phone) in all scales and at all measurement points.

Fig. 3. Satisfaction measured by PSSUQ based on [59]

Post hoc tests using the Bonferroni correction revealed that using the CPD led to significantly lower satisfaction ratings (higher PSSUQ scores) after Measurement Points 1 and 2, but not after Measurement Point 3, in the scales OVERALL: (1) p = .000, (2) p = .002, (3) p = .079; SYSUSE: (1) p = .000, (2) p = .007, (3) p = .256; and INFOQUAL: (1) p = .000, (2) p = .038, (3) p = .384.

The INTERQUAL scale shows the biggest differences between the PD and the CPD. Compared to the PD, the ratings after all attempts with the CPD are significantly worse (Bonferroni correction: p = .000; p = .001; p = .016). This sub-scale incorporates items 16–18, which are represented by the following statements:

  16. The interface of this system was pleasant.

  17. I liked using the interface of this system.

  18. This system has all the functions and capabilities I expected it to have.

Participants’ answers to the post-study questions stated that this difference was caused by the way the CPD structures the home screen using animations and the starting of apps using an alphabetical list rather than icon grids, as used in iOS and Android. Participants considered these representations to be unusual, unclear, and confusing.

Item 19 (“Overall, I am satisfied with this system”; Fig. 4) reveals that satisfaction with the CPD improves over time. However, the difference to the PD remains large, as satisfaction with the CPD is significantly lower after all three attempts (PD: PSSUQ score = 1.20; CPD (1): PSSUQ score = 3.45, p < .001; CPD (2): PSSUQ score = 2.65, p = .001; CPD (3): PSSUQ score = 2.40, p = .010).

Fig. 4. PSSUQ item 19: average satisfaction ratings of the personal device and the company-provided device (whiskers represent the 95% confidence interval)

Differences Between iOS and Android Users in Satisfaction

A comparison of iOS and Android users reveals a marked difference in PSSUQ scores (see Fig. 5). iOS users were more satisfied overall with their PD, and their rating of the CPD did not improve as strongly as that of Android users. Compared to the PD, iOS users rated the CPD significantly lower after the first two measurement points (Bonferroni correction: (1) p = .00; (2) p = .014; (3) p = .067).

Fig. 5. OVERALL satisfaction ratings of the CPD (Nokia Lumia Windows Phone) by iOS and Android users (whiskers represent the 95% confidence interval)

The satisfaction rating of Android users after Measurement Point 3 was 1.99, which was already close to their PD score at the beginning of the study (1.81). Accordingly, their CPD ratings were significantly lower only after the first measurement point (Bonferroni correction: (1) p < .001; (2) p = .206; (3) p = 1.000).

5 Discussion and Conclusion

This paper examined how users adapt to a new smartphone platform. Usability tests compared private devices (PD, in the form of iOS and Android phones), as they would be used in a possible BYOD setting, with a company-provided device (CPD, in the form of a Nokia Lumia Windows Phone). The tests used a summative usability evaluation approach to measure effectiveness, efficiency, and satisfaction at three different measurement points. The results are first discussed separately before final conclusions are drawn.

With regard to effectiveness, the study shows mixed results. Taking a photo (Task 1) and searching for an online document (Task 4) led to significant difficulties at the first and second measurement points. For strongly business-related tasks, such as writing an email (Task 2) and creating a calendar entry (Task 3), the results indicated a reasonable adaptation process for the CPD, because the style of its apps resembles Microsoft Outlook, with which most participants were familiar.

The analysis of the efficiency data showed that participants wrote an email (Task 2) faster with the CPD than with their own device, even at the first attempt. In contrast, taking a photo (Task 1) was executed much faster on the PD, and participants were still substantially slower on the CPD at the third measurement point. Creating a calendar entry (Task 3) and searching for an online document (Task 4) were initially executed more slowly on the CPD, but after repeated use, participants executed these tasks faster than with their PD at the third attempt.

Consequently, we can conclude that there is a positive tendency in the performance indicators of efficiency and effectiveness for all tasks except taking a picture. Qualitative statements from the post hoc survey indicated that this task worked particularly differently on the CPD than on the participants’ own devices. Furthermore, the ratings for both efficiency and effectiveness show that the selected tasks significantly moderated the results of the study (e.g., the email and calendar apps of the CPD scored well, since their interfaces closely replicate the desktop Outlook experience).

In general, people seem to adapt quickly to new or different devices, operating systems, and user interfaces. Performance ratings improved rapidly: in most cases they reached the level of the PD, and occasionally they were even better with the CPD at the first attempt. These improving performance scores are in line with existing research showing that users adapt quickly to an interface or interaction style and do not face significant difficulties when transferring to an unfamiliar mobile phone [52].

When it comes to satisfaction ratings, the results show a different picture. Ratings for the CPD were generally lower and did not reach the level of the PDs in any instance. However, there are considerable differences between the two subgroups of the panel. Satisfaction with the CPD was lower for iOS users in comparison with Android users. Android users’ satisfaction with the CPD at Measurement Point 3 (PSSUQ score = 1.99) was rated close to the initial score for the personal device (PSSUQ score = 1.81) at the beginning of the study.

In summary, these results indicate that employees are likely to be equally effective and efficient with a CPD after a short adaptation phase. On the other hand, the level of satisfaction when using CPDs will presumably be lower than with their own devices. In our study, this effect was mainly observable in iOS users, whose PSSUQ score for the CPD was almost twice that for their PD (PSSUQ score PD = 1.54, CPD = 2.98; lower scores indicate higher satisfaction). Ultimately, companies will have to decide for or against BYOD based on strategic and company culture considerations. Hence, future research could focus either on how user satisfaction could be measured and improved, for example through additional training for the CPD, or on how to securely integrate PDs into companies’ IT infrastructure.