1 Introduction

With the increasing variety of means for accessing information available today, such as desktop PCs with a monitor and keyboard/mouse, laptops with a built-in camera and trackpad, tablets with multi-touch screens, and Virtual Reality (VR)/Augmented Reality (AR) goggles with specialised hand-grip controllers, the communication between requests and results is vital for a useful interactive system [1]. These interactions are currently constrained by the input and output capabilities of devices and the way their one-to-one pairing is facilitated. People generally accept interacting with a system through its pre-defined input and output components, available either on separate devices (e.g. keyboard and monitor of a desktop PC) or on the same device (e.g. touch gestures and visual display on a modern tablet). This style of interaction works well in the scenarios or environments for which the devices were designed. What if these factors change? Large displays are great for sharing information in presentation settings or public areas. However, centralising operations on a large display can introduce a cognitive gap between the speaker and the team members while everyone discusses the content being displayed on the large screen. While tabletops promote collaboration by supporting multiple users working simultaneously, people sitting on one side of a tabletop are unable to read documents displayed on the opposite side of the table. Although mobile phones have become a necessity as personal communication and management tools in our digital lives, they are not ideal for displaying large amounts of data on their screens. Smartwatches are great as health-tracking devices as well as extensions to mobile phones, but their limited screen space can severely reduce the usability of even simple and mundane tasks such as typing or browsing information.

Usability issues arising from the diverse situations in which a device is used as a single, self-contained unit lead us to consider the combined use of multiple devices to enhance the overall user experience within an interactive session. Numerous techniques for coupling interactions between devices have been proposed: wall display and watchband [2], wall screen and tabletop [3,4,5], presentation screen and Personal Digital Assistants (PDA) [6, 7], large public display and mobile phone [8,9,10], smartphone and smartwatch [11,12,13], or between two tablets [14, 15], etc. Cross-device interactivity often requires explicit device connection setup before the interaction takes place. Making use of multiple devices at the same time may surface potentially significant usability issues which might not have been originally planned for or anticipated by the developers of the individual devices.

This paper addresses these usability issues and articulates and generalises the concept of coherent and consistent interaction amongst multiple devices, owned by a user or multiple users, in their own social situations. Humans seek or offer assistance to one another using language, signals, and a common understanding of situations and relationships. Need a helping hand? Asking a family member, a friend, a colleague or someone we know usually comes first to mind. Not only do they know us, they understand the best way to help us. Close family members and friends can even predict when and what we need and offer their assistance accordingly. Encapsulating these concepts with cross-device interactivity in terms of its usability, SocioCon applies this basic analogy in theoretically articulating and suggesting mechanisms for creating and managing assistance amongst interactive devices so as to efficiently continue the same on-going activity or task. By augmenting our physical world with groups of pre-linked interactive devices, we are able to leverage the input and output modalities of different devices to maintain usability.

Starting with a series of typical and expected scenarios where multiple devices around us are used together to carry out one task or activity, we develop the SocioCon concept, in which some of the essential usability issues in such scenarios are identified and generalised. To refine the SocioCon concept, we develop prototype systems that realise some of these scenarios to further identify the usability issues. Making various interactive devices technically connected is one issue; what the connected devices mean when used within an interactive session, in terms of usability, is another that deserves in-depth study as more and more diverse interactive devices become available, a trend that will only increase in the coming years.

2 Motivation

2.1 Multiple Devices to Meet People’s Needs

Having both input and output modalities on the same device allows users to perform tasks using a single device, but also binds users to that device's advantages as well as its disadvantages. Rekimoto et al. [6] addressed the disadvantages of PDAs being isolated from the users' nearby computing devices in their proximal interaction model. Two users could connect their PDAs, acting as Internet Protocol (IP) mobile phones, to nearby displays during a conversation to share the screens between them. Input/output-bound usability issues are also common on smartphones when users interact with large amounts of data on small screens. Viewing content spread over several pages not only taxes the users' short-term memory in referring back to visited content, but introduces interaction costs as well. A PC-like work environment can now be easily created by plugging a smartphone into a display dock connected to an external monitor/TV, a keyboard and a mouse [16]. The smartphone can still be used to make phone calls or send messages while the external screen shows what the user is working on.

Our work seeks to leverage the advantages of each device when connected together for a more optimal overall interaction. The following scenarios illustrate how we envisage a social circle for devices would create opportunities for optimal usability. Our specific research interests are to understand the nuances of how people would transition between the input/output modalities offered by different devices during a task, and whether and how it matters when those devices belong to other users or to the public.

Scenario 1

Mike and his best friend Jack are watching a movie in the common room at their dormitory. Mike uses his smartwatch to remotely control the video playback on the TV. Simple operation commands such as changing the channel and increasing the volume are very easy to carry out using touch gestures performed on his smartwatch. However, when he wants to search for a TV program, typing search keywords on his smartwatch often results in typos due to the small keyboard constrained by the screen size of the watch. Before Mike starts feeling annoyed, a floating phone-icon shows up on Mike's watch surface, hinting that Jack's phone is available as an input device. Mike could choose to ignore the suggestion and carry on typing using the virtual keyboard of his watch, or he could tap on the phone-icon to start using Jack's phone keyboard for typing, upon Jack's consent.

Scenario 2

Mary is showing her holiday pictures to Jane. When more friends gather around, it becomes difficult for everyone to see the pictures on her phone. Some of her friends crane their necks over others' shoulders to see the pictures. Mary also has to zoom in on some of the pictures for her friends to see more detail before moving on to the next photo. A floating tablet-icon button appears on Mary's phone screen, hinting that she can output the photos from her phone to her tablet for better viewing by the group. Tapping on the tablet-icon makes the photos appear on the tablet, and Mary continues showing the pictures to her group of excited friends.

Scenario 3

It takes more than 10 minutes for the bus to come. Looking around, Mark sees a public display, situated a couple of meters away, showing interesting information about a new movie. Instead of leaving his seat to approach and touch the public display for further information, Mark uses his own smartwatch to remotely interact with it, for example tapping on the smartwatch surface to view more information or swiping to navigate content.

One of the implications of the above scenarios is that there is no one-size-fits-all solution to the challenge of using a single device to meet all user needs and requirements in every situation. In this paper, we take these as guiding scenarios in shaping the detailed concept of coherent, connected multi-device usability in Sect. 4.

2.2 Do People Want It?

The scenarios drawn above are meant to be general and perhaps soon-to-be-expected situations we envisage in this study. We conducted a quick informal survey to check that they make sense and relate well to people. Ten people of mixed nationalities, graduate students and researchers at a university, responded. Almost all of them use smartphones and laptops often (Fig. 1). Each of them owns at least three computing devices (e.g. smartphones, tablets, laptops). We asked questions relevant to the previous three scenarios.

Fig. 1. Devices often used by the participants

Figure 2 shows the participants' choices. Most respondents strongly supported the idea of transitioning to another device to continue the current task/activity in an optimal way. A few were concerned about using other people's devices. The majority opted for using personal devices over publicly provided devices and modalities.

Fig. 2. Participants' choices on transitions

The three scenarios were selected amongst typical usability issues we have observed, which we believe will become more common as we are surrounded by more devices. They demonstrate the need for a transition from one device to another for optimal interaction with a system or more efficiency in performing a task.

In the following section, we review prior work on multi-device interaction with which our approach shares similar points of view. In Sect. 4, we propose a novel social circle concept for the connected use of interactive devices, together with its general guidelines, creating opportunities for interactive devices to share their advantages in input/output modalities with other devices that join the same social circle. We discuss potential applications of this model through the three scenarios, and report studies in which two of the scenarios were implemented and user-tested. Section 5 concludes the paper and outlines future work.

3 Background and Related Work

3.1 Multi-device Interaction

The concept of multi-device interaction is not new. Pick-and-Drop [17] proposed a pen-based direct manipulation technique to transfer data between the displays of a PDA and a kiosk terminal, or between two PDAs. Stitching [15] applied a similar idea where a pen stroke across two tablets can be used to transfer files. InfoTable and InfoWall [18] enabled the interchange of digital information amongst portable devices with the support of two cameras. Content sharing between a tabletop and a wall screen was possible with i-LAND [3], MultiSpace [4] and Select-and-Point [5]. Two tablets or phones could be joined to display a photo [14] or a canvas [19] across their surfaces. Overall, these systems focused on distributing interactions or content over multiple devices for sharing resources. In contrast, we explore the potential of utilising input/output interaction modalities and particularly focus on the usability implications of such combined usage scenarios.

3.2 Systems with Different Input and Output Devices

Researchers have strived to enhance system usability and improve task efficiency by using one device to aid the operation of another. Mobile devices such as PDAs and phones have been used as customisable input devices for desktops (Shortcutter [20]), LCD displays (Ballagas et al. [8], Hyakutake et al. [10]), interactive TV (PDA-ITV [21]) and public displays (PocketPIN [22], C-blink [23]). The output of users' artefacts (e.g. notes, photos) between personal devices and public displays was explored in SharedNotes [7], Hermes Photo Display [9], PresiShare [24] and Digifieds [25]. While these systems utilised handheld devices as input and/or output devices acting as controllers or companions to larger screen displays, Duet [26] proposed joint interactions in which two mobile devices combine their input and output techniques. The interface was divided between phone and watch, for example a palette hosted on the smartwatch working with a canvas on the smartphone. Kubo et al. [27] proposed using a wrist-tilt gesture on the watch combined with a simultaneous button push on the phone to zoom out the view, which can also be achieved with a pinch gesture on the phone. Beyond serving as sensing devices, smartwatches have also been used as sub-displays of applications running on phones [27, 28]. Our work contributes to this paradigm by identifying the usability issues which may arise before, during and after the interactivity between devices.

3.3 Continuity in Multi-device Interaction

A number of studies have investigated the continuity of user interfaces beyond a device's physical boundary. MEDIAid [29], a set of UI sketches, illustrated the shift of streamed audio-visual content from a TV to a mobile phone. FollowMe [30] provided a pluggable infrastructure for context-aware computing that enabled a user's task to be carried on when the user changed his/her environment context. Context Awareness Supported Application Mobility (CASAM), or "Application Mobility" [31], was described as the migration of an application between different devices during its execution; e.g. during a video conference, a user switched from a laptop to a tablet and continued the video conference from the tablet. These studies share the same focus on the transfer of content and its state across multiple devices. Aiming to leverage the capabilities of and relationships amongst devices to create a social circle of interactive symphony, we are interested in exploring the emerging "continuity" interaction concept [32] amongst connected devices. This theme refers to an interaction which starts on one artifact and ends on another, enabling users to re-access content across different devices. However, it is not clear how the same functionality, feedback and interaction technique would be achieved across multiple devices with different affordances. Seamless "continuity" in interaction while switching amongst devices is one of the usability issues of using multiple interactive devices within one session, and thus warrants much more in-depth investigation. In this paper, this aspect is part of the overall usage scenarios in which the various available interactive devices are to be visible, selectable and connectable for the recommendation of interaction with a target device, after which the most suitable interaction paradigm for the chosen input and output devices is pushed to support a satisfactory user experience within that session.

3.4 Ecosystem of Connected Devices

Various studies have aimed at creating environments for multi-device interaction. In the multi-device ecosystem of [29], a semi-automatic content adaptation engine would query the user to determine the most suitable way to continue streaming the media content. Turunen et al. [33] presented a multimodal media center interface based on a combination of a mobile phone and a large high-definition display; users could interact with the system using speech, gestures, and the mobile phone to physically touch icons. Henrik and Kjeldskov [34] explored the interaction space created around a multi-device music player for co-located users to listen to music together. An investigation of the usage practices of combining multiple devices (e.g. TVs, personal computers, smartphones, tablets) in users' daily tasks and activities showed that users continuously encountered problems in multi-device use [35]. These problems included connection issues, incompatible content formats, applications unavailable across device platforms and limited text-entry capability, all of which highlighted the challenges in creating ubiquitous digital ecosystems to support the collaboration and continuity of users' activities. Our work complements the concept of a connected-device ecosystem by envisioning a more generic usability environment in which the factors and relationships in pairing interactive devices, and their optimal modes of interaction, are to be alerted, selected and used within a session.

4 SocioCon: A Social Circle for Interactive Devices

It is reasonable to assume that in the near future, interactive devices and their relationships will be characterised by a high level of automated connectivity and flexible ways for users to authorise it. On-demand connections are likely to take place anytime and anywhere.

Given privacy concerns, devices need to know which other devices have agreed to be connected to and interact with them, and what modes of interaction they are equipped with. For this purpose, the sharing of a device's input/output channels and its level (direct-share or propagation-share) needs to be defined only by the device's user. This personal definition is generally based on the trust amongst users as well as whether or not users are willing to share their devices, capabilities and data with others.

The SocioCon recommendation strategy is based on hybrid sensing of foreground and background: what application is currently in use, what activity the user is doing (e.g. standing, walking, sitting, lying, eating), what device the user is using (e.g. smartwatch, smartphone, tablet, laptop), and how the device is handled (e.g. held in one hand or both hands, in a pocket, in a bag). For example, sensing that its user is walking while looking at the phone, SocioCon may suggest the alternative option of outputting the content to a pair of AR glasses (frequent view) or a smartwatch (occasional view). In short, SocioCon examines a combination of user-device-application contexts to determine whether a user requires assistance in alternating input or output amongst the currently available devices. In this paper, we focus on the usability and transition aspects of connections amongst devices as per the scenarios drawn in the previous section. With that, we construct a simple set of general guidelines as a basic mechanism for how devices in social situations should behave in terms of connectivity, then characterise SocioCon from them.
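As a rough illustration of this context-based recommendation, the sketch below encodes a user-device-application context and a rule such as the walking example above. The class and function names (`Context`, `recommend_output`) and the returned device names are hypothetical, not part of the actual SocioCon system.

```python
# Illustrative sketch of SocioCon's context-based recommendation; all names
# here are hypothetical, not from the actual implementation.

from dataclasses import dataclass

@dataclass
class Context:
    activity: str     # e.g. "walking", "sitting"
    device: str       # e.g. "smartphone", "smartwatch"
    handling: str     # e.g. "one-hand", "in-pocket"
    application: str  # e.g. "maps", "email"

def recommend_output(ctx):
    """Suggest an alternative output device for the current context, or None."""
    if ctx.activity == "walking" and ctx.device == "smartphone":
        # Frequent viewing suits AR glasses; occasional glances suit a watch.
        return "ar-glasses" if ctx.application in ("maps", "video") else "smartwatch"
    return None  # current device is fine; no transition recommended
```

A real strategy would of course weigh many more sensed signals, but the shape, contexts in, a device recommendation (or nothing) out, is the same.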

4.1 General Guidelines

  • Each device shall have its own profile with information about device default status, sharing levels, sharing modalities and sharing channels.

  • Each device shall have its own spectrum of roles: being a self-contained interactive device itself, or being an input and/or output sharing channel for others.

  • Each device shall be able to initiate a relationship and to indicate a relationship type with other devices. The relationship type for each relation mimics the way people feel and behave towards each other (e.g. family, friends, friends of friends or the public). It shall influence how transparent multi-device handshakes are to users and whether or not the sharing propagates.

  • SocioCon shall provide adjustable sharing level between devices. Non-sharing mode shall be available for a user to stop all sharing activities of a device.

  • Sharing channels shall be asymmetric. Device A could share its input and output modalities with device B, but B may choose to share either its input or its output, or only accept the sharing from A.

  • Based on the user’s context, the device context and the application context, SocioCon shall recommend the best input/output option from the set of available devices in the same social circle.

4.2 Conceptual Model

Device Profile

A SocioCon device profile contains the user’s preferences on the device’s default status (available, pending, connected, inactive), sharing levels (device-centric, user-centric, once) and sharing modalities (input, output or input + output) for its sharing channels (IN-Share for accepting the input and output modalities offered by other devices, and OUT-Share for offering its own modalities to others).
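The profile fields above can be sketched as a small data structure. This is a hypothetical encoding for illustration only; the class and attribute names are invented, and it also shows the asymmetric IN-Share/OUT-Share channels and the non-sharing mode from the guidelines.

```python
# Hypothetical encoding of a SocioCon device profile; names are illustrative.

from dataclasses import dataclass

@dataclass
class DeviceProfile:
    status: str = "available"            # available | pending | connected | inactive
    sharing_level: str = "user-centric"  # device-centric | user-centric | once
    in_share: str = "input+output"       # modalities accepted from other devices
    out_share: str = "output"            # modalities offered to other devices

    def stop_all_sharing(self):
        """Non-sharing mode: stop all sharing activities of this device."""
        self.in_share = "none"
        self.out_share = "none"
        self.status = "inactive"

# Channels are asymmetric: a watch may offer only its touch input, say,
# while still accepting both input and output from others.
watch = DeviceProfile(out_share="input")
```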

Establish Relationship with Other Devices

When a device is within proximity of another device, either device owner can initiate a SocioCon relationship if none exists. The user can indicate the relationship type between the devices (SocioCon-Buddies, SocioCon-Family, SocioCon-Friends, SocioCon-Friends of Friends and SocioCon-Public) (Fig. 3) and can further customise the device sharing parameters. Device roles are dynamic: each device could take the role of an input device in one scenario and an output device in another, or both. The transparency of the multi-device handshake, i.e. the basic rules for the way input and output modalities are shared between the devices, is highest with the SocioCon-Buddies type and decreases with the others.

Fig. 3. Relationships amongst devices as social circles

The proposed social connections amongst devices use the analogy of human relationships in defining relationships between devices, but need not mirror them exactly (e.g. devices belonging to family members could belong to a SocioCon group other than SocioCon-Family; likewise, devices belonging to two close friends could participate in a SocioCon-Buddies group). The main objective is to establish basic guidelines for a device to understand which devices are available as alternative input and output options, and how willing they are to share.

Transition Composition

One of the important design considerations in such a multi-device environment is when the user switches input/output devices within an interactive session, i.e. a transition of device and its modalities. The transition to alternative input/output modalities comprises three components:

  1. A set of available devices that offer to share their input/output modalities. The devices’ social circle provides the knowledge of available devices.

  2. A set of trigger factors that invoke the recommendation for alternative input/output from other devices.

  3. A transition model that depicts how multiple input/output devices support one another.
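The three components above can be tied together in a minimal sketch: given the devices known from the social circle and a set of trigger factors, produce a candidate device for the modality the user needs. The function and field names are hypothetical, not from the actual SocioCon implementation.

```python
# A minimal, hypothetical composition of the three transition components.

def recommend(available_devices, trigger_factors, needed_modality):
    """Return a candidate device name for the needed modality once a trigger fires."""
    if not any(trigger_factors.values()):
        return None  # no trigger factor fired, so no recommendation
    for device in available_devices:  # knowledge provided by the social circle
        if needed_modality in device["offers"]:
            return device["name"]
    return None

circle = [{"name": "jack-phone", "offers": ["input"]},
          {"name": "mary-tablet", "offers": ["output"]}]
triggers = {"typo_rate_high": True, "crowd_sensed": False}
print(recommend(circle, triggers, "input"))  # prints "jack-phone"
```

The chosen device would then be classified under one of the transition models described below the trigger factors.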

Trigger Factors

The recommendation for input/output options can be triggered by the contexts of user, device, application, or a combination of any of them. These trigger factors can be implicit (SocioCon recommends the transition) or explicit (user indicates the transition).

Transition Models

Transitions of input/output can be categorised into four models: Input Shift, Input Shift-Relay, Output Shift and Output Shift-Relay, illustrated in Fig. 4 (‘I’ denotes input, ‘O’ denotes output).

Fig. 4. Transitions of input/output devices

The Input Shift model represents a situation where the input device switches from I1 to I2, both of which belong to the same social circle as O. Input Shift-Relay similarly represents the transition of input from I1 to I2, but with I2 not participating in any social circle that involves O. Output Shift and Output Shift-Relay represent the transition of output from O1 to O2, differentiated by whether O2 is within the same social circle as I or not. This categorisation is based on the social relationships amongst participating devices, which may inform future user interaction guidelines for input/output transitions, and is transparent to the users.
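Since the four models differ only in which modality moves and whether the incoming device shares a social circle with the counterpart device, the classification can be written as a two-line rule. The helper below is a hypothetical illustration of that logic, not part of the actual system.

```python
# Hypothetical classifier for the four transition models, based purely on
# device sociality as described above.

def classify_transition(kind, joins_counterpart_circle):
    """kind: 'input' or 'output'; joins_counterpart_circle: whether the new
    device shares a social circle with the counterpart (O for input, I for output)."""
    base = "Input Shift" if kind == "input" else "Output Shift"
    return base if joins_counterpart_circle else base + "-Relay"
```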

The social relationship amongst devices influences how instantly the input or output transition happens. The SocioCon-Buddies relationship type allows input and output transitions to occur upon the user’s positive response to a SocioCon recommendation amongst its participating devices (usually belonging to the same user). For other SocioCon relationship types, depending on their settings such as sharing levels, channels and modalities, a minimum level of consent from the user is required before the input/output channel transits from one device to another. The SocioCon status of each participating device is automatically updated accordingly (e.g. a user stops a device from participating in a particular relationship or in all relationships by simply changing its status to Inactive for that relationship (local) or in the device profile (global)).

Table 1 lists examples of how a user’s physical activity, device-in-use and current task serve as trigger factors for suitable transition models.

Table 1. Examples of trigger factors and transition models

To aid understanding of the trigger factors and the transition models, we use our three scenarios (see Sect. 2.1) as illustrations. In scenario 1, with Mike’s watch and Jack’s phone joined in a SocioCon-Friends group, when the typo rate exceeds a threshold, SocioCon suggests using Jack’s phone as an alternative device for typing the search text. If Jack’s phone participates in the social circle of the TV, the input transition is classified under the Input Shift model, otherwise the Input Shift-Relay model. Jack’s consent is the deciding factor for the input transition to happen. SocioCon updates the transition status from the time Mike taps on the floating phone-icon (Input-Receive Request) to the moment Jack taps on the floating watch-icon (Input-Offer Accept). This is illustrated in Fig. 5a. An adaptation of the Input-Offer interface may take place on Jack’s phone. Either Mike or Jack can interrupt the request for, or the offer of, an input modality by choosing the Disconnect option in the context menu of the respective floating icon.
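The consent handshake in scenario 1 is essentially a small state machine. The sketch below follows the state labels from the text (Input-Receive Request, Input-Offer Accept); the class itself and its method names are invented for illustration.

```python
# A hypothetical state machine for the scenario-1 consent handshake.

class InputTransition:
    def __init__(self):
        self.state = "Idle"

    def request(self):
        # Mike taps the floating phone-icon on his watch.
        self.state = "Input-Receive Request"

    def accept(self):
        # Jack taps the floating watch-icon on his phone, granting consent;
        # accepting is only meaningful while a request is pending.
        if self.state == "Input-Receive Request":
            self.state = "Input-Offer Accept"

    def disconnect(self):
        # Either user picks Disconnect from the floating icon's context menu.
        self.state = "Idle"

t = InputTransition()
t.request()
t.accept()
```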

Fig. 5. Transition models: Input Shift/Input Shift-Relay between devices of 2 users (a), Output Shift between devices of the same user (b), Input Shift between personal and public devices (c)

In scenario 2, both the phone and the tablet are owned by Mary and thus belong to a SocioCon-Buddies group. Crowd sensing and the browsing of pictures on Mary’s phone trigger the Output Shift. When she taps on the floating tablet-icon shown on her phone, the pictures are instantly output to her tablet, because Mary’s tapping action itself gives her consent to the output transition. This is illustrated in Fig. 5b.

The transition in scenario 3 (Fig. 5c) happens in a similar way to that of scenario 1, but only the first time Mark’s watch joins the SocioCon-Public group. Public displays are situated with the general purpose of sharing information with everyone, hence viewing and browsing their information is strongly encouraged. For this reason, the social profile of a public display would have a default setting of the SocioCon-Buddies type for its IN-Share channel, which accepts input commands for the navigation and selection of objects from other devices. Mark, who wishes to use his smartwatch as an input device to a public display, can establish a social relation between his watch and the display using the SocioCon-Public type. The Input Shift model then enables Mark to interact with the display using his watch by simply tapping on the public display icon shown in the watch’s social circle list.

Instant input/output transition possibilities would likely encourage more opportunistic switching between devices when situations arise, in ways that benefit the tasks the users are engaged in. We developed prototypes for some of these situations and conducted usability testing to understand the usability aspects of device transitions and the implications for interaction modalities, as described in the next section.

Transition Usability

Similar to the way humans interact with one another using their best-suited languages and protocols, we expect optimal interaction interfaces and methods to be required for our devices to participate effectively in their own relationships on SocioCon. Basic interface adaptation on connected devices during input/output transitions may be required to accommodate the differences in interaction techniques between devices. The user interface of a substitute device may take a different form to adapt to the interaction technique and to differentiate the substitute mode from its host mode.

The nature of interacting with public displays calls for easy and quick information retrieval. Scenario 3 involves the interaction between a smartwatch and a large public display, both of which better support tasks and activities that do not require detailed object manipulation. We envisage that this particular arrangement of multi-device interaction will become a common and typical situation in public, urban settings, so we investigated it further in one of our prototypesFootnote 1. To overcome the challenge of the huge difference in screen size between a large display and a smartwatch, we applied a ‘hop-to-select’ traversal style, which enables jumping from the currently selected object to the nearest selectable object in our prototype applications (Fig. 6). This traversal style combines the navigation and the selection of an object, hence the name ‘hop-to-select’.
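The core of the hop-to-select traversal, jumping to the nearest selectable object, can be sketched as a nearest-neighbour search over on-screen object positions. The function name and the object layout below are illustrative, not the prototype's actual code.

```python
# Minimal, hypothetical sketch of the 'hop-to-select' traversal: jump from
# the currently selected object to the nearest selectable one.

import math

def hop_to_select(current, selectables):
    """Return the selectable object nearest to the currently selected one."""
    cx, cy = current["pos"]
    return min((s for s in selectables if s is not current),
               key=lambda s: math.hypot(s["pos"][0] - cx, s["pos"][1] - cy))

objects = [{"id": "a", "pos": (0, 0)},
           {"id": "b", "pos": (5, 0)},
           {"id": "c", "pos": (2, 1)}]
print(hop_to_select(objects[0], objects)["id"])  # prints "c"
```

A directional variant (hop up/down/left/right) would simply restrict the candidate set before taking the minimum.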

Fig. 6. Hop-to-select traverse

Basic object manipulations, such as viewing the context information of a selected object, are possible with simple, common touch gestures performed on the smartwatch. This is achieved by mapping a subset of common gestures (Tap, Long Tap, Swipe, 2-Finger Tap and Shake) sent by the watch to common input commands (select an object, show/hide context information, show all interactable objects, traverse previous/next/up/down, exit current screen or undo) on the large display. As the touch gestures performed on the watch screen do not require exact spatial coordination between the watch face and the wall display, users remain visually focused on the large screen while continuing to interact with the system (Fig. 7).
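The gesture mapping above amounts to a lookup table on the display side. The table below is a hypothetical rendering of that mapping; the command strings are illustrative, not the prototype's actual API.

```python
# Hypothetical gesture-to-command table mirroring the mapping described above.

GESTURE_COMMANDS = {
    "tap": "select-object",
    "long-tap": "toggle-context-info",
    "2-finger-tap": "show-interactable-objects",
    "swipe-left": "traverse-next",
    "swipe-right": "traverse-previous",
    "swipe-up": "traverse-up",
    "swipe-down": "traverse-down",
    "shake": "exit-or-undo",
}

def dispatch(gesture):
    """Translate a watch gesture into a display command; unknown gestures are no-ops."""
    return GESTURE_COMMANDS.get(gesture, "no-op")
```

Because the gestures are relative rather than spatially mapped, the same table works regardless of where on the watch face the gesture is performed.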

Fig. 7. Eyes-off interaction with large display using smartwatch

We developed three applications with this interaction strategy (Photo Browser, Slides Presentation and Planets Explorer) to demonstrate seamless multi-device interactivity using a smartwatch (input) and a large display (output) for a task (information sharing) in a particular setting (on a university campus). Ten participants were recruited and engaged with the system one by one, during which all interactions, comments and other particulars were video-recorded and analysed afterwards. Each participant tried all three applications one by one in a free, explorative manner. Most participants enjoyed the interaction, effortlessly tapping and swiping to browse the contents of each application. Although all participants had experience interacting with large displays, only two were familiar with smartwatches. Notably, those who had not used a smartwatch navigated skilfully after a few minutes of using the system. Participants 5 and 9, who were already experienced with smartwatches, praised the intuitiveness and responsiveness of the interaction technique. At least eight out of ten agreed on the ease of learning, ease of use and the usefulness of this eyes-off interaction, particularly for making presentations or for collaborative learning using large displays. Figure 8 shows the overall user experience ratings from this usability testing.

Fig. 8. User experience rating of multi-device interaction: smartwatch as input and large display as output

The connected use of a smartwatch (as input) and a public display (as output) implies a particular usage setting where the user has his/her own multi-touch device that can serve as input but wants the output shared on a separate, large screen. As the user looks around the environment to identify such a large screen, the Output Shift model (lower left in Fig. 4) is determined based on the device sociality mentioned in the previous section, and the pre-defined hop-to-select interaction strategy is then pushed to this combined interaction system.

Another potential situation that could employ this interaction style is when a user who has been browsing content directly by touching a large interactive display decides to step back in order to take in a full view of the display’s content. The change in the user’s spatial context triggers the recommendation of the user’s personal devices (e.g. a smartwatch) that participate in the same social circle as the large display. If the user agrees to use his/her own smartwatch as an input device to the large display, as suggested by SocioCon, the Input Shift model (upper left in Fig. 4) applies the hop-to-select interaction strategy to this connected interaction. In summary, depending on what each of the connected devices is and the details of their interaction modalities, the optimal interaction strategy needs to be identified and to replace the ones each device had been using at that moment.
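To make the distinction between the two shift models concrete, the selection can be sketched as a simple rule over device roles and device sociality. The following is an illustrative sketch only, not the SocioCon implementation; the `Device` record, the social-circle check and the returned model labels are our assumptions.

```python
# Hypothetical sketch of shift-model selection: given the device currently in
# use and a candidate device joining the session, decide which transition
# model applies. Names are illustrative, not the authors' implementation.
from dataclasses import dataclass, field

@dataclass
class Device:
    name: str
    roles: set = field(default_factory=set)   # subset of {"input", "output"}
    social_circle: str = "campus"             # devices in the same circle may pair

def select_shift_model(current: Device, candidate: Device) -> str:
    """Return the shift model to apply when `candidate` joins the session."""
    if current.social_circle != candidate.social_circle:
        return "none"                         # no social relationship: no transition
    if "output" in candidate.roles and "output" not in current.roles:
        return "output-shift"                 # e.g. smartwatch user adopts a large display
    if "input" in candidate.roles and "input" not in current.roles:
        return "input-shift"                  # e.g. large-display user adopts a smartwatch
    return "none"

watch = Device("smartwatch", {"input"})
wall = Device("large display", {"output"})
print(select_shift_model(watch, wall))        # → output-shift
print(select_shift_model(wall, watch))        # → input-shift
```

Once the model is chosen, a concrete interaction strategy (such as hop-to-select) would be pushed to the combined system, as described above.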

We also expanded our input modalities to tangible input devices in our experiment with the Visual Field Visualizer (VFV) system, which is based on the real needs of eye specialists identified through collaboration with a local hospital [37]. The VFV system comprises a smartphone, a VR headset and tangible input devices (Fig. 9). It measures a person’s visual field by presenting stimuli (lights) at different locations in the tested visual field region and recording the user’s responses to the presented stimuli in order to construct a map of the user’s visual field. During our on-site observation of how visual field tests are conducted at a local hospital, we saw some patients having difficulty pressing the response button connected to the machine. To eliminate external factors that impact users’ timely responses, multiple input options were considered before being narrowed down to three Bluetooth devices: a round-shaped button, a joystick and a mouse (Fig. 9c). This setting loosely implements Scenario 1, in which a user may want to switch to another input device due to a usability issue with the current input modality. Usability testing with the system involved 39 participants, each wearing the VR headset and responding to the visual stimuli with an externally connected input device. Situations calling for a switch of input device did occur: for example, after a participant expressed difficulty in pressing the small round-shaped Bluetooth button, we quickly passed her a Bluetooth mouse so she could resume her visual field test. The idea of providing multiple input choices was welcomed by the participants: some found it easier to click a mouse button than the round-shaped button, while others preferred the button or the joystick for their compact size. We received positive responses from participants on the system’s usability.

Fig. 9. VFV components: Android phone (a), VR case (b) and Bluetooth button/joystick/mouse (c)
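The mid-test device swap described above suggests that the test loop should depend only on an abstract response source, not on a specific Bluetooth peripheral. The sketch below illustrates this idea under our own assumptions; the `ResponseSource` wrapper and device names are hypothetical and do not reflect the VFV codebase.

```python
# Illustrative sketch: the visual field test polls whichever input device is
# currently attached, so the device can be hot-swapped mid-test (Scenario 1)
# without losing the responses already recorded.
from typing import Callable

class ResponseSource:
    """Wraps whichever Bluetooth device currently reports 'stimulus seen'."""
    def __init__(self, name: str, poll: Callable[[], bool]):
        self.name = name
        self.poll = poll

class VisualFieldTest:
    def __init__(self, source: ResponseSource):
        self.source = source
        self.responses = []

    def swap_input(self, new_source: ResponseSource) -> None:
        # A usability issue with the current device triggers the swap;
        # test state (recorded responses) is preserved across the transition.
        self.source = new_source

    def present_stimulus(self, location) -> None:
        seen = self.source.poll()
        self.responses.append((location, seen, self.source.name))

test = VisualFieldTest(ResponseSource("round button", lambda: True))
test.present_stimulus((0, 10))
test.swap_input(ResponseSource("mouse", lambda: True))  # patient struggles with the button
test.present_stimulus((10, 0))                          # test resumes with the mouse
```

Keeping the recorded responses outside the device wrapper is what lets the participant resume the test immediately after the swap.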

A VR headset typically requires an on-device input button or separate hand-held input device(s), both of which are dedicated to the brand or model of the VR headset. The VFV prototype is a manifestation of the Input Shift model (upper left in Fig. 4): a user might want to switch away from the system-provided input modality for a particular sub-task for which another input device is better optimised. Other trigger factors include usability issues with an input device or the user’s personal preferences. The transition to alternative input devices is possible provided a social relationship exists between them.

The Bluetooth button/joystick used in the VFV system can flexibly take the role of an input device as well as an output device (see the general guidelines in Sect. 4.1) with respect to the VR headset. One example is when separate haptic feedback is output to the connected input device (e.g. in the form of an armband or textile wearable) because vibration on the VR headset itself is uncomfortable. The Output Shift model determines the details of the output transition; auditory feedback, for instance, can be routed to the user’s earphones by the Output Shift-Relay model (lower right in Fig. 4). Using different forms of feedback to acknowledge the user’s positive response to seeing a stimulus would contribute to the overall usability of the VFV system in particular and of visual field test practices in general.
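The feedback routing just described can be summarised as a small dispatch rule: each feedback kind is sent to the most suitable connected device, falling back to the headset. This is a hedged sketch under our own assumptions; the function, device keys and fallback order are illustrative, not the paper’s implementation.

```python
# Illustrative sketch of Output Shift-style feedback routing: haptics go to a
# wearable, audio to earphones, with the VR headset as the fallback target.
def route_feedback(kind: str, devices: dict) -> str:
    """Pick the output device for a given feedback kind."""
    if kind == "haptic" and "wearable" in devices:
        return devices["wearable"]   # avoid uncomfortable vibration on the headset
    if kind == "audio" and "earphone" in devices:
        return devices["earphone"]   # Output Shift-Relay to the user's earphones
    return devices["headset"]        # default: the VR headset itself

connected = {"headset": "VR headset", "wearable": "armband", "earphone": "earbuds"}
print(route_feedback("haptic", connected))   # → armband
print(route_feedback("audio", connected))    # → earbuds
```

If a feedback-capable device leaves the social circle, removing its key from the dictionary makes the routing fall back to the headset automatically.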

5 Conclusion

As more and more interactive platforms emerge, each offering different interaction modalities and their associated benefits, an increasing burden is placed on end-users to make the right choices among devices and to combine them in ways not necessarily anticipated or expected by the developers of the individual devices. We expect that such situations will only increase in the coming years.

While a sizable number of studies today experiment with the technical feasibility of such multi-device connectivity and cross-device use, the usability side of the picture (i.e. how the user becomes aware of such possibilities in a given situation, how to switch between them, what the best interaction modalities and styles are, and how to minimise negative experiences during the transition as well as during the subsequent interaction) needs to be studied in its own right.

In this paper we outlined a possible way to frame this transition usability issue by suggesting three expected usage scenarios. We contribute general guidelines and the SocioCon conceptual model, which support how multi-device transitions and their usability implications should be addressed and studied. We are interested in the design issues that arise when considering multiple computing devices, each more capable in some input and output modalities than the others, which might be combined in many different ways. Transition usability emerges when a user switches between input/output modalities on the same device, or between input/output modalities provided by different devices within a session. With the growing interest in Internet of Things (IoT) devices and AR/VR applications, in which the interaction modalities and styles may not be natively fixed or pre-designed but most likely require some level of cross-input/output device connection, we believe that our study of transition usability is timely and important.