
1 Introduction

Immersive display spaces, such as CAVE Automated Virtual Environments [1] and head-mounted displays [2], provide advantages over standard desktop displays for viewing and exploring immersive visualizations [3]. Volumetric visualizations especially benefit from exploration in immersive environments, which help users better understand spatial relationships [4]. However, selection tasks in these environments face challenges that can hinder the human-visual process, such as dense data rendering, rendering ambiguity, limitations of the visual channel, and occlusion of data points [5]. Other issues arise when portions of the desired target are too entangled with non-target areas. Interaction solutions designed to solve these problems typically have low learnability and require additional instruction before use. We would like instead to harness the innate actions of reaching and grasping, as well as other physical properties of interacting objects. This is difficult to accomplish, but in this paper we present a novel selection technique that temporarily converts the rendering to a geometric-based rendering so that users can manipulate objects with innate physical actions and see physics-based responses, repurposed for selection. Direct manipulation has known advantages, and a plethora of well-designed manipulation techniques exist. Our novel idea is to change the rendering of the objects and thereby convert a selection task into a manipulation task. Once the objects are converted, users manipulate them to create the desired selection volume. When the user finishes, the rendering returns to its original form, positions, and other spatial relationships, but with the chosen volume identified as 'selected'. Moreover, our technique enables selection of disjoint portions of a volume (the selection need not be contiguous). In this paper, we present the details of our physically-based selection technique.
We present the results of a physical experiment whose observations were analyzed and used to inform the design of the technique. We also present the results of an evaluation of the feasibility and usability of incorporating physically-based interaction for selecting volumes of data. The goal of this work was to determine what strategies users employed with this technique and how it might be combined with other existing interaction techniques.

2 Related Work

2.1 Volume Tool Selection

Previous work has designed selection techniques that provide a predefined volumetric area, which the user manipulates to define the selected set of points [5,6,7,8,9]. A predefined shape may not conform perfectly to the data that needs to be selected, so users may have problems selecting occluded elements. Worm Selector [10] allows selecting complex shapes while providing precision. CAST [11] techniques use context-aware interactive selection to enable faster analysis of large datasets. Other work provides a way to progressively refine a selection from multiple objects to single sets [12, 13]. However, such techniques may not be suitable for dense data points in a volume visualization.

2.2 Bimanual Selection

Several existing techniques use bimanual selection [8, 9]. Volume Cracker [14], designed for bimanual interaction, allows users to break open a set of data in a physical way: both tracked hands can translate and rotate the parts of the volume. Our technique differs in that it permits users to change, fully manipulate, and reorganize the positions and other attributes of any individual datum or group of data. We also incorporate a change in rendering to ease the interaction of volume selection, treating it as a geometric object-selection task in a virtual environment. Furthermore, Volume Cracker does not return data to its original configuration as ours does; instead, it keeps track of context through a spine connecting the two portions [14].

2.3 Touch-Based Selection

Other techniques use 2D touch-based devices to interact with volume data [15,16,17]. A few techniques combine user input with aided selection. The multi-touch touchpad technique [16] employs two touchpads in an asymmetric bimanual arrangement, which allows a 3D region to be selected with only a single action. While these are excellent solutions for reducing fatigue, our intent is to investigate mid-air gestures in situations where 6-DOF direct manipulation is more advantageous for addressing challenges related to rendering visibility and occlusion.

2.4 Multiple Object Selection

PORT [18] allows selecting multiple objects using a set of actions to move and resize a target-defining volume. The Depth-Ray technique [19], which requires two operations to specify the target, uses ray-casting with added depth control to select occluded objects. Magic Wand [20] uses an automated procedure to select objects based on proximity to other objects but is sensitive to the geometric outline, which can make it difficult to use. The Balloon selection technique [15] uses the distance between a user's two hands to control the depth of selection from a 2D touch surface. The Flower-Ray technique [19] combines ray-casting with a marking menu to select multiple target objects concurrently. Our physically-based selection technique differs from these works in that we use mid-air gestures to manipulate the objects/data along with their attributes.

2.5 Semi-automated Selection

Two 3D selection techniques that use gestures for 2D volume selection are TeddySelection and CloudLasso [17]. In both, an initial selection is made in a 2D plane using a lasso tool; this shape is then extruded to form a 3D cylindrical shape, and the structure of the selected data is analyzed. These techniques solve some occlusion problems in an automated way, while our technique addresses the issue through direct manipulation. In the future we will compare performance among these techniques, but such comparisons are out of the scope of this paper.

3 Physically-Based Volumetric Selection Technique

In this section, we describe how the selection technique works from the user's perspective and on the back-end. The technique has three sequential stages: Render-Swap, Manipulate-to-Select, and Transitional Return. The technique is intended for use with multiple types of 6-DOF devices; however, we implemented it with the Leap Motion controller (Fig. 1) to use mid-air gestures and to evaluate the feasibility of the physical actions a user would perform in the real world.

Fig. 1. A user performs mid-air gestures and movement with objects using our physically-based selection technique.

3.1 Render Swap

This is the first of three phases. Volumetric data is typically rendered with billboard-based rendering or other similar cloud-like visualizations (Fig. 2). The challenge of this rendering is that it makes it difficult for users to perceive the specific borders of the data they would like to select. The first step toward better volumetric selection is to swap the rendering for a geometric rendering approach (Fig. 3). Users see data points, or clusters of data points, change into geometric objects; color encoding is retained. There is a great deal of flexibility in determining what the geometric objects or glyphs should be, how many data points form each cluster, whether clustering is driven by data properties, and any other set of properties. For the purposes of our evaluation, we chose spherical geometric objects because they most closely matched our physical experiment (experiment one) and innate physical interaction.
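The back-end of Render-Swap can be sketched roughly as follows. This is a minimal Python illustration of the idea only, not our Unity implementation; the grid-cell clustering rule and all function names here are hypothetical choices:

```python
import math

def cluster_points(points, cell=1.0):
    """Greedy grid-based clustering: points falling in the same grid cell
    form one glyph. Each point is (x, y, z, color)."""
    cells = {}
    for p in points:
        key = tuple(int(math.floor(c / cell)) for c in p[:3])
        cells.setdefault(key, []).append(p)
    return list(cells.values())

def render_swap(points, cell=1.0):
    """Replace billboard points with sphere glyphs. Color encoding is
    retained by averaging member colors, and members are remembered so
    Transitional Return can restore the original data points."""
    glyphs = []
    for members in cluster_points(points, cell):
        n = len(members)
        center = tuple(sum(p[i] for p in members) / n for i in range(3))
        color = tuple(sum(p[3][i] for p in members) / n for i in range(3))
        glyphs.append({"shape": "sphere", "position": center,
                       "color": color, "members": members})
    return glyphs
```

In practice the clustering rule and glyph shape would be user-configurable, as described above; the grid cell size here stands in for that choice.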

Fig. 2. Example of billboard rendering of data.

Fig. 3. Example of geometric rendering of data.

3.2 Manipulate-to-Select

Once the data is represented as geometric objects, the next phase of our selection technique is engaged. The idea is that those data points, or clusters of data points, are now easier to see and identify. We call this phase Manipulate-to-Select because users may translate, rotate, scale, or otherwise manipulate properties of the data without penalty to the actual data properties or spatial relationships. For our prototype implementation, we implemented translational change only, based on the results of experiment one (see the section Experiment One: Physical Observations). Additionally, physics, or a subset of physics behaviors, is applied. For our prototype and evaluation, we enabled physics forces for collisions between the user and the objects and among the objects themselves, but did not enable gravity. We did this to initially retain the spatial relationships of the elements so that users could better begin to identify what to select.
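The physics configuration described above, collision response enabled but gravity disabled, can be sketched as a minimal update loop. This is a Python illustration of the idea under simplifying assumptions (equal masses, fully elastic spheres); our implementation relies on Unity's physics engine, and the names here are hypothetical:

```python
import math

def physics_step(objects, dt, gravity_enabled=False):
    """One integration step. Gravity is skipped by default, so objects keep
    their initial spatial relationships until a user action or collision
    moves them. Each object is {"pos": (x,y,z), "vel": (x,y,z)}."""
    g = (0.0, -9.81, 0.0) if gravity_enabled else (0.0, 0.0, 0.0)
    for obj in objects:
        obj["vel"] = tuple(v + a * dt for v, a in zip(obj["vel"], g))
        obj["pos"] = tuple(p + v * dt for p, v in zip(obj["pos"], obj["vel"]))

def resolve_collisions(objects, radius=0.5):
    """Equal-mass elastic response: overlapping spheres exchange velocities,
    giving the physics-based push users feel when hands or objects collide."""
    for i in range(len(objects)):
        for j in range(i + 1, len(objects)):
            a, b = objects[i], objects[j]
            if math.dist(a["pos"], b["pos"]) < 2 * radius:
                a["vel"], b["vel"] = b["vel"], a["vel"]
```

With `gravity_enabled=False`, a stationary object stays exactly where the data placed it, which is the property we wanted for the start of this phase.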

In this phase, the rendered data is no longer locked in place. Users can move the objects around to organize the data and separate out the elements they wish to select. Users have a great deal of flexibility in how to organize the data: a user can collect data in a pile to indicate selection, separate out all the elements s/he does not want to select, separate out all the elements s/he does want to select, separate elements into different piles or groups for different levels of selection, and so on. We discuss these user strategies, discovered in the evaluation we conducted, in more detail in the User Strategies portion of the section Experiment Two: Evaluation of Technique in Virtual Space.

How the user manipulates the elements for selection could also be categorized as egocentric or exocentric. In egocentric Manipulation-to-Select, users gather the data relative to themselves. For example, a user will collect data close to the body or away from the body. In exocentric Manipulation-to-Select, users will gather elements based on spaces. For example, a user may designate a ‘selection space’ and then move all elements to that space or a user may identify a threshold or line through the space to push all elements to be selected across this threshold. In exocentric Manipulation-to-Select, there is an additional step for users to define the ‘selection space’ either by creating a volume of space through a geometric object, or by organically drawing an area or line in the space. However, once the user defines this area, that area is saved or retained in the workspace and is retrieved during the ‘Render-Swap’ phase.
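The egocentric/exocentric distinction can be made concrete with a small classifier over where the gathered objects end up. The following Python sketch is our own simplifying illustration; the distance threshold and the axis-aligned-box 'selection space' are hypothetical choices, not part of the implemented system:

```python
import math

def classify_gather(final_positions, body_pos, selection_space=None, near=0.6):
    """Classify a Manipulate-to-Select gather: egocentric if all objects end
    up near the user's body, exocentric if they all end up inside a
    user-defined selection space (an axis-aligned box (min_corner, max_corner))."""
    def near_body(p):
        return math.dist(p, body_pos) < near

    def in_space(p):
        lo, hi = selection_space
        return all(lo[i] <= p[i] <= hi[i] for i in range(3))

    if selection_space and all(in_space(p) for p in final_positions):
        return "exocentric"
    if all(near_body(p) for p in final_positions):
        return "egocentric"
    return "mixed"
```

A real classifier would need to handle partial gathers and free-form (drawn) selection spaces, but the two pure cases above correspond to the strategies described in this section.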

For the purposes of our evaluation, we implemented only egocentric Manipulate-to-Select; however, it was during the evaluation that we discovered users attempting the exocentric strategy of designating areas in which to place objects for selection. We will therefore implement support for exocentric Manipulate-to-Select for a future evaluation.

3.3 Transitional Return

Once users have identified the elements for selection through Manipulate-to-Select, they may initiate the Transitional Return phase. In this phase, the elements morph back to each element's original position, to individual data elements (if Render-Swap clustered data elements; see the Render-Swap section), and to the original rendering type. We call this phase transitional because it is important that the morph does not occur instantaneously. It is a continuous animated visual change over a short period of time that maintains context in both (a) where the data points came from and are moving to, and (b) the process the user carried out during Manipulate-to-Select. As a user moves elements around during Manipulate-to-Select, key-frames of the elements' changing positions are recorded over time. Once Transitional Return is initiated, those key-frames are loaded and played back in reverse at a much quicker pace. Users have the option to include their arm movements in the replay or not, and they may also adjust the replay speed.
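The key-frame mechanism described above can be sketched as follows (a minimal Python illustration; our implementation is in Unity, and the class name, speedup factor, and frame format here are hypothetical):

```python
class TransitionalReturn:
    """Records key-frames of object positions during Manipulate-to-Select
    and plays them back in reverse, and faster, to morph elements home."""

    def __init__(self, speedup=4):
        self.frames = []        # each frame: {object_id: (x, y, z)}
        self.speedup = speedup  # replay runs speedup x faster

    def record(self, positions):
        """Call once per key-frame interval during Manipulate-to-Select."""
        self.frames.append(dict(positions))

    def playback(self):
        """Yield frames in reverse, striding by the speedup factor; the last
        yielded frame is always the original (first recorded) configuration."""
        idx = list(range(len(self.frames) - 1, -1, -self.speedup))
        if idx and idx[-1] != 0:
            idx.append(0)  # always end at the original configuration
        for i in idx:
            yield self.frames[i]
```

Including or excluding the user's arm movements in the replay amounts to recording (or not) the hand positions in the same frames.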

4 Experiment One: Physical Observations

We conducted a qualitative experiment to determine how users physically interacted with objects when asked to select a target group of objects. We designed this experiment to understand user movement in 3-dimensional space and inform the design of our physically-based volumetric selection technique.

4.1 Experimental Design

Participants were seated at a table and presented with a volume of objects in various randomized configurations. Some objects, referred to as target objects, were colored differently from the others. We chose cotton balls as the objects because they are lightweight, can be easily manipulated, do not easily roll away, and mostly stay together when placed together. The task was for participants to determine the best way to identify, using physical actions, the objects they would like to 'select'. The configurations covered situations a user would encounter when interacting with volumetric data: occlusion, non-target data in between target data, various degrees of clustering, and a variety of sizes and shapes of the target volume. Participants were presented with a total of 25 trials. While configurations were assigned at random, all participants received a wide range of configurations, from no occluded target objects to all target objects occluded, and from all target objects clustered to none clustered. We collected observational data and subjective responses from questionnaires. In addition, participants' hands and wrists were tracked using CyberGloves (one per hand) and an OptiTrack optical rigid-body tracking system.

4.2 Results

A total of 10 participants (6 males, 4 females; mean age = 25.05, SD = 3.28) took part in the study. Handedness was not used as an exclusion criterion, but all participants were right-handed. Video recordings and hand/wrist tracking data were analyzed by three facilitators, who independently identified patterns in participants' actions and then met to determine which factors would inform the design of our selection technique. We found that participants used actions that separated out the target balls in some way, depending on the configuration of the target balls relative to the non-target balls. Since the target balls were made of cotton and could be easily manipulated, we hypothesized that participants might shape or squeeze the cotton to indicate selection. However, no participant changed or manipulated the size or shape of the cotton to signal selection; the only actions participants performed were in support of translation. The following patterns emerged from our observations:

Less Occluded Targets.

In this situation, participants predominantly used their hands to divide out and separate the non-target balls from the targets. The resulting selection configuration left the target balls in their current positions, with the non-targets spread away into new positions.

More Occluded Targets.

Participants would start with the larger groups of target balls and move other target balls closer to those groups. Any non-target balls in the way were separated out from the target balls; non-target balls or groups of non-target balls were removed from the participants' view and set aside.

Widely Spread Targets.

The smaller the groups of target balls, or the more single target balls there were, the more participants pulled the targets together into a common area rather than separating away non-target objects. In this case, participants instead changed the positions of the target balls, while most non-target balls remained in their original positions.

Clustered Targets.

In addition to the separation actions described for occlusion, a majority of participants (7 of 10) left clusters of targets at the location containing the majority of target balls, moving only single balls or smaller clusters toward those larger groups. Some participants (3 of 10) instead held the target balls in their hands at various times to signal that each group was to be 'selected'.

5 Experiment Two: Evaluation of Technique in Virtual Space

The goal of this experiment was to evaluate the usability of our volumetric selection technique. The technique and its three phases, 'Render-Swap', 'Manipulate-to-Select', and 'Transitional Return', are described in detail in the earlier section Physically-Based Volumetric Selection Technique. We implemented the technique with a Leap Motion controller (to allow free movement of either hand) in Unity, using collision-based physics for the hands and objects. Gravity forces were disabled for all objects to help retain spatial relationships during the 'Manipulate-to-Select' phase. See Figs. 1 and 4 for an example setup and how users interacted.

Fig. 4. Example of Manipulate-to-Select.

5.1 Experimental Design and Procedure

Participants wore a head-mounted display (Oculus Rift) with a Leap controller attached to the front of the display. Unity 3D was used to render the task environment as well as a natural-looking hand model (male and female models were used to match the gender of the participant). Participants' hands were tracked using the Leap controller. In addition, participants' heads, wrists, and elbows were tracked using a wide-area OptiTrack optical rigid-body tracking system.

In the virtual environment, participants were presented with a set of volumetric data in which some areas were colored differently from others, signaling those areas as the target for selection. Initially, the data was rendered with billboard-based rendering, appearing as blended clouds of color. When ready, the participant gave a verbal command to begin the Manipulate-to-Select task, at which point the data became geometric spheres. Although users can change these settings, for the purposes of the experiment we used a consistent data-point and sphere size. There were a total of 3 clusters of target balls, each containing between 5 and 15 balls. A range of configurations of occlusion and spread of these virtual objects was randomly assigned across the trials.

The task was for participants to use their hands to identify the volumes for selection. Participants then used the voice command 'Select' to signal that they had finished the Manipulate-to-Select task and that selection was complete; at that point, Transitional Return was initiated. Because this study drew from a broad general subject pool, it would not have made sense to evaluate retention of context in the Transitional Return phase. We plan to conduct that type of evaluation in the future with expert scientists who use these types of visualizations and who could provide a better gradient of performance for context retention. In this evaluation, our goal was to assess the broad utility and usability of the technique. In addition to the tracking that recorded participants' arm/hand movements, position changes of the objects in the environment were automatically logged, both for Transitional Return and for our analysis. We collected NASA TLX workload data to assess fatigue, subjective responses on a modified SUS usability questionnaire, and responses from an open-ended interview.

5.2 Results

Participants.

Data from 14 participants was collected (mean age = 26.36, SD = 7.43). All participants had normal or corrected-to-normal 20/20 vision. There were 5 females and 9 males. One participant was color-blind, but the distinction between target and non-target objects was not reported to be a problem. Seven of the 14 participants were moderately to very experienced with these types of volumetric visualizations; the remaining participants had little to no experience. All participants completed and passed a full range-of-arm-motion test, to rule out physical limitations that might influence how they moved, as well as the Butterfly Stereopsis Test, which determined the range of stereopsis they could perceive. All 14 participants were different from the participants in experiment one. We excluded experiment-one participants because we did not want any bias or influence from their experience with the physical objects to affect their experience of the virtual selection technique. In other words, we wanted participants' opinions to reflect the usability and utility of our selection technique rather than a comparison with an exact physical experience. In interaction design it is not always beneficial to replicate a real-world experience exactly, and we designed the technique with this in mind. The purpose of this experiment was to identify the strengths and weaknesses of our proposed technique in the context of volume selection. All but one participant were right-handed. We closely compared the left-handed participant's data with our analysis and conclusions and found no discrepancies related to hand dominance. In the future we will actively recruit participants across the range of handedness to determine what differences exist; analysis of handedness is out of the scope of this work.

Utility, Usability, Fatigue, and Overall Workload.

The mean completion time of each trial was 1.24 min (SD = 0.48), with 14 trials per participant, for a mean total experience time of 17.36 min. Participants answered questions on usability, including ease of use, usage satisfaction, own performance, comfort, utility, and fatigue, on a scale from 1 to 7, where one represents more difficulty, less satisfaction, poorer performance, less comfort, lower utility, and less fatigue, and seven represents greater ease of use, higher satisfaction, better performance, more comfort, higher utility, and more fatigue. Overall usability ratings were high: participants reported ease of use (M = 5.75, SD = 1.05), performance satisfaction (M = 5.5, SD = 1.34), comfort (M = 5.62, SD = 1.53), and utility (M = 5.86, SD = 0.86). We expected much higher reports of arm fatigue, but ratings were generally low (M = 1.07, SD = 1.59).

On the NASA TLX, participants reported moderate overall workload (M = 41.26, SD = 22.34), with effort and performance being more highly weighted contributing factors than mental and physical demand. Overall workload was lower than expected given the physical aspects of the technique.

Other positive themes are illustrated by the following comments (negative comments appear in the Limitations section):

  • “I feel I completed all of the tasks as directed but had intuition with the movements”

  • “I could perform and complete each task in a very comfortable way. Also, I could move comfortably throughout the room without any difficulty”

  • “The Head Mounted Display was showing me almost perfectly what my moves were. It was really impressive!”

User Strategies from Movement Data and Observations

Gather.

Participants often collected target balls close to themselves in an egocentric manner, generally gathering them within a short distance of their body before indicating selection.

Separate.

Participants would move non-target balls away from target balls. In some instances participants separated target balls from non-target balls, but the majority of those actions could be considered a gathering strategy. To count as a 'separating' strategy, participants had to move balls away from other balls without collecting them into one area.

Expunge Non-Targets.

The ball objects move with a speed proportional to the speed of the user's hand, so the user can control the force applied to the balls. One strategy users followed was to quickly and forcefully knock non-target balls out of the scene. On a few occasions this was unintentional (see the Limitations section); however, debriefing revealed that participants who did this often did so deliberately as a selection strategy. Participants did not do the reverse, as we found in experiment one: no target balls were sent away to indicate selection.

Pointing.

A number of participants would touch single objects with a finger, or point at them without touching, to identify objects for selection. This is a surprising result, since participants rarely did this in the physical experiment. Given this result, a ray-casting [21, 22] technique could be used to select stray individual objects after the Render-Swap phase; ray-casting has been shown to offer high accuracy and fast selection completion times.
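As an illustration of how such a fallback could work, here is a basic ray-sphere pick using standard ray-casting math, sketched in Python. The fingertip origin, unit-length direction, and uniform sphere radius are assumed inputs, not details of our system:

```python
import math

def ray_pick(origin, direction, spheres, radius=0.5):
    """Return the nearest sphere hit by a ray from the fingertip, or None.
    direction is assumed to be unit length; spheres are {"pos": (x,y,z)}.
    Solves |origin + t*direction - center|^2 = radius^2 for the smallest t >= 0."""
    best, best_t = None, float("inf")
    for s in spheres:
        oc = tuple(o - c for o, c in zip(origin, s["pos"]))
        b = sum(d * o for d, o in zip(direction, oc))
        c = sum(o * o for o in oc) - radius * radius
        disc = b * b - c
        if disc >= 0:
            t = -b - math.sqrt(disc)  # nearer of the two intersections
            if 0 <= t < best_t:
                best, best_t = s, t
    return best
```

Such a pick would let the pointing gesture select an isolated sphere glyph directly, complementing the manipulation-based strategies above.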

Lasso.

Participants would gesture circles around groups of targets to indicate selection. This may be carry-over from other computer-based applications.

Painting.

A small subset of participants used the palm of their hand to gesture strokes, typically below or above the balls, as if to 'paint' the objects they wished to select. This differs from the Lasso strategy because participants did not make complete or semi-complete loops.

Limitations.

In this section, we outline a few themes, each illustrated by an actual comment from a participant, to facilitate discussion of the limitations of the system.

“Performance wise, however, once a ball goes the wrong way, it is impossible to get it back.” We will address this limitation by implementing a snap-back feature along the translation trail of the ball. That way a user can not only 'undo' a translational change but also return an object to an intended location; for example, when the user wanted to move an object only to a particular spot but it traveled too far.
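One way the planned snap-back could be implemented is sketched below. Since this is future work, the Python here is purely illustrative, and the class name and `fraction` parameter are our hypothetical choices:

```python
class TranslationTrail:
    """Records each object's translation trail so a user can 'snap back' an
    object that traveled too far, either fully or partway along its path."""

    def __init__(self):
        self.trails = {}  # object_id -> list of recorded positions

    def record(self, obj_id, pos):
        """Call as the object moves, e.g. once per physics frame."""
        self.trails.setdefault(obj_id, []).append(pos)

    def snap_back(self, obj_id, fraction=1.0):
        """Return the position at the given fraction back along the trail:
        fraction=1.0 undoes the whole move, 0.5 returns halfway back."""
        trail = self.trails[obj_id]
        i = round((1.0 - fraction) * (len(trail) - 1))
        return trail[i]
```

Because the trail is the same position history that Transitional Return already records, the feature could reuse those key-frames rather than store a second copy.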

“Occasionally it was difficult to reach some of the balls that were further away or towards the edge of the usable area.” Other participants commented on unwanted body interaction as well. This limitation can be addressed by implementing the existing HOMER technique [21, 22], in which the hand can be cast out into space, extending the user's physical reach. We did not implement this for the study because we wanted to examine how physical actions directly function as a selection technique.

“It looks like the system had trouble responding fast or sudden gestures.” Other participants also reported that at times the hand models would disappear. This was a result of tracking performance (in particular, when the hands left the Leap system's field of view). We expect this technology to improve, and the results reported in this paper should only improve with better tracking performance.

6 Discussion

The main difference in strategies between the physical observations and the virtual evaluation is that participants in the virtual evaluation did not send target objects away from their body or away from non-target objects. In the physical experiment, participants did this when target objects were less clustered, as an easier way to pluck out the targets. We speculate this difference may result from the desk serving as a workspace 'frame' in the physical space. In the virtual experiment, if participants sent a ball away they could not retrieve it, which could be interpreted as a non-selection; this may be why participants did not use this strategy.

Additionally, the Painting strategy was an interesting and surprising result, as it differed from the physical experiment and we did not provide color-change feedback for selection of the objects. We will incorporate various tools that participants can use during this manipulation phase.

We found through debriefing participants that the mean task completion time was a reasonable amount of time for the task. In the future, we will compare our technique to other existing techniques. We still expect our technique to take more time; however, we also expect it to counteract difficulty areas where existing techniques may fail (occlusion, spread-out data, non-target data too close to target data, etc.).

The low arm fatigue and low workload may be due to the observation that participants used their arms in short bursts of action rather than continuously holding their arms up. Given that the task was timed but participants were not pressured to complete it as fast as possible, users might experience more fatigue under time pressure, with less time to rest their arms between actions.

7 Conclusion

In conclusion, we have presented a novel volumetric selection technique with three novel phases. The first converts the typical volumetric rendering into a geometric rendering, temporarily, for selection. Based on our evaluation, we discovered that once in this phase, a variety of selection techniques with proven high performance in virtual environments can be used to select these difficult-to-see volumes. The second is the Manipulate-to-Select phase; this idea is novel in that any manipulation technique could be repurposed for a selection task. Within the scope of this paper, we examined only translation-to-select and reported the strategies and limitations of the technique. The third is that the data converts back into its original rendering, positions, and other original properties, traveling in reverse along the path it took while the user manipulated the spatial relationships of the data.

This paper has presented the details of a novel selection technique: a solution to the challenges of volumetric data selection in immersive display spaces. We conducted a physical experiment to inform the design and then a virtual experiment to evaluate feasibility, utility, and usability. Overall, the technique produced high ratings for utility, usability, and satisfaction with performance. Surprisingly, it also produced low fatigue and overall workload. Our results include descriptions of strategies to incorporate and adjustments to make to our technique. These results will be useful for applications that employ alternative rendering as well as volume selection to support selection tasks in a physically-based manner. We envision this technique being augmented with other techniques so that users can employ gesture-based interaction when direct manipulation is more useful.

8 Future Work

We outlined several ideas to address the weaknesses and limitations of this technique in the discussion section and throughout the paper. We plan to implement those augmentations and then conduct an evaluation comparing the performance of our technique with other existing techniques. We will also investigate Transitional Return performance with respect to preserved context for expert users in selection scenarios.