
1 Introduction

Research on ways for humans to interact with electronic devices using their own bodies, the focus of Natural Interaction (NI) studies, has identified several possibilities of interaction: voice commands, haptic devices, olfaction, locomotion, gestures, and the detection and identification of human body parts such as the face, hand, thumb, and eye retina [1].

Human-Computer Interaction (HCI) studies the communication between people and computational systems. It is situated at the intersection of the Behavioral Sciences, Social Sciences, and Computational and Information Sciences, and involves all aspects of the interaction between users and systems [2]. Research in HCI motivated us to investigate how NI can be used as an alternative to the conventional keyboard-and-mouse setup, with the intention of making interaction simpler and more intuitive.

Technology is used daily by almost everyone, and software solutions are used to perform a great number of tasks. Simplicity therefore becomes necessary to ease interaction with systems, and it leads to an easier and more sustainable relationship with media and technology [3].

With the growing research on NI equipment, such as the Kinect (launched in 2010) and the Leap Motion (launched in 2013), accuracy in movement detection is improving, providing more possibilities for the use of such hardware. Older NI equipment, such as the Kinect, is now very affordable, which opens up another range of possibilities, for example its use in public schools.

Some studies focus on perceptual technology and software development while disregarding elements related to the determination of gestures, for example [4]. Despite their importance to the area, such studies do not seek to understand gestures in order to choose the most appropriate ones for the interfaces created, which can lead to tiresome, unintuitive, or non-functional gestures and can harm the performance of the application.

The process of associating a gesture with a particular action or function of a system is not trivial, because it must take into account a number of factors such as ergonomics, intuitiveness, and objectivity [5, 6], which highlights the importance of a Gesture Development Process (GDP). The process described below aims to help the development of gestures for user interfaces.

2 Describing the Gesture Development Process

The present research shows the construction of a process that can assist in deciding which gesture best represents a certain function (action) of a system. The main references used in developing this process can be seen in [7, 8]. The GDP produces artifacts during its execution; its final product is a template depicting the finalist (best) gestures for the given functions.

The GDP described here has three stages (Fig. 1). In the first stage, “Define Functions”, the problem domain is analyzed and the functions of the application to be developed are determined. This stage is not concerned with any kind of gesture, only with determining the functions to which gestures are to be assigned.

Fig. 1. Illustration of the process here presented

To better understand the objective of the first stage, observe Fig. 2. The image was taken from the game Fruit Ninja [9], in which players must “slice” fruits as they jump onto the screen; on the Xbox 360 platform, the Kinect device captures the user’s gestures to make the interaction with the game possible. In this context, the main function of the game is “slice”, but in the game’s menu other functions such as “select” and “cancel” can be observed. As an example, in the first stage the developer would define which of these functions should be triggered by gestures.

Fig. 2. Fruit Ninja

The second stage consists of three steps. The first step, “Apply Test Scenarios”, requires volunteers to perform tests that generate prototype gestures for the functions determined in the previous stage. One or more test scenarios are created that abstract away technical thinking and, at the same time, contextualize the functions so as to stimulate the volunteers to execute them. All interactions at this stage are recorded. The number of participants should be directly proportional to the diversity of the gestures obtained for each function; in other words, as the tests are applied, the more diverse the gestures for each function, the more volunteers are necessary.
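The paper does not prescribe an exact stopping rule for deciding when enough volunteers have been recruited; the following Python sketch is one possible, hedged interpretation. It assumes the register stores, per function, the list of gesture labels observed, and it requests another round while a round still introduces several unseen gestures or changes the most frequent gesture for some function (roughly the criterion applied informally in Sect. 3). The function name, arguments, and threshold are illustrative, not part of the original process.

    from collections import Counter

    def needs_more_volunteers(previous_rounds, new_round, max_new=1):
        # Heuristic stopping rule: request another round of volunteers while a
        # round still introduces several unseen gestures, or changes the most
        # frequent gesture for some function.  Both arguments map a function
        # name to the list of gesture labels observed for it.
        for function, gestures in new_round.items():
            seen = Counter(previous_rounds.get(function, []))
            combined = seen + Counter(gestures)
            unseen = {g for g in gestures if g not in seen}
            if len(unseen) > max_new:
                return True
            if seen and combined.most_common(1)[0][0] != seen.most_common(1)[0][0]:
                return True
        return False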

During the second step, “Analyze and Register Recordings”, an artifact is created containing the analysis and register of the gestures identified in the recordings. In this step a decision must be made: if the results are not considered sufficiently favorable, it is necessary to go back to the previous step, now offering the prototype gestures as suggestions to the volunteers, in order to refine the gesture development process. Otherwise, the process moves forward to the third step.

In the third step, “Define Vocabularies of Gestures”, the most favorable gestures are determined based on the previously produced artifact, taking into account the most frequent ones, the ones chosen by the designer, and those whose movement and posture do not compromise physical structures and joints. A template of the chosen gestures is then generated to be used in the next stages.
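As an illustration of the frequency criterion, the small Python sketch below picks the most frequent gesture per function from the register produced in the previous step. The data layout and the example labels are hypothetical; only the naming pattern of the labels (S for “Select”, R for “Rotate”, and so on) follows the paper.

    from collections import Counter

    def most_frequent_gestures(register):
        # Pick, for each function, the gesture most often produced by the
        # volunteers.  `register` maps a function name to the list of gesture
        # labels recorded for it (the artifact of the previous step).
        return {function: Counter(gestures).most_common(1)[0][0]
                for function, gestures in register.items()}

    # Hypothetical register; labels follow the paper's naming scheme.
    register = {"Select": ["S1", "S1", "S3"], "Rotate": ["R3", "R7", "R3"]}
    print(most_frequent_gestures(register))  # {'Select': 'S1', 'Rotate': 'R3'}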

It is worth mentioning that several gesture vocabularies can be determined. At the end of this second stage, two artifacts should have been produced: the register of the gestures identified in the recordings and the template of the chosen gestures.

The third and last stage has two steps. In “Test Vocabulary”, the process evaluates the N gesture vocabularies through three tests: the Attribution of Semantics, Memory, and Stress Tests, as described in [7]. Each test produces a score, and at the end it is possible to decide which vocabulary is the best by comparing the scores. In the case of only one vocabulary, the score is used to infer its quality.

In the following tests, the N vocabularies obtained are tested separately, and the same volunteers evaluate all vocabularies. It is important to vary the vocabulary with which each volunteer starts: in the case of two vocabularies, one volunteer first evaluates vocabulary A and then B, while another starts with B and later evaluates A, trying to keep the number of volunteers starting with A equal to the number starting with B.
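A minimal sketch of this counterbalancing in Python, assuming the evaluation orders are simply cycled over the volunteers; the function name and vocabulary labels are illustrative.

    from itertools import cycle, permutations

    def assign_orders(volunteers, vocabularies=("A", "B")):
        # Distribute evaluation orders evenly, so that as many volunteers
        # start with vocabulary A as with vocabulary B (and similarly for N > 2).
        orders = cycle(permutations(vocabularies))
        return {volunteer: next(orders) for volunteer in volunteers}

    print(assign_orders(["v1", "v2", "v3", "v4"]))
    # {'v1': ('A', 'B'), 'v2': ('B', 'A'), 'v3': ('A', 'B'), 'v4': ('B', 'A')}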

The Attribution of Semantics Test consists of presenting to the volunteer a template of a Gesture Vocabulary (GV, meaning a group or list of gestures) and a list of functions, without revealing the correspondence between gestures and functions. The volunteer is then asked to indicate which gesture corresponds to each function. The score is the sum of wrong guesses divided by the number of gestures.
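For clarity, a minimal Python rendering of this score, assuming the volunteer’s answers and the intended mapping are both stored as function-to-gesture dictionaries (an assumption about bookkeeping, not something the paper specifies):

    def semantics_score(answers, correct):
        # Attribution of Semantics score: wrong guesses divided by the number
        # of gestures (0.0 is a perfect score).  Both arguments map a function
        # name to a gesture label.
        wrong = sum(1 for f in correct if answers.get(f) != correct[f])
        return wrong / len(correct)

    # Hypothetical answers from one volunteer against the intended mapping.
    correct = {"Select": "S1", "Rotate": "R3", "Translate": "T2"}
    answers = {"Select": "S1", "Rotate": "T2", "Translate": "R3"}
    print(semantics_score(answers, correct))  # 2 wrong out of 3 -> 0.666...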

The Memory Test measures the user’s familiarity with the gestures. Only after users are acquainted with the gestures are they able to keep their focus on the task at hand rather than on how to operate the interface. First, a GV is shown to each volunteer; later, a slideshow with all the functions is presented, lingering 2 s on each function. The volunteer is asked to perform the gesture corresponding to each function until all functions are answered correctly; when the volunteer makes a mistake, the presentation is restarted and the vocabulary is reviewed. The score of the Memory Test is the number of restarts necessary, giving a measure of how difficult it is for a new user to become familiar with the gesture vocabulary.
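A sketch of this scoring rule in Python, assuming each slideshow run is recorded as a list of per-slide correctness flags (again an assumption about the bookkeeping, not part of the paper):

    def memory_score(attempts):
        # Memory Test score: number of slideshow restarts before the volunteer
        # performs every gesture correctly.  `attempts` is the sequence of
        # runs, each a list of booleans (True = correct gesture on that slide).
        restarts = 0
        for run in attempts:
            if all(run):
                return restarts          # completed without a mistake
            restarts += 1                # a mistake restarts the slideshow
        return restarts

    # Hypothetical volunteer: fails twice, then gets all five functions right.
    print(memory_score([[True, False], [True, True, False],
                        [True, True, True, True, True]]))  # 2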

The Stress Test shows the volunteers a sequential list of gestures. Each volunteer should repeat the sequence a determined number of times, enough for them to perceive any possible discomfort. At the end of the test, each volunteer is asked how tiring each gesture is, giving it a general classification: “no problem”, “slightly tiresome”, “annoying”, “painful”, or “impossible”.
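A small Python sketch for tallying these classifications per gesture; the data structure is an assumption, and the labels are the ones listed above.

    from collections import Counter

    STRESS_LEVELS = ["no problem", "slightly tiresome", "annoying",
                     "painful", "impossible"]

    def stress_summary(ratings):
        # Tally the classification each volunteer gave to one gesture after
        # the repetitions; `ratings` is a list of labels from STRESS_LEVELS.
        counts = Counter(ratings)
        return {level: counts.get(level, 0) for level in STRESS_LEVELS}

    print(stress_summary(["no problem", "no problem", "slightly tiresome"]))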

At the end of all tests, in “Show Vocabulary of Finalist Gestures”, the vocabularies can be compared through their scores and classifications, and a final template is then drawn for the gestures of the chosen vocabulary, to aid the implementation of these gestures and their respective functions in the application.
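One possible, hedged way to combine the quantitative scores when comparing vocabularies is sketched below in Python: vocabularies are ranked by their mean semantics and memory scores (lower is better for both), while the stress classifications are kept for separate, qualitative inspection. The paper does not define a formal aggregation, so this is only an illustration.

    def compare_vocabularies(results):
        # Rank vocabularies by their mean Attribution of Semantics and Memory
        # scores (lower is better for both); stress classifications remain
        # qualitative and are inspected separately.  `results` maps a
        # vocabulary name to (semantics_scores, memory_scores) lists.
        def mean(values):
            return sum(values) / len(values)
        ranked = sorted(results.items(),
                        key=lambda item: (mean(item[1][0]), mean(item[1][1])))
        return [name for name, _ in ranked]

    # Hypothetical scores for two vocabularies evaluated by three volunteers.
    results = {"GV1": ([0.0, 0.2, 0.2], [0, 1, 0]),
               "GV2": ([0.2, 0.4, 0.2], [1, 1, 2])}
    print(compare_vocabularies(results))  # ['GV1', 'GV2']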

3 Applying the Gesture Development Process

To demonstrate the application of the GDP described above, the idea was to develop gestures for common functions in 3D environments. The chosen functions were “Rotate”, “Scale”, and “Translate”, as these are the main two-dimensional and three-dimensional geometric transformations [10], and “Select” and “Stop”, because they are related to the beginning and end of interactions in general.

The first stage is to select which functions will receive gestures as triggers: in this case, “Rotate”, “Scale”, “Translate”, “Select”, and “Stop”.

In the “Apply Test Scenarios” step of the second stage, a case-scenario application was created to stimulate the volunteers in the production of gestures for the functions mentioned above (Fig. 3). The Cube application featured a colorful cube that could be selected, rotated, translated, scaled, and stopped (when stopped, no interactions could be performed with the cube), all through mouse and keyboard input. The idea was that the volunteers would observe these interactions and then be asked to perform gestures that, in their minds, would seem intuitive to trigger the aforementioned functions. This test was first carried out with 12 volunteers, all undergraduate Computer Science students, and all tests were recorded.

Fig. 3. Cube application

With the recordings from the first round of tests, it was time to proceed to the second step, “Analyze and Register Recordings”. Here the recordings were analyzed and the gestures captured were registered, as shown in Tables 1 and 2. The most popular gestures were S1, R3, T2, E1, and P2. The decision here was made in favor of another round of tests, in order to see whether many more new gestures would appear or whether 12 volunteers were enough.

Table 1. “Select”, “Rotate”, “Translate”
Table 2. “Scale”, “Stop”

The first step was then repeated with another 12 volunteers, also Computer Science students, only now the tests provided the gestures in Tables 1 and 2 as options at the end of each test, so that the volunteers could change their choice if they preferred one of the gestures presented.

Going forward to step 2, the register showed one new gesture for “Select”, two new gestures for “Rotate” and “Translate”, and three new gestures for “Scale” and “Stop”. The most popular gestures were still S1, R3, T2, E1, and P2. This was a good indicator that the first 12 volunteers had produced a good variety of gestures; nevertheless, it was decided to run a last round of tests with students from different areas, to see whether the results would be similar.

Back in the first step, another 12 volunteers repeated the last test, now with more gesture options at the end of each test. The volunteers in this third round of tests were students from several majors, including Electrical Engineering, Law, Public Administration, and Medicine. Going forward to step 2, the register showed two new gestures for “Select”, “Rotate”, and “Scale”, four new gestures for “Translate”, and no new gestures for “Stop”. The most popular gestures within this last group of volunteers were S1, R3, T2, E4, and P2, but across the total of 36 volunteers the most popular gestures were still S1, R3, T2, E1, and P2. Since, going from 12 to 36 volunteers, only one of the most popular gestures changed (E1 to E4), the results were considered satisfactory and it was decided to proceed to the next step, “Define Vocabularies of Gestures”.

Two gesture vocabularies were selected: one containing S1, R3, T2, E1, and P2, the most popular choices among the 36 volunteers, and another containing S3, R7 (Table 3), T1, E4, and P8 (Table 3), less popular choices considered worth comparing with the first vocabulary in the next stage. The templates produced in this third step can be seen in Tables 4 and 5.

Table 3. Gestures R7 and P8
Table 4. Gesture Vocabulary 1
Table 5. Gesture Vocabulary 2

In the third stage, the Attribution of Semantics, Memory, and Stress Tests were performed with Gesture Vocabularies 1 (Table 4) and 2 (Table 5). This stage had 19 volunteers, students from several majors: Civil Engineering, Medicine, Computer Science, Electrical Engineering, Law, among others.

In the Attribution of Semantics Test, the templates (Tables 4 and 5) and a list with the functions mentioned before were presented to each volunteer. The volunteers were then asked to indicate which gesture corresponds to each function. The results for each GV can be seen in Table 6. Bear in mind that the score is the sum of wrong guesses divided by the number of gestures, so a perfect score would be zero.

Table 6. Results from attribution of semantics test

In the Memory Test, each volunteer was shown GVs 1 and 2 and a slideshow with one slide for the name of each function (“Select”, “Rotate”, “Translate”, “Scale”, and “Stop”); the slideshow lingered 2 s on each slide. When reading the name of the function on a slide, the volunteers were asked to perform the corresponding gesture for that function; if they performed a wrong gesture, the presentation was restarted.

The score of the Memory Test was calculated as the number of restarts necessary until the volunteer got all gestures right. This was carried out twice for each volunteer, since there were two GVs. The results can be seen in Table 7. The best possible result is zero.

Table 7. Results from memory test

The Stress Test showed the volunteers a sequential list of the gestures from each GV. Each volunteer was then asked to repeat the sequence 50 times, for each gesture in each GV. At the end of the test, each volunteer was asked how tiring each gesture was, giving a general classification for each of the gestures in both GVs: “no problem”, “slightly tiresome”, “annoying”, “painful”, or “impossible”. The results can be seen in Tables 8 and 9.

Table 8. Results from stress test - GV 1
Table 9. Results from stress test - GV 2

In the end, Gesture Vocabulary 1 was slightly better than GV 2. As this work is still in progress, the next step will be to implement these gestures in the Cube application (Fig. 3), using the Leap Motion device and with the aid of the final template (Table 4).

4 Final Considerations

The interaction style supported by Natural Interaction devices, such as Leap Motion and Kinect, has a wide variety of potential applications, well beyond those described in this paper. When the interaction style is based on gestures, gesture design and development is among the issues to be addressed.

This paper presented a strategy to guide designers through the process of creating gesture-based user interfaces. Nowadays, gestures are a realistic solution for user interfaces, and devices like the Leap Motion and Kinect make it easier to develop this kind of user interface. The challenge is how to design them, and that is the main goal of this paper.

To that end, the GDP depicted here provides a guide with all the steps needed to identify and specify gestures, together with verification techniques. A proof of concept of the proposed strategy was carried out using the Cube application, and a prototype using the Leap Motion device was implemented.

So, we turn back to our initial question: How to Design a User Interface Based on Gestures?

We still think in terms of workstations, primarily based on computer devices. But things are changing, and elements of the physical world are becoming the devices through which we access the computer’s functionalities. This paper approached our first big step in this direction: gesture design. So, we emphasize the point of view of Pierre Wellner: “Instead of making us work in the computer’s world, let us make it work in our world.” [11]