Experiencing real 3D gestural interaction with mobile devices
Highlights
► We present a novel approach for 3D gestural interaction with mobile devices.
► 3D manipulation of virtual objects on the screen is not limited the way 2D interaction is.
► The algorithm is computationally efficient and runs in real time.
Introduction
Currently, gesture detection, recognition, and tracking are terms frequently encountered in discussions of human–computer interaction. Gesture recognition enables humans to interact with computers and can make input devices such as keyboards, joysticks, or touch-screen panels redundant. The desire for more effective interaction with mobile devices is arguably the most significant reason behind the trend toward larger screens in recent years. Although larger touch-screen displays help users interact with the device, their limitations for manipulation in 2D space remain an unsolved issue. A novel solution to the limitations of 2D touch-screen displays is to take advantage of the 3D space behind the camera. Manipulation in the camera’s field of view allows users to work with any mobile device regardless of screen size or touch sensitivity. As shown in Fig. 1, the user’s hand at greater distances from the camera occupies a smaller area of the screen, which compensates for the limited area available to fingers on 2D displays. Moreover, behind the camera, users are able to move in depth, which resolves many difficulties in various applications. Our experiments on mobile applications reveal that in 2D interaction on the screen, users are limited when moving in depth, zooming in to observe the details of an image, map, or text, or zooming out to skim through data. In more complicated applications, rotations around different axes are also unavoidable.
The question, then, is whether these limitations can be solved by introducing a new interaction environment. Our experiments show that in 3D interaction under the mobile phone’s camera, fingertips occupy approximately 10–20% of the area of the touch-screen display. This observation suggests that the interaction resolution in 3D space is at least 5–10 times higher than on 2D displays. Moreover, given the higher number of degrees of freedom in 3D space, limitations on rotation can be overcome. This new interaction environment can therefore significantly improve the efficiency and effectiveness of human–mobile-device interaction. Our gesture recognition method is based on detecting rotational symmetries in the video input from the camera. The method finds patterns in the local orientation of the image. Rotational symmetries can represent different groups of basic low-level features such as lines, curvatures, and circular and star patterns. Our operator searches for particular features in local image patches and detects the patterns associated with the user’s gesture. Tracking the detected gesture enables users to interact with the mobile phone in 3D space and to manipulate objects in various applications. A reliable gesture detection algorithm makes it possible to extract the gesture motion: by finding corresponding keypoints in consecutive frames, the relative rotation and translation between two image frames can be computed. Our method is based on very low-level operations with a minimal level of intelligence, which makes the detection and tracking stages more reliable and efficient. To convey the illusion of depth to viewers, stereoscopic techniques are used for visualization. Our system adjusts the output stereo channels and renders them as different types of color anaglyphs. The 3D output can be displayed on the mobile device, and users simply need low-cost anaglyph eyeglasses to view it in 3D.
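The relative motion between consecutive frames described above can be sketched as a least-squares fit of a rotation and translation to matched keypoints. The sketch below is a minimal 2D illustration using the standard SVD-based (Kabsch) alignment, not the paper’s actual implementation; the function name and point sets are hypothetical.

```python
import numpy as np

def relative_pose_2d(p, q):
    """Estimate rotation R and translation t such that q ≈ R @ p + t,
    given matched keypoints p, q (N x 2) from two consecutive frames."""
    cp, cq = p.mean(axis=0), q.mean(axis=0)
    H = (p - cp).T @ (q - cq)                  # 2x2 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # guard against reflections
    R = Vt.T @ np.diag([1.0, d]) @ U.T
    t = cq - R @ cp
    return R, t
```

With noise-free correspondences the true rotation angle and translation are recovered exactly; with real keypoint matches the same closed form gives the least-squares estimate.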
Section snippets
Related work
Designing a robust gesture detection system using a single camera, independent of lighting conditions or camera quality, is still a challenging problem in computer vision. A common approach to gesture detection is marker-based. Most augmented reality applications rely on marked gloves for accurate and reliable fingertip tracking (Dorfmueller-Ulhaas and Schmalstieg, 2001; Maggioni, 1993). However, in marker-based methods users have to wear special inconvenient
System description
In this section, the proposed 3D camera-based interaction approach is presented. As the user moves his/her hand in the camera’s field of view behind the mobile device, the device captures a sequence of images. This input is then processed by the gesture detection block, which detects and localizes the user’s gesture. Afterwards, stable features in each frame are extracted to compute the relative rotation and translation of the hand gesture between two frames. Finally, this
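The final visualization stage mentioned in the introduction renders the stereo channels as a color anaglyph. A minimal sketch of the common red-cyan variant is shown below; it is an illustration of the general technique, not the paper’s rendering code, and the function name and inputs are hypothetical.

```python
import numpy as np

def red_cyan_anaglyph(left, right):
    """Compose a red-cyan anaglyph from two RGB views (H x W x 3 uint8):
    the red channel comes from the left eye's view, and the green and
    blue channels come from the right eye's view."""
    out = np.empty_like(left)
    out[..., 0] = left[..., 0]       # red from left view
    out[..., 1:] = right[..., 1:]    # green/blue from right view
    return out
```

Viewed through red-cyan eyeglasses, each eye receives only its own channel, producing the depth illusion on an ordinary screen.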
Experimental results
For a particular gesture behind the mobile device’s camera, users are free to move within a reasonable range of distances and, depending on the application, to rotate through different angles. Our observations indicate that effective interaction happens in the space between 15 and 25 cm from the camera. Interaction beyond 25 cm does not seem convenient for users, and at distances below 15 cm the gesture occupies a large area in the screen and
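The 15–25 cm effective range above can be related to the apparent size of the hand via the pinhole camera model, distance ≈ focal_length × real_size / apparent_size. The sketch below illustrates this relation; the assumed hand width (8 cm) and focal length in pixels (600 px) are hypothetical values, not measurements from the paper.

```python
def hand_distance_cm(apparent_width_px, real_width_cm=8.0, focal_px=600.0):
    """Pinhole-model distance estimate from the hand's apparent width.
    real_width_cm and focal_px are assumed example values."""
    return focal_px * real_width_cm / apparent_width_px

def in_effective_range(distance_cm, near=15.0, far=25.0):
    """Check whether the hand lies in the 15-25 cm band reported
    as convenient for interaction."""
    return near <= distance_cm <= far
```

Under these assumptions, a hand spanning 240 px is estimated at 20 cm (inside the band), while one spanning 600 px is at 8 cm and would be rejected as too close.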
Conclusion
In this paper we presented a novel approach for 3D camera-based gesture interaction with mobile devices. Rotation, translation, and manipulation of virtual objects on the screen are no longer limited as they are in 2D interaction. Our detection algorithm estimates the position of the user’s gesture in consecutive frames. The relative pose estimation method accurately extracts the relative rotation and translation of the gesture between two consecutive frames, which can be used to facilitate the
References (42)
- et al., Vision-based hand pose estimation: A review, Computer Vision and Image Understanding (2007)
- Agrawal, S., Constandache, I., Gaonkar, S., Choudhury, R.R., Caves, K., DeRuyter, F., 2011. Using mobile phones to...
- et al., Accelerometer-based hand gesture recognition using artificial neural networks, Soft Computing for Intelligent Control and Mobile Robotics (2011)
- et al., Markerless visual fingertip detection for natural mobile device interaction, Mobile HCI (2011)
- et al., SURF: Speeded up robust features, ECCV (2006)
- et al., A new method of finger tracking applied to the Magic Board (2004)
- Bretzner, L., Laptev, I., Lindeberg, T., 2002. Hand gesture recognition using multi-scale colour features, hierarchical...
- et al., A color-based face tracking algorithm for enhancing interaction with mobile devices, International Journal of Computer Graphics, Special Issue on Cyberworlds (2010)
- Choe, B.W., Min, J.K., Cho, S.B., 2010. Online gesture recognition for user interface on accelerometer built-in mobile...
- et al., Enabling a gesture-based numeric input on mobile phones, IEEE International Conference on Consumer Electronics (ICCE) (2011)
- Multiple View Geometry
Cited by (13)
- Augmented reality and indoor positioning based mobile production monitoring system to support workers with human-in-the-loop. 2024, Robotics and Computer-Integrated Manufacturing
- AR Peephole Interface: Extending the workspace of a mobile device using real-space information. 2021, Pervasive and Mobile Computing. Citation excerpt: “User interfaces that allow users to directly manipulate displayed virtual objects by hand have been proposed. Hurst et al. proposed a method that allows users to manipulate virtual objects in 3D space, such as by scaling, rotation, and transformation, by moving their fingers, to which markers were attached, in the space behind the mobile device [18]. However, they mentioned that performing operations in the air was not stable and was less accurate than the conventional touchscreen-based method.”
- SlidAR+: Gravity-aware 3D object manipulation for handheld augmented reality. 2021, Computers and Graphics (Pergamon). Citation excerpt: “Gesture-based manipulation methods use the device’s camera to detect and track hand gestures performed in front of the camera to manipulate 6 DoFs of the virtual object. Users can perform a variety of gestures, such as pushing, grabbing, or twisting [11,43–46] to manipulate the virtual object. Alternatively, these gestures can also be performed by other devices, like a pen [47].”
- Direct hand pose estimation for immersive gestural interaction. 2015, Pattern Recognition Letters. Citation excerpt: “In this experiment, grab gesture is used for object manipulation. The grab gesture is one of the most natural and frequently used gestures to manipulate objects in 3D space, and has been used to perform gesture-based interaction for mobile applications [15,33]. As shown in Fig. 12, system accurately estimates the user hand motion, and updates the model accordingly.”
- Intelligent learning systems design for self-defense education. 2017, Proceedings, 3rd IEEE International Conference on Big Data Computing Service and Applications, BigDataService 2017