1 Introduction

Augmented reality (AR) systems enhance, in real time, a real scene captured by a camera with virtual objects. The first augmented reality applications were based on marker tracking and consisted only of superimposing basic augmentations on the real scene, such as text, images, or simple 3D objects.

Later, more advanced technologies were developed using markerless tracking on 2D images or 3D objects. These technologies are usually based on computer vision or depth cameras, but they can also be combined with other devices such as accelerometers, GPS, and compasses, and deployed on mobile devices. Nowadays, augmented reality is useful in many applications: military, industrial, and medical, but also gaming, commercial, and entertainment.

The key measure of an AR system is how realistically it integrates augmentations with the real world. In the simplest systems, the augmentations are superimposed on the image of the real environment and always appear in front of the real objects. In a more realistic approach, the virtual and real objects coexist, and real objects can also be placed in front of virtual ones. In this paper, we introduce a solution to manage occlusion in augmented reality systems.

Our project proposes a solution tested in two specific cases:

  • An application simulating the machining of a virtual piece on a real machine-tool.

  • An application creating a “magical mirror” in which patients can see the reflection of their face wearing a virtual dental prosthesis.

We focus our discussion on the first case because the methodology and the solution tested are the same in both cases. For the second case, only the differences and the results are presented.

2 Background and Related Work

Realistic merging of virtual and real requires that the virtual objects behave in a physical manner in the created environment: they can be occluded or shadowed by the real objects around them. In theory, this can be achieved by comparing the depth of each object in the scene and placing the objects accordingly; in practice, different teams have proposed different solutions. The underlying per-pixel principle is sketched below.
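This minimal C++ sketch composites one frame with a per-pixel depth test. It illustrates the general idea only, not the method of any cited work: the buffer names and layout are ours, and it assumes a per-pixel depth map of the real scene is available (for instance from a depth camera).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Rgb { uint8_t r, g, b; };

// Per-pixel depth comparison between the real scene and the rendered
// virtual scene: the closer surface wins, so a real object standing in
// front of a virtual one correctly occludes it.
void composite(const std::vector<float>& realDepth,  // metres, per pixel
               const std::vector<float>& virtDepth,  // metres, per pixel
               const std::vector<Rgb>&   realColor,  // camera image
               const std::vector<Rgb>&   virtColor,  // rendered augmentation
               std::vector<Rgb>&         out)
{
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i] = (realDepth[i] < virtDepth[i]) ? realColor[i] : virtColor[i];
}
```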

To the best of our knowledge, Berger [1] published, in 1997, the first algorithm to address this problem, tracking the objects that can cause occlusion in the 3D scene using image analysis. Since then, several improvements have been proposed: Tian et al. [2] in 2010 and Sanches et al. [3] in 2012 used the same method, while Lepetit et al. [4] used stereovision-based approaches to improve the 3D tracking.

With a similar concept, Kamat et al. [5] used “Time of Flight” (TOF) cameras, achieving better results in terms of speed and robustness.

A team from the University of Tübingen (Fischer et al. [6]) proposed an approach based on prior knowledge of the shape and position of the objects that may cause occlusion. Although limited to a specific application, their method was capable of generating augmented images in which virtual objects appeared correctly overlapped with or occluded by the patient's anatomy.

In 2012, Dong et al. [7] combined a depth-sensing TOF camera with frame buffer algorithms. The depth information was processed in parallel using the OpenGL Shading Language (GLSL) and render-to-texture (RTT) techniques.

A more recent work by Leal-Meléndrez et al. [8], from 2013, presented a strategy based on the Kinect sensor. The distances between real and virtual objects are calculated, and the occluded parts of the virtual object are removed from the scene.

The solution proposed in our project is a combination of the last two approaches: prior knowledge of the environment combined with frame buffer algorithms ([6, 7]). This solution is tested in two specific and complementary scenarios:

  • A known environment with rigid and moving objects, where the position of the camera does not change.

  • An unknown environment with deformable and moving objects, where the position of the camera changes.

3 First Use Case

3.1 Scenario

In the first scenario, we chose to test our algorithm in a known environment, the inside of a machine tool, where the objects (the piece and the tools) are rigid and the position of the camera does not change.

For this scenario, we simulate the machining of a virtual piece on a real machine tool in real time. The machine is a single-spindle automatic turning machine with CNC, an EvoDeco 16a, produced by Tornos SA. The EvoDeco is a high-end machine, designed to create small pieces with a diameter of up to 16 mm (see Fig. 1). The machine has a Fanuc CNC system and an integrated computer. The CNC controls all the machine functionalities: the movements of the working piece and the displacement of the tools on each tool system.

Fig. 1. EvoDeco 16a

First of all, for a better understanding of the project, a brief introduction to how the machine works is required. The machine has four independent tool systems (1 → yellow, 2 → orange, 3 → blue, 4 → green and brown, as depicted in Fig. 2), consisting of 10 translation and 6 rotation axes. Each of these systems can hold several tools and can have two or three degrees of freedom. Different tools can be used to shape the raw material into the desired form. Within our project, only the first system (with 2 degrees of freedom) is used. The red part in Fig. 2 is the spindle, which is used to hold the raw material bar. The spindle has 2 axes: one for rotation and one for translation. This means the bar is continuously rotating and moving along a horizontal line (left-right in Fig. 2) while the active tool executes the cutting.

Fig. 2. EvoDeco tools system

In a normal workflow:

  • The operator of the machine designs the piece.

  • He/she chooses the appropriate raw material bar and inserts it into the spindle.

  • He/she chooses the appropriate tools to cut the bar and fixes them onto the right support.

  • He/she programs the machine, providing a sequence of movement instructions to follow on the different axes.

The innovative part of this project is the use of AR for machining the piece. The machine is a closed and secured environment. Using AR, users can have an overview of what is happening inside the machine and how the raw bar is shaped in real time. The AR application gives a more immersive perception of the whole experiment, and it can be used as a sales or control tool.

Even if the piece is machined in a virtual environment, the normal workflow of the machine must be followed, and therefore the piece must be designed first. We chose to create a simple piece of 30 mm length and 12 mm diameter, made of brass. It is composed of three cylinders of the same length but with different diameters (see Fig. 3).

Fig. 3. Example of a final piece

The needed tools are a cutting tool for the initial and final cuts and a 90° burin for the rest of the form. A file containing a sequence of movements must also be provided to the machine. The only difference from a real machining is that the machine works without any material bar, in a so-called matterless state (Fig. 4).

Fig. 4. Working environment

As any normal AR system requires a camera, an IDS uEye 5250 is placed inside the machine, oriented towards the spindle and the main tools used for machining the piece. The position of the camera was carefully chosen to avoid disturbing the normal operation of the machine. Finally, we attached the camera to the machine wall with a magnetic support. The nearby machine lamp lit up the machine interior.

The simulation runs on a computer connected to the machine. The rendering can be seen on a screen near the machine (see Fig. 4).

3.2 Methods

Our technical approach to the occlusion problem is described below.

3D Models and Animations.

The real piece that we simulate in AR changes its shape continuously according to the machine program. The same behavior must be reproduced in the virtual environment. The solution adopted is to create a 3D model of a cylinder of 30 mm length and 12 mm diameter, which represents the initial bar, and to apply a deformation to this model to simulate the real machining. To achieve a good quality/performance trade-off, the initial cylinder is modeled as a stack of many thin cylinder slices (more precisely, 30 slices). Knowing the exact movements of the machine from the machine program, we pre-calculate the deformation before the visualization is rendered in the scene. The deformation consists of changing the diameter of each thin cylinder that collides with the active tool, as illustrated in the sketch below.
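A minimal C++ sketch of this pre-calculation is given below, assuming the machine program has already been parsed into a list of tool positions. The names, units, and the one-slice-per-millimetre mapping are illustrative assumptions; for brevity, the sketch replays the whole program to obtain the final shape, whereas the actual pre-calculation also stores the intermediate states needed to animate the deformation.

```cpp
#include <algorithm>
#include <vector>

constexpr int    kSlices     = 30;   // one 1 mm slice per mm of bar (assumed)
constexpr double kInitRadius = 6.0;  // 12 mm diameter bar

struct ToolStep {                    // one movement instruction
    double z;                        // tool position along the bar (mm)
    double r;                        // tool tip distance from the axis (mm)
};

// Cutting can only shrink a slice, so each slice keeps its smallest radius.
void applyStep(std::vector<double>& sliceRadius, const ToolStep& s)
{
    int idx = static_cast<int>(s.z); // slice currently touched by the tool
    if (idx >= 0 && idx < kSlices)
        sliceRadius[idx] = std::min(sliceRadius[idx], s.r);
}

std::vector<double> precomputeShape(const std::vector<ToolStep>& program)
{
    std::vector<double> radius(kSlices, kInitRadius);
    for (const ToolStep& s : program)
        applyStep(radius, s);        // replay the whole machine program
    return radius;
}
```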

However, the piece is not the only 3D model used in the simulation. In order to implement our occlusion solution, we also created 3D models for each object interacting with the piece on the real machine: in our case, the two tools used on the real machine to cut the piece and the spindle that holds the metallic bar. Because those parts are more complex than a simple cylinder, a modeling tool (3DS Max) was used and the result imported into the simulation program. An animation sequence is created from the same machine program in order to apply the movements of the real tools to their virtual representations.

Occlusion Management.

The most important part of this project is the management of occlusion. The proposed solution is based on prior knowledge of the sizes and positions of the objects that will partially hide the augmentations. Having this information, we create geometrical representations of those objects and add them to the virtual world. The next step is to use frame buffer algorithms to replace the pixels of those objects with the pixels of the same objects as rendered in the video image. The resulting image is rendered to the screen.

A 3D model of each of these objects is created and added to the virtual world. Initially, the spindle is not moving but the piece is; thus, at the beginning, only a small part of the piece is visible, while at the end the whole piece is exposed. As for the tools, they are continuously moving to shape the material into the final piece. In some frames they can be in front of the piece, hiding it either partially or totally. In the virtual world, those cases are managed by the 3D engine (OpenSceneGraph) by using occlusion culling algorithms that disable the rendering of objects when they are not currently seen by the camera. Using only the objects that generate occlusions, we create a mask at each frame. The white pixels in the mask (Fig. 5, right) represent the positions of these objects captured for one frame. Using this mask, we replace those pixels with the pixels found at the same location in the video image, as sketched below. The resulting image is shown on the screen, creating the visual effect of the real tool cutting a virtual piece.
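The following sketch illustrates the replacement step, assuming the mask, the video frame, and the rendered frame are available as CPU-side buffers of the same size; the names and the 255-means-occluder convention are ours.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Rgb { uint8_t r, g, b; };

// Wherever the mask is white (an occluding real object was rendered there),
// the pixel of the rendered frame is overwritten with the live video pixel.
void applyOcclusionMask(const std::vector<uint8_t>& mask,  // 255 = occluder
                        const std::vector<Rgb>&     video, // camera frame
                        std::vector<Rgb>&           frame) // rendered frame
{
    for (std::size_t i = 0; i < frame.size(); ++i)
        if (mask[i] == 255)
            frame[i] = video[i];
}
```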

Fig. 5. Scene without occlusion algorithm (left) and occlusion mask (right)

The Communication Between the Simulation and the Machine.

The simulation is performed in real time: on the screen, the user sees the machining of the virtual piece by the real tools. To achieve this, the two worlds must be synchronized. The machine is driven by the machine program; the same file is parsed and used to animate the virtual representations of the machine parts, and the two must move together. An IP connection was established to ensure the communication between the machine and the simulation. This connection allows us to periodically query the current status of the machine. According to the state of the machine, the pre-calculated piece animation is updated (paused or played), along the lines of the sketch below.
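A minimal sketch of this synchronization loop is shown below. The status values, helper functions, and polling period are hypothetical placeholders, since the actual protocol between the simulation and the CNC is not detailed here.

```cpp
#include <chrono>
#include <thread>

enum class MachineStatus { Running, Stopped };

// Placeholder stubs standing in for the real IP communication and the
// animation controls of the simulation.
MachineStatus queryMachineStatus() { return MachineStatus::Running; }
void playPieceAnimation()  {}
void pausePieceAnimation() {}

void synchronizationLoop()
{
    using namespace std::chrono_literals;
    for (;;) {
        // Keep the pre-calculated piece animation in step with the machine.
        if (queryMachineStatus() == MachineStatus::Running)
            playPieceAnimation();
        else
            pausePieceAnimation();
        std::this_thread::sleep_for(100ms); // polling period is an assumption
    }
}
```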

4 Second Use Case

4.1 Scenario

In the second scenario, a more challenging environment was tested. This simulation shows the user an augmented image of himself or herself wearing a new dental prosthesis. Here, the camera is moving and the objects are deformable. We chose to limit the final demo to a full dental prosthesis with the same color for every user. This case is much more complex from a technical point of view, because a tracking system must be developed in order to place the prosthesis at the right place in the user's image. A Kinect depth camera replaced the simple IDS camera. Using the Kinect allowed us to track the position of the user, and in particular of his or her face, relative to the camera.

4.2 Methods

In this case, a virtual prosthesis was modeled using the same 3D modeling tool, 3DS Max. For greater freedom of movement, two models were used: one for the upper jaw and another for the lower jaw.

The really challenging part of this simulation was to place the prosthesis at the right position in the user's image. The Kinect face tracking is used to determine the user's position relative to the camera. The face tracking outputs the positions of 100 points on the tracked face. The points closest to the zone of interest are the tip of the nose and the chin. Those points are used as reference points to approximate the exact position of the prosthesis at each frame. An animation sequence translates the two models on the vertical axis according to the mouth opening. The distance between the two reference points is computed at each frame and compared with the default distance, which corresponds to the “mouth closed” state; a sketch of this computation is given below.
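The sketch below shows how such a computation might look, given the tracked 3D positions of the nose tip and the chin. The function names and the direct use of the opening distance as the lower-jaw translation are illustrative assumptions.

```cpp
#include <cmath>

struct Point3 { float x, y, z; };

float distance(const Point3& a, const Point3& b)
{
    return std::sqrt((a.x - b.x) * (a.x - b.x) +
                     (a.y - b.y) * (a.y - b.y) +
                     (a.z - b.z) * (a.z - b.z));
}

// Estimate the mouth opening from the two reference points; closedDist is
// the calibrated nose-chin distance in the "mouth closed" state.
float lowerJawOffset(const Point3& noseTip, const Point3& chin,
                     float closedDist)
{
    float opening = distance(noseTip, chin) - closedDist;
    if (opening < 0.0f) opening = 0.0f; // guard against tracking jitter
    return opening;                     // vertical translation of the lower jaw
}
```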

In this second scenario, the environment is not known and the interactions between the real parts and the virtual parts are very different. The virtual parts, the teeth, are visible only when the user smiles or opens the mouth. Using the Kinect face tracking, we can retrieve the mouth contour at each frame. This contour is used to create the replacement mask (Fig. 6, left): in this case, all the pixels outside the mouth opening are replaced with the pixels of the video image captured by the Kinect RGB camera, as sketched below. The final image is rendered to the screen, and the result can be seen in Fig. 6 (right).
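The mask construction can be sketched as below, assuming the mouth contour is available as a closed 2D polygon in image coordinates; the even-odd point-in-polygon test is standard, and all names are ours.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Pt { float x, y; };

// Even-odd test: is the pixel centre (px, py) inside the contour polygon?
bool insideContour(const std::vector<Pt>& poly, float px, float py)
{
    if (poly.size() < 3) return false;
    bool in = false;
    for (std::size_t i = 0, j = poly.size() - 1; i < poly.size(); j = i++) {
        if (((poly[i].y > py) != (poly[j].y > py)) &&
            (px < (poly[j].x - poly[i].x) * (py - poly[i].y) /
                      (poly[j].y - poly[i].y) + poly[i].x))
            in = !in;
    }
    return in;
}

// Build the replacement mask: every pixel outside the mouth opening is
// flagged (255) and will be overwritten with the Kinect RGB pixel.
std::vector<uint8_t> buildMask(const std::vector<Pt>& mouthContour,
                               int width, int height)
{
    std::vector<uint8_t> mask(width * height, 0);
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            if (!insideContour(mouthContour, x + 0.5f, y + 0.5f))
                mask[static_cast<std::size_t>(y) * width + x] = 255;
    return mask;
}
```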

Fig. 6. Occlusion mask (left) and the scene resulting from the occlusion algorithm (right)

5 Discussion

The preliminary tests we performed during this study revealed a series of interesting points to be analyzed and developed in future work.

The simulation of the virtual piece is very realistic. The solutions chosen to address the challenges of this application show very good results. The application is synchronized with the machine: the piece is shaped according to the movements of the machine and, thanks to our occlusion management, the real tools can be seen in front of the piece (Fig. 7).

Fig. 7. Final scene with occlusion

For the next steps, it will be interesting to extend the system to a different piece with a more complex shape, or to another machine. Furthermore, we can imagine a system where more than one camera is used, in order to have different angles of view, or a system where the user can control the camera movements. Finally, in order to make the integration more realistic, it would be interesting to adapt the lights and shadows of the virtual world to the lighting conditions of the real world.

The second scenario was more complex to handle; therefore, the results of the simulation were less stable than in the first scenario. The management of the occlusion works well, but the tracking is not always robust. In the present implementation, we used the first version of the Kinect for Windows, which has some limitations: the user must be positioned at a minimum distance of 50 cm from the camera and must remain in the camera's field of view (57.5° horizontal × 43.5° vertical). A good improvement of the developed system would be to use a more efficient device for tracking the user's position, such as the Kinect version 2 or another device. Furthermore, we can imagine changing the color of the dental prosthesis to adapt to the user's features, or using a partial prosthesis according to the user's needs.

6 Conclusion

Occlusion management could be a big improvement for augmented reality in different fields of application. It allows:

  • Making the integration between the real and virtual worlds seamless.

  • Increasing the immersion offered by AR applications.

  • Opening the door to numerous applications otherwise impossible.

In this work, we developed a system for managing the occlusion problem in augmented reality systems and tested it on two concrete cases. We chose this particular approach for at least two reasons: the first was to gather knowledge by starting with a controlled environment and then applying this experience to a more complex case to study the current limitations; the second was to ensure that our solution is reliable and adaptable to various situations.

Despite the limitations discussed in the previous section, most of which result from the limitations of the tracking device, this work has brought us one step closer to realistic and easy-to-use augmented reality applications outside the lab. The quick evolution of new devices available on the market might soon close the gap by providing more accurate tracking.

Concerning the use of augmented reality in the field of machine tools, we think virtual machining is quite promising. We can also imagine other applications such as machine-tool teaching, machine-tool maintenance, and marketing applications.

Finally, we have to continue exploring other technologies and approaches in order to apply them in other fields of application.