
Graphical Models

Volume 67, Issue 6, November 2005, Pages 476-495

3D S.O.M.—A commercial software solution to 3D scanning

https://doi.org/10.1016/j.gmod.2004.10.002

Abstract

This paper describes the novel features of a commercial software-only solution to 3D scanning—the 3D Software Object Modeler. Our work is motivated by the desire to produce a low-cost, portable 3D scanning system based on hand-held digital photographs. We describe the novel techniques we have employed to achieve a robust software-based system in the areas of camera calibration, surface generation, and texture extraction.

Introduction

Conventionally, generating photorealistic 3D models has been an expensive and time-consuming operation. To automate this process, a wide variety of 3D scanning devices is available, for example laser-based systems (e.g., Cyberware body scanners) and structured-light systems, as well as passive systems (e.g., Geometrix LightScribe). Some of these high-end devices can recover full-colour texture as well as geometry data. One particular problem with laser-based and structured-light systems is that they do not work well with shiny, reflective objects: it is often necessary to coat the surface with a non-reflective layer (e.g., a dusting of chalk), which can result in unsatisfactory texture data. Existing passive systems often require an expensive turntable component that limits the size of the object being scanned, and they often require professional lighting (e.g., diffuse lighting) to produce satisfactory textures.

Our work is motivated by the desire to produce a low-cost, portable 3D scanning system based on hand-held digital photographs. The only hardware requirements for our system are a camera, a black-and-white calibration pattern printed on any standard printer, a uniform backdrop, and a PC. We designed the system to be as easy to use as possible, although a certain amount of skill is required in taking the photos and editing any artifacts in the output model. The target model quality is “reasonable looking” textured 3D models suitable for 3D on the web. In general, the more expensive hardware systems are capable of producing very high-quality 3D models. However, we have found that our low-cost software solution can often produce comparable results, especially when comparing models suited to bandwidth-limited e-commerce applications. In this context the appearance (and file size) of the textured model is much more important than the geometric accuracy.

On a system level our approach is similar to that of Niem [1]. Like Niem, we use a calibration object to ensure accurate camera parameter estimation for arbitrary objects. Similarly, to ensure the system can handle untextured or reflective objects and uncontrolled lighting, we use a “shape from silhouettes” approach rather than relying on stereo feature matching [2] or colour consistency [3]. In contrast to Niem, our system benefits from a new robust calibration object, a novel batch technique for exact computation of the “visual hull”, and a novel texture blending scheme.

An alternative approach to using a calibration object is to use “Structure from Motion” (SFM) techniques (e.g., Fitzgibbon et al. [4]). These have proved very effective in recovering camera motion from tracked feature trajectories. However, in our application we wish to handle objects with little or no texture detail, which makes feature tracking difficult. For feature tracking to be robust, video input is generally required, limiting the resulting textured models to video quality. Furthermore, SFM can fail for scenes consisting of highly reflective (e.g., metallic) objects.

Currently, we extract a simple texture map to represent the appearance of the 3D surface. Structured or controlled lighting techniques such as Bernardini [5] allow the recovery of albedo and normal maps for more accurate re-rendering of an object in new lighting conditions. However, we chose an uncontrolled lighting solution to reduce the system cost and simplify image acquisition.

An alternative to explicitly modelling the surface reflectance properties is to use an image-based rendering approach. MacMillan describes such a technique in which images of an object are acquired from multiple viewpoints and under multiple lighting conditions [6]. This approach can produce very high-quality results but is not well suited to producing small models for low-bandwidth applications. The image acquisition step is also expensive and potentially slow. Proprietary image-based solutions such as QTVR suffer from large file sizes and also require large numbers of input images to ensure smooth viewpoint changes when rotating the model.


System overview

To engineer a reliable commercial modelling system we chose to base our solution on two simple, well-understood techniques: a calibration object is introduced into the scene, and we use a “shape from silhouettes” approach.

To automate silhouette detection we utilise a uniformly coloured backdrop placed behind the object. The segmentation is further improved with the novel use of a stand to separate the object from the calibration object.
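As an illustration of the backdrop idea, segmentation against a known uniform backdrop can be sketched as a per-pixel colour-distance threshold. This is a minimal NumPy sketch under assumed inputs (known backdrop colour, hand-picked threshold); the paper's actual segmentation is more sophisticated than this:

```python
import numpy as np

def extract_silhouette(image, backdrop_rgb, threshold=30.0):
    """Classify each pixel as object (True) or backdrop (False) by its
    Euclidean colour distance to the known uniform backdrop colour.

    image:        H x W x 3 array of RGB values
    backdrop_rgb: the (assumed known) backdrop colour, e.g. (200, 200, 200)
    threshold:    illustrative distance cutoff in RGB units
    """
    diff = image.astype(np.float64) - np.asarray(backdrop_rgb, dtype=np.float64)
    dist = np.sqrt((diff ** 2).sum(axis=-1))  # per-pixel colour distance
    return dist > threshold                   # True where the object is
```

In practice the backdrop colour would be estimated from the images themselves, and a fixed global threshold would be replaced by something more robust to shading variation.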

To summarise, the main steps involved in the system are camera calibration from the target pattern, silhouette-based surface generation, and texture extraction.

The 3D S.O.M. calibration target

We require a calibration target which can be placed in a scene to enable the accurate and robust recovery of the unknown camera parameters.

Previously published targets (e.g., Niem [1], Gortler et al. [8]) require coloured features (less reliably detected in uncontrolled lighting) or complicated detection schemes. Niem, for example, requires the reliable detection of thin line features, which can easily become obscured by shadows, highlights, etc. The Niem pattern consists of two concentric thin circles joined…
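Once target features have been located in an image, the unknown camera parameters can be recovered from the 2D–3D correspondences. As a generic illustration (the standard Direct Linear Transform, not the paper's own estimator), a 3 × 4 projection matrix can be recovered up to scale from six or more correspondences:

```python
import numpy as np

def estimate_projection_dlt(points_3d, points_2d):
    """Direct Linear Transform: recover a 3x4 projection matrix (up to
    scale) from >= 6 correspondences between known calibration-target
    points and their detected image positions. Each correspondence
    contributes two linear constraints on the 12 entries of P."""
    rows = []
    for (X, Y, Z), (u, v) in zip(points_3d, points_2d):
        rows.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z, -u])
        rows.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z, -v])
    A = np.asarray(rows, dtype=np.float64)
    _, _, vt = np.linalg.svd(A)
    return vt[-1].reshape(3, 4)  # null vector of A = flattened P, up to scale
```

A real system would follow this linear estimate with a non-linear refinement (bundle adjustment) and would need robust rejection of mis-detected features.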

Background

Many 3D modelling systems use the “shape from silhouette” approach to compute the shape of an object from a set of images taken from known positions [1]. The approach uses the “visual hull” approximation to the shape, which is the maximum volume that reproduces all the silhouettes of an object [10]. A good approximation to the visual hull can be obtained by intersecting the back-projections of a finite set of silhouette images. Silhouettes can be easily obtained in controlled environments.
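The intersection idea can be illustrated with a brute-force voxel test: a candidate point belongs to the visual hull only if it projects inside the silhouette in every view. This is a minimal sketch assuming known camera matrices, not the paper's exact batch algorithm for large image sets:

```python
import numpy as np

def carve_visual_hull(silhouettes, projections, grid_points):
    """Keep a candidate point only if it projects inside every silhouette.

    silhouettes: list of H x W boolean masks (True = object)
    projections: list of 3x4 camera matrices, assumed known from calibration
    grid_points: (N, 3) array of candidate voxel centres
    Returns an (N,) boolean mask of points inside the visual hull.
    """
    homog = np.hstack([grid_points, np.ones((len(grid_points), 1))])
    inside = np.ones(len(grid_points), dtype=bool)
    for mask, P in zip(silhouettes, projections):
        h, w = mask.shape
        uvw = homog @ P.T                             # project to image plane
        u = (uvw[:, 0] / uvw[:, 2]).round().astype(int)
        v = (uvw[:, 1] / uvw[:, 2]).round().astype(int)
        visible = (u >= 0) & (u < w) & (v >= 0) & (v < h)
        hit = np.zeros(len(grid_points), dtype=bool)
        hit[visible] = mask[v[visible], u[visible]]   # inside this silhouette?
        inside &= hit                                 # intersection over views
    return inside
```

Testing every voxel against every image is O(voxels × views); practical systems use octrees or exact polyhedral intersection instead of a dense grid.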

Background

In practice, the measured colour and intensity for a surface element observed in different photographic images will not agree. This is due to the interaction between real world lighting effects (such as highlights) and variations in the camera gain settings as well as registration and surface modelling errors.

A common approach to blending image data for texturing is to use a triangle-based scheme [1], [18], [19]. In general, these techniques rely on a regular triangular mesh model (with a…
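As a generic illustration of view blending (not the paper's novel scheme, which the snippet above truncates), one common weighting favours views that see the surface frontally, since oblique views suffer most from registration and modelling error:

```python
import numpy as np

def blend_colours(colours, normals, view_dirs):
    """Blend per-view colour samples for one surface element, weighting
    each view by how frontally it observes the surface: the cosine of
    the angle between the surface normal and the view direction.

    colours:   (V, 3) RGB samples of the element, one per view
    normals:   (V, 3) unit surface normals (per-view, e.g. after remapping)
    view_dirs: (V, 3) unit directions from the surface towards each camera
    """
    colours = np.asarray(colours, dtype=np.float64)
    # Per-view weight = max(0, n . d); back-facing views get zero weight.
    weights = np.maximum(0.0, np.einsum('ij,ij->i', normals, view_dirs))
    weights = weights / weights.sum()
    return weights @ colours   # weighted average colour
```

Cosine weighting alone leaves visible seams where the best view changes; schemes in the literature therefore also feather weights towards silhouette boundaries.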

Results

Although there are obvious limitations with silhouette-based surface reconstruction (e.g., modelling concavities), we have found that the synthesised views of the texture-mapped model are surprisingly convincing. Fig. 10 shows some typical examples of novel views obtained in 3D S.O.M. next to a typical input camera image. More example models are available from the 3D S.O.M. web sites (http://www.cre.canon.co.uk/3dsom and http://www.3dsom.com).

Conclusions

We have described a software solution to 3D scanning from photos. Our approach has been to make minimal assumptions about the object and scene lighting by introducing a robust calibration object and utilising an efficient “shape from silhouettes” technique to improve robustness and performance. The key novel contributions of this work are: robust camera estimation using a carefully designed calibration target, batch visual hull calculation that extends the silhouette approach to large numbers of images, and a novel texture blending scheme.

Acknowledgments

The authors wish to thank Simon Rowe and Qi He Hong for useful discussions and comments and Canon for funding the work.

References (22)

  • S. Gortler, R. Grzeszczuk, R. Szeliski, M. Cohen, The lumigraph, in: SIGGRAPH 96 Conference Proceedings, ACM SIGGRAPH, …