3D S.O.M.—A commercial software solution to 3D scanning
Introduction
Conventionally, generating photo-realistic 3D models has been an expensive and time-consuming operation. To automate this process, a wide variety of 3D scanning devices is available, for example laser-based systems (e.g., Cyberware body scanners), structured-light systems, as well as passive systems (e.g., Geometrix LightScribe). Some of these high-end devices can recover full-colour texture as well as geometry data. One particular problem with laser-based and structured-light systems is that they do not work well with shiny, reflective objects, and it is often necessary to coat the surface with a non-reflective layer (e.g., to dust it with chalk), which can result in unsatisfactory texture data. Existing passive systems often require an expensive turntable component that limits the size of the object being scanned, and they often require professional lighting (e.g., diffuse lighting) to produce satisfactory textures.
Our work is motivated by the desire to produce a low-cost, portable 3D scanning system based on hand-held digital photographs. The only hardware requirements for our system are a camera, a black-and-white calibration pattern printed on any standard printer, a uniform backdrop and a PC. We designed the system to be as easy to use as possible, although a certain amount of skill is required in taking the photos and editing any artifacts in the output model. The target model quality is a "reasonable looking" textured 3D model suitable for 3D on the web. In general, the more expensive hardware systems are capable of producing very high-quality 3D models. However, we have found that our low-cost software solution can often produce comparable results, especially for models suited to bandwidth-limited e-commerce applications. In this context the appearance (and file size) of the textured model is much more important than geometric accuracy.
At a system level our approach is similar to that of Niem [1]. Like Niem, we use a calibration object to ensure accurate camera parameter estimation for arbitrary objects. Similarly, to ensure the system can handle untextured or reflective objects and uncontrolled lighting, we use a "shape from silhouettes" approach rather than relying on stereo feature matching [2] or colour consistency [3]. In contrast to Niem, our system benefits from a new robust calibration object, a novel batch technique for exact computation of the "visual hull" and a novel texture blending scheme.
An alternative approach to using a calibration object is to use "Structure from Motion" (SFM) techniques (e.g., Fitzgibbon et al. [4]). These have proved very effective at recovering camera motion from tracked feature trajectories. However, in our application we wish to handle objects with little or no texture detail, which makes feature tracking difficult. For feature tracking to be robust, video input is generally required, limiting the resulting textured models to video quality. Furthermore, SFM can fail for scenes consisting of highly reflective (e.g., metallic) objects.
Currently, we extract a simple texture map to represent the appearance of the 3D surface. Structured or controlled lighting techniques such as that of Bernardini et al. [5] allow the recovery of albedo and normal maps for more accurate re-rendering of an object under new lighting conditions. However, we chose an uncontrolled lighting solution to reduce system cost and simplify image acquisition.
An alternative to explicitly modelling the surface reflectance properties is to use an image-based rendering approach. McMillan describes such a technique in which images of an object are acquired from multiple viewpoints and under multiple lighting conditions [6]. This approach can produce very high-quality results but is not well suited to producing small models for low-bandwidth applications. The image acquisition step is also expensive and potentially slow. Proprietary image-based solutions such as QTVR suffer from large file sizes and also require large numbers of input images to ensure smooth viewpoint changes when rotating the model.
System overview
To engineer a reliable commercial modelling system we chose to base our solution on two simple, well-understood techniques: a calibration object is introduced into the scene, and we use a "shape from silhouettes" approach.
To automate silhouette detection we use a uniformly coloured backdrop placed behind the object. The segmentation is further improved by the novel use of a stand that separates the object from the calibration object.
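The backdrop-based segmentation can be illustrated with a minimal sketch: classify each pixel as object or backdrop by its colour distance from the known backdrop colour. The function name `extract_silhouette`, the threshold value and the synthetic green-backdrop image are illustrative assumptions, not the paper's actual segmentation algorithm (which also exploits the stand).

```python
import numpy as np

def extract_silhouette(image, backdrop_rgb, threshold=60.0):
    """Classify each pixel as object (True) or backdrop (False) by its
    Euclidean distance in RGB space from the known backdrop colour."""
    diff = image.astype(np.float64) - np.asarray(backdrop_rgb, dtype=np.float64)
    distance = np.sqrt((diff ** 2).sum(axis=-1))
    return distance > threshold

# Synthetic example: a grey "object" on a uniform green backdrop.
h, w = 64, 64
image = np.full((h, w, 3), (40, 200, 40), dtype=np.uint8)  # green backdrop
image[20:44, 20:44] = (128, 128, 128)                      # grey object patch
mask = extract_silhouette(image, backdrop_rgb=(40, 200, 40))
```

In practice a global threshold like this is only a starting point; shadows and colour bleeding near the object boundary usually require more careful treatment.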
To summarise, the main steps involved in the system are as
The 3D S.O.M. calibration target
We require a calibration target which can be placed in a scene to enable the accurate and robust recovery of the unknown camera parameters.
Previously proposed targets (e.g., Niem [1], Gortler [8]) require coloured features (which are less reliably detected under uncontrolled lighting) or complicated detection schemes. Niem, for example, requires the reliable detection of thin line features, which can easily become obscured by shadows, highlights, etc. The Niem pattern consists of two concentric thin circles joined
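The role of the calibration target, recovering the unknown camera parameters from known 3D target points and their detected image positions, can be sketched with a generic Direct Linear Transform (DLT) resection. This is not the paper's calibration algorithm; it is a standard textbook technique shown here under assumed synthetic data (a unit cube of target points and an invented camera matrix).

```python
import numpy as np

def dlt_projection(points3d, points2d):
    """Estimate a 3x4 projection matrix from >= 6 non-coplanar point
    correspondences with the Direct Linear Transform: each correspondence
    adds two rows to a homogeneous system A p = 0, solved via the SVD
    null vector."""
    rows = []
    for (X, Y, Z), (u, v) in zip(points3d, points2d):
        rows.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z, -u])
        rows.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z, -v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=np.float64))
    return vt[-1].reshape(3, 4)

def project(P, point3d):
    """Project a 3D point with a 3x4 matrix and dehomogenise."""
    x = P @ np.append(np.asarray(point3d, dtype=np.float64), 1.0)
    return x[:2] / x[2]

# Synthetic check: a known camera, the corners of a unit cube as "target"
# points, and recovery of the projection from the resulting image points.
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
Rt = np.hstack([np.eye(3), np.array([[0.0], [0.0], [5.0]])])
P_true = K @ Rt
grid = [0.0, 1.0]
corners = [(x, y, z) for x in grid for y in grid for z in grid]
image_pts = [project(P_true, c) for c in corners]
P_est = dlt_projection(corners, image_pts)
error = max(np.linalg.norm(project(P_est, c) - q)
            for c, q in zip(corners, image_pts))
```

With exact correspondences the reprojection error is negligible; with real detections a robust estimator and nonlinear refinement would follow.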
Background
Many 3D modelling systems use the "shape from silhouette" approach to compute the shape of an object from a set of images taken from known positions [1]. The approach uses the "visual hull" approximation to the shape, which is the maximum volume that reproduces all the silhouettes of an object [10]. A good approximation to the visual hull can be obtained by intersecting the back-projections of a finite set of silhouette images. Silhouettes can be easily obtained in controlled environments
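The intersection of back-projected silhouettes can be illustrated with a simple voxel-carving sketch: a voxel survives only if it projects inside the silhouette in every view. This is the naive volumetric version of the idea, not the paper's exact batch visual-hull computation; the toy cameras and diamond silhouette below are assumptions for the sake of a self-contained example.

```python
import numpy as np

def carve_visual_hull(voxel_centers, cameras, silhouettes):
    """Keep a voxel only if it projects inside the silhouette in every
    view; the surviving voxels approximate the visual hull."""
    n = len(voxel_centers)
    keep = np.ones(n, dtype=bool)
    homog = np.hstack([voxel_centers, np.ones((n, 1))])
    for P, mask in zip(cameras, silhouettes):
        x = homog @ P.T                          # project all voxels at once
        u = np.round(x[:, 0] / x[:, 2]).astype(int)
        v = np.round(x[:, 1] / x[:, 2]).astype(int)
        h, w = mask.shape
        in_bounds = (u >= 0) & (u < w) & (v >= 0) & (v < h)
        in_sil = in_bounds.copy()
        in_sil[in_bounds] = mask[v[in_bounds], u[in_bounds]]
        keep &= in_sil
    return keep

# Toy setup: a 3x3x3 voxel grid and two orthographic views of a
# diamond-shaped silhouette (one looking along Z, one along Y).
uu, vv = np.meshgrid(np.arange(5), np.arange(5))
diamond = (np.abs(uu - 2) + np.abs(vv - 2)) <= 1
g = [-1.0, 0.0, 1.0]
centers = np.array([(x, y, z) for x in g for y in g for z in g])
P_top = np.array([[1.0, 0, 0, 2], [0, 1.0, 0, 2], [0, 0, 0, 1.0]])    # u=X+2, v=Y+2
P_front = np.array([[1.0, 0, 0, 2], [0, 0, 1.0, 2], [0, 0, 0, 1.0]])  # u=X+2, v=Z+2
hull = carve_visual_hull(centers, [P_top, P_front], [diamond, diamond])
```

Each extra view can only remove voxels, so the carved volume always contains the true object; concavities, as noted above, can never be recovered this way.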
Background
In practice, the measured colour and intensity of a surface element observed in different photographic images will not agree. This is due to the interaction between real-world lighting effects (such as highlights) and variations in camera gain settings, as well as registration and surface modelling errors.
A common approach for blending image data for texturing is to use a triangle-based scheme [1], [18], [19]. In general, these techniques rely on a regular triangular mesh model (with a
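A common way to reconcile the disagreeing per-view colours is a weighted blend favouring views that see the surface element most directly. The sketch below weights each view by the cosine of the angle between the surface normal and the direction towards the camera; this is a generic illustration of view-weighted blending, not the paper's novel blending scheme, and all names and numbers in it are assumed.

```python
import numpy as np

def blend_texture_samples(normal, view_dirs, colours):
    """Blend colour samples of one surface element seen in several views.
    Each view is weighted by the cosine of the angle between the surface
    normal and the unit direction towards the camera; grazing and
    back-facing views get little or no weight."""
    normal = np.asarray(normal, dtype=np.float64)
    weights = np.array([max(float(np.dot(normal, d)), 0.0) for d in view_dirs])
    total = weights.sum()
    if total == 0.0:
        raise ValueError("surface element faces away from every view")
    return (weights / total) @ np.asarray(colours, dtype=np.float64)

# A patch facing +Z, seen head-on, side-on (zero weight) and obliquely.
blended = blend_texture_samples(
    normal=(0.0, 0.0, 1.0),
    view_dirs=[(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 0.6, 0.8)],
    colours=[(200.0, 100.0, 50.0), (0.0, 0.0, 0.0), (110.0, 100.0, 50.0)],
)
```

Smoothly varying the weights across the surface is what avoids visible seams where the dominant view changes.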
Results
Although there are obvious limitations to silhouette-based surface reconstruction (e.g., modelling concavities), we have found that the synthesised views of the texture-mapped model are surprisingly convincing. Fig. 10 shows some typical examples of novel views obtained in 3D S.O.M. next to a typical input camera image. More example models are available from the 3D S.O.M. web site (http://www.cre.canon.co.uk/3dsom and http://www.3dsom.com).
Conclusions
We have described a software solution to 3D scanning from photos. Our approach has been to make minimal assumptions about the object and scene lighting by introducing a robust calibration object and utilising an efficient "shape from silhouettes" technique to improve robustness and performance. The key novel contributions of this work are: robust camera estimation using a carefully designed calibration target, and a batch visual hull calculation that extends the silhouette approach to large numbers of
Acknowledgments
The authors wish to thank Simon Rowe and Qi He Hong for useful discussions and comments and Canon for funding the work.
References (22)
- Automatic reconstruction of 3D objects using a mobile camera, Image and Vision Computing (1999).
- Rapid octree construction from image sequences, CVGIP: Image Understanding (1993).
- et al., On approximating polygonal curves in two and three dimensions, Graphical Models and Image Processing (1994).
- et al., A silhouette-based algorithm for texture registration and stitching, Graphical Models (2001).
- R. Koch, M. Pollefeys, L. Van Gool, Multi viewpoint stereo from uncalibrated video sequences, in: ECCV98, 1998, pp. …
- S. Seitz, C. Dyer, Photorealistic scene reconstruction by voxel coloring, in: CVPR97, 1997, pp. …
- A.W. Fitzgibbon, A. Zisserman, Automatic camera recovery for closed or open image sequences, in: European Conference on …
- et al., High quality texture reconstruction from multiple scans, IEEE Transactions on Visualization and Computer Graphics (2001).
- et al., Image-based 3D photography using opacity hulls, ACM Transactions on Graphics (2002).
- A. Smith, J. Blinn, Blue screen matting, Computer Graphics 30 (Annual Conference Series) (1996).