Abstract
The case of mixed-reality projector-camera systems is considered and, in particular, those which employ hand-held boards as interactive displays. This work focuses upon the accurate, robust, and timely detection and pose estimation of such boards, to achieve high-quality augmentation and interaction. The proposed approach operates a camera in the near infrared spectrum to filter out the optical projection from the sensory input. However, the monochromaticity of this input precludes the use of color for the detection of boards. In this context, two methods are proposed. The first regards the pose estimation of boards and, being computationally demanding and frequently invoked by the system, is highly parallelized. The second uses this pose estimation method to detect and track boards, and is efficient in its use of computational resources so that accurate results are provided in real time. Accurate pose estimation facilitates touch detection upon designated areas on the boards and high-quality projection of visual content upon them. An implementation of the proposed approach is extensively and quantitatively evaluated with respect to its accuracy and efficiency. This evaluation, along with usability and pilot application investigations, indicates the suitability of the proposed approach for use in interactive, mixed-reality applications.
References
Angeline P (1998) Evolutionary optimization versus particle swarm optimization: Philosophy and performance differences. Evolutionary Programming VII, LNCS 1447:601–610
Audet S, Okutomi M (2009) A user-friendly method to geometrically calibrate projector-camera systems. In: IEEE computer vision and pattern recognition workshops, pp 47–54
Audet S, Okutomi M, Tanaka M (2013) Augmenting moving planar surfaces robustly with video projection and direct image alignment. Virtual Reality 17(2):157–168
Bernardin K, Stiefelhagen R (2008) Evaluating multiple object tracking performance: the CLEAR MOT metrics. EURASIP Journal on Image and Video Processing 2008(1):1–10
Bonnard Q, Lemaignan S, Zufferey G, Mazzei A, Cuendet S, Li N, Özgür A, Dillenbourg P (2013) Chilitags 2: robust fiducial markers for augmented reality and robotics. http://chili.epfl.ch/software. Accessed 18 Oct 2016
Borkowski S, Riff O, Crowley JL (2003) Projecting rectified images in an augmented environment. In: Procams workshop, IEEE computer
Canny J (1986) A computational approach to edge detection. IEEE Trans Pattern Anal Mach Intell 8(6):679–698
Choi C, Christensen HI (2012) 3D textureless object detection and tracking: an edge-based approach. In: 2012 IEEE/RSJ international conference on intelligent robots and systems, pp 3877–3884
Grammenos D, Michel D, Zabulis X, Argyros AA (2011) Paperview: augmenting physical surfaces with location-aware digital information. In: International conference on tangible, embedded, and embodied interaction, pp 57–60
Grammenos D, Zabulis X, Michel D, Argyros AA (2011) Augmented reality interactive exhibits in cartographic heritage: an implemented case-study open to the general public. In: International workshop on digital approaches in cartographic heritage, vol 6, pp 57–67
Gross T, Fetter M, Liebsch S (2008) The cueTable: cooperative and competitive multi-touch interaction on a tabletop. In: Computer–human interaction ’08 extended abstracts on human factors in computing systems, pp 3465–3470
Gupta S, Jaynes C (2006) The universal media book: tracking and augmenting moving surfaces with projected information. In: IEEE/ACM international symposium on mixed and augmented reality, pp 177–180
Han JY (2005) Low-cost multi-touch sensing through frustrated total internal reflection. In: Proceedings of the 18th annual association for computing machinery symposium on user interface software and technology, pp 115–118
Harrison C, Benko H, Wilson AD (2011) Omnitouch: wearable multitouch interaction everywhere. In: Proceedings of the 24th annual association for computing machinery symposium on user interface software and technology, pp 441–450
Hartley R, Zisserman A (2004) Multiple view geometry in computer vision, 2nd edn. Cambridge University Press
Hartman J, Levas T (2002) Interacting with steerable projected displays. In: Proceedings of the fifth IEEE international conference on automatic face and gesture recognition, p 402
Hinterstoisser S, Lepetit V, Ilic S, Holzer S, Bradski G, Konolige K, Navab N (2013) Model based training, detection and pose estimation of texture-less 3d objects in heavily cluttered scenes. In: Asian conference on computer vision, pp 548–562
Hodan T, Zabulis X, Lourakis M, Obdrzalek S, Matas J (2015) Detection and fine 3d pose estimation of texture-less objects in RGB-D images. In: IEEE/RSJ international conference on intelligent robots and systems, pp 4421–4428
Itseez (2015) Open source computer vision library. https://github.com/itseez/opencv. Accessed 16 Oct 2016
Jones BR, Sodhi R, Campbell RH, Garnett G, Bailey BP (2010) Build your world and play in it: Interacting with surface particles on complex objects. In: 2010 IEEE international symposium on mixed and augmented reality, pp 165–174
Kagami S, Hashimoto K (2015) Sticky projection mapping: 450-fps tracking projection onto a moving planar surface. In: SIGGRAPH Asia 2015 emerging technologies, pp 23:1–23:3
Kalman R (1960) A new approach to linear filtering and prediction problems. Transactions of the ASME–Journal of Basic Engineering Series D 82:35–45
Kato H, Billinghurst M (1999) Marker tracking and hmd calibration for a video-based augmented reality conferencing system. In: IEEE and ACM international workshop on augmented reality, pp 85–94
Kato H, Tachibana K, Anf M, Grafe MB (2003) A registration method based on texture tracking using ARToolkit. In: IEEE international augmented reality toolkit workshop, 2003, pp 1–9
Lee JC, Dietz PH, Maynes-Aminzade D, Raskar R, Hudson SE (2004) Automatic projector calibration with embedded light sensors. In: Annual ACM symposium on user interface software and technology, pp 123–126
Lee JC, Hudson SE, Summet JW, Dietz PH (2005) Moveable interactive projected displays using projector based tracking. In: Annual ACM symposium on user interface software and technology, pp 63–72
Leung MC, Lee KK, Wong KH, Chang MMY (2009) A projector-based movable hand-held display system. In: Computer vision and pattern recognition
Luo YM, Duraiswami R (2008) Canny edge detection on nvidia cuda. In: IEEE Computer vision and pattern recognition workshops, pp 1–8
Meijster A, Roerdink JBTM, Hesselink WH (2000) A general algorithm for computing distance transforms in linear time. In: Mathematical morphology and its applications to image and signal processing. Kluwer Academic Publishers, pp 331–340. https://doi.org/10.1007/0-306-47025-X_36
Morevec HP (1977) Towards automatic visual obstacle avoidance. In: International joint conference on artificial intelligence, vol 2, pp 584–584
Nakamura T, de Sorbier F, Martedi S, Saito H (2012) Calibration-free projector-camera system for spatial augmented reality on planar surfaces. In: International conference on pattern recognition, pp 85–88
Okumura K, Oku H, Ishikawa M (2013) Active projection AR using high-speed optical axis control and appearance estimation algorithm. In: IEEE international conference on multimedia and expo, pp 1–6
Otsu N (1979) A threshold selection method from gray-level histograms. IEEE Trans Syst Man Cybern 9(1):62–66
Padeleris P, Zabulis X, Argyros A (2012) Head pose estimation on depth data based on particle swarm optimization. In: Computer vision and pattern recognition workshops, pp 42–49
Pinhanez C (2001) Using a steerable projector and a camera to transform surfaces into interactive displays. In: Computer-human interaction ’01 extended abstracts on human factors in computing systems, pp 369–370
Poli R, Kennedy J, Blackwell T (2007) Particle swarm optimization. Swarm Intell 1(1):33–57
Raskar R, Beardsley PA, van Baar J, Wang Y, Dietz PH, Lee JC, Leigh D, Willwacher T (2004) RFIG Lamps: interacting with a self-describing world via photosensing wireless tags and projectors. ACM Trans Graph 23(3):406–415
Rekimoto J (2002) Smartskin: an infrastructure for freehand manipulation on interactive surfaces. In: Proceedings of the special interest group on computer-human interaction conference on human factors in computing systems, pp 113–120
Roccetti M, Marfia G, Semeraro A (2012) Playing into the wild: a gesture-based interface for gaming in public spaces. Vis Commun Image Represent 23(3):426–440
Roccetti M, Marfia G, Bertuccioli C (2014) Day and night at the museum: intangible computer interfaces for public exhibitions. Multimedia Tools and Applications 69(3):1131–1157
Song P, Winkler S, Tedjokusumo J (2007) A tangible game interface using projector-camera systems. In: Proceedings of Human-Computer Interaction. Interaction Platforms and Techniques: 12th International Conference, Part II, pp 956–965. https://doi.org/10.1007/978-3-540-73107-8_105
Stephanidis C, Argyros AA, Grammenos D, Zabulis X (2008) Pervasive computing @ ICS-FORTH. In: International conference on pervasive computing
Sueishi T, Oku H, Ishikawa M (2015) Robust high-speed tracking against illumination changes for dynamic projection mapping. In: IEEE virtual reality, pp 97–104
Summet J, Sukthankar R (2005) Tracking locations of moving hand-held displays using projected light. In: International conference on pervasive computing, pp 37–46
Tseng HY, Wu PC, Yang MH, Chien SY (2016) Direct 3D pose estimation of a planar target. In: IEEE winter conference on applications of computer vision, pp 1–9
Wilson AD (2005) Playanywhere: a compact interactive tabletop projection-vision system. In: Proceedings of the 18th annual association for computing machinery symposium on user interface software and technology, pp 83–92
Xie X, Livermore C (2017) Passively self-aligned assembly of compact barrel hinges for high-performance, out-of-plane MEMS actuators. In: IEEE international conference on micro electro mechanical systems, pp 813–816
Xie X, Zaitsev Y, Velasquez-Garcia L, Teller S, Livermore C (2014) Scalable, MEMS-enabled, vibrational tactile actuators for high resolution tactile displays. J Micromech Microeng 24(12):125014
Xie X, Zaitsev Y, Velásquez-García L, Teller S, Livermore C (2014) Compact, scalable, high-resolution, MEMS-enabled tactile displays. In: Solid-state sensors, actuators and microsystems workshop, pp 127–130
Xu C, Kuipers B, Murarka A (2009) 3D pose estimation for planes. In: 2009 IEEE 12th international conference on computer vision workshops, pp 673–680
Xu Q, Wang Z, Wang F, Li J (2017) Thermal comfort research on human ct data modeling. Multimedia Tools and Applications 1:1–16
Yang J, Li J, Liu S (2017) A novel technique applied to the economic investigation of recommender system. Multimedia Tools and Applications 1:1–16
Zabulis X, Lourakis M, Stefanou S (2014) 3D pose refinement using rendering and texture-based matching. In: International conference on computer vision and graphics, pp 672–679
Zabulis X, Lourakis M, Koutlemanis P (2015) 3D object pose refinement in range images. In: Proceedings of international conference on computer vision systems, pp 263–274. https://doi.org/10.1007/978-3-319-20904-3_25
Zabulis X, Lourakis M, Koutlemanis P (2016) Correspondence-free pose estimation for 3D objects from noisy depth data. The Visual Computer
Zhang Z (2000) A flexible new technique for camera calibration. IEEE Trans Pattern Anal Mach Intell 22(11):1330–1334
Zhang Z, Wu Y, Shan Y, Shafer S (2001) Visual panel: virtual mouse, keyboard and 3d controller with an ordinary piece of paper. In: Proceedings of the 2001 workshop on perceptive user interfaces, pp 1–8
Acknowledgements
This work was supported by the Foundation for Research and Technology Hellas-Institute of Computer Science (FORTH-ICS) internal RTD Programme “Ambient Intelligence and Smart Environments”. Authors are grateful to Mr. Antonis Hatziantoniou for the implementation of the pilot application and to Mr. Antonis Katzourakis for its graphical design.
Electronic supplementary material
Below is the link to the electronic supplementary material.
(MP4 302 MB)
Appendix A: Calibration
The camera and projector are static and calibrated in a common coordinate system, as follows.
To minimize effort, calibration is split into two parts. The first regards intrinsic parameters and is performed once during installation. The second regards the update of extrinsic parameters and is performed during maintenance (e.g., when the projector is disturbed while its lamp is replaced, or the camera is accidentally moved during this process). Intrinsic parameters are unlikely to change by accident, while extrinsic parameters may change more often in unattended public installations.
Calibration of the projector is based on [2], which projects fiducial patterns onto a board and detects them in the camera image. As the projection is invisible to the NIR camera, an auxiliary RGB camera is used during calibration.
We treat the RGB and NIR cameras as a pair, using [56] to estimate intrinsic, extrinsic, and lens distortion parameters simultaneously for both cameras, in an RGB camera centered coordinate system. Then, the projector is calibrated together with the RGB camera and its extrinsic parameters are also estimated in the RGB camera centered coordinate system. As the RGB camera is common to both pairs, we obtain extrinsic parameters for the NIR camera and the projector in the same reference frame. An NIR camera centered system is more convenient for operation and we, thus, convert the extrinsic parameters as follows.
The estimated relative motions between the NIR and RGB cameras, as well as between the RGB camera and the projector, are denoted \(\textbf{R}_{g}, \textbf{T}_{g}\) and \(\textbf{R}_{p}, \textbf{T}_{p}\), respectively. A homogeneous point \(\textbf{x}\) in the RGB camera coordinate system is expressed as \(\textbf{x}_{g}\) and \(\textbf{x}_{p}\) in the NIR camera and the projector coordinate systems, with \(\textbf{x}_{g}=\textbf{Q}_{g}\cdot \textbf{x}\) and \(\textbf{x}_{p}=\textbf{Q}_{p}\cdot \textbf{x}\), where \(\textbf{Q}_{g}=[\textbf{R}_{g}|\textbf{T}_{g}]\) and \(\textbf{Q}_{p}=[\textbf{R}_{p}|\textbf{T}_{p}]\). Therefore, \(\textbf{x}_{p}=\textbf{Q}_{p}\cdot \textbf{Q}_{g}^{-1}\cdot \textbf{x}_{g}\) and, thus, \(\textbf{Q}_{p}^{\prime }=\textbf{Q}_{p}\cdot \textbf{Q}_{g}^{-1}\) contains the extrinsic parameters of the projector in an NIR camera centered coordinate system.
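The transform composition above can be sketched in a few lines of NumPy, treating each \([\textbf{R}|\textbf{T}]\) as a 4×4 homogeneous matrix so that \(\textbf{Q}_{g}^{-1}\) is well defined. This is a minimal illustration, not the paper's implementation; the example rotation and translation values are hypothetical placeholders standing in for the outputs of a stereo calibration routine.

```python
import numpy as np

def to_homogeneous(R, T):
    """Build a 4x4 rigid transform Q = [R|T] from a 3x3 rotation R and a 3-vector T."""
    Q = np.eye(4)
    Q[:3, :3] = R
    Q[:3, 3] = np.ravel(T)
    return Q

# Hypothetical calibration outputs (identity rotations for readability):
R_g, T_g = np.eye(3), np.array([0.1, 0.0, 0.0])   # RGB camera -> NIR camera
R_p, T_p = np.eye(3), np.array([0.0, 0.2, 0.0])   # RGB camera -> projector

Q_g = to_homogeneous(R_g, T_g)
Q_p = to_homogeneous(R_p, T_p)

# Extrinsic parameters of the projector in the NIR-camera-centered frame:
Q_p_prime = Q_p @ np.linalg.inv(Q_g)

# A homogeneous point in NIR camera coordinates maps to projector coordinates:
x_g = np.array([0.5, 0.5, 2.0, 1.0])
x_p = Q_p_prime @ x_g
```

With the placeholder values above, `Q_p_prime` simply subtracts the RGB-to-NIR offset and adds the RGB-to-projector offset; with real calibration data the rotations would be non-trivial, but the composition \(\textbf{Q}_{p}\cdot \textbf{Q}_{g}^{-1}\) is computed identically.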
Cite this article
Koutlemanis, P., Zabulis, X. Tracking of multiple planar projection boards for interactive mixed-reality applications. Multimed Tools Appl 77, 17457–17487 (2018). https://doi.org/10.1007/s11042-017-5313-6