Online Approximate Model Representation Based on Scale-Normalized and Fronto-Parallel Appearance

International Journal of Computer Vision

Abstract

Various object representations have been widely used for tasks such as object detection, recognition, and tracking. Most of them require an intensive training process on a large database collected in advance, and it is hard to add a model of a previously unobserved object that is not in the database. In this paper, we investigate how to create a representation of a new and unknown object online, and how to apply it to practical applications such as object detection and tracking. To make this viable, we use a sensor fusion approach that combines a camera and a single-line scan LIDAR. The proposed representation consists of an approximate geometry model and a viewpoint- and scale-invariant appearance model, which makes it extremely simple to match the model to an observation. This property makes it possible to model a new object online and provides robustness to viewpoint variation and occlusion. The representation has the benefits of both an implicit model (referred to as a view-based model) and an explicit model (referred to as a shape-based model). Extensive experiments using synthetic and real data demonstrate the viability of the proposed object representation for both modeling and detecting/tracking objects.
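To illustrate why a range measurement can make scale-invariant matching simple, the following is a minimal sketch and not the authors' implementation. It assumes an ideal pinhole camera and a fronto-parallel surface, so that one metre at depth Z spans f/Z pixels. The function names and the parameters `focal_px`, `range_m`, and `canonical_px_per_m` are hypothetical and do not come from the paper.

```python
def scale_to_canonical(focal_px: float, range_m: float,
                       canonical_px_per_m: float) -> float:
    """Resize factor that brings an observed patch to a fixed pixels-per-metre
    resolution, assuming a pinhole camera and a fronto-parallel surface."""
    if focal_px <= 0 or range_m <= 0 or canonical_px_per_m <= 0:
        raise ValueError("all arguments must be positive")
    # Pinhole projection: a 1 m segment at depth range_m spans focal_px / range_m pixels.
    observed_px_per_m = focal_px / range_m
    return canonical_px_per_m / observed_px_per_m


def normalized_width_px(observed_width_px: float, focal_px: float,
                        range_m: float, canonical_px_per_m: float) -> float:
    """Width of the patch after resampling to the canonical resolution."""
    return observed_width_px * scale_to_canonical(
        focal_px, range_m, canonical_px_per_m)


if __name__ == "__main__":
    focal_px = 800.0   # assumed focal length in pixels
    width_m = 2.0      # assumed physical width of the object
    for range_m in (10.0, 20.0):
        observed = focal_px * width_m / range_m  # projected width in pixels
        print(range_m, observed,
              normalized_width_px(observed, focal_px, range_m, 100.0))
```

Under these assumptions, the same object observed at different ranges maps to the same normalized width, so a stored appearance patch and a new observation can be compared at a single scale.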

Author information

Correspondence to Kiho Kwak.

Additional information

Communicated by V. Lepetit.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (wmv 6346 KB)

About this article

Cite this article

Kwak, K., Kim, JS., Huber, D.F. et al. Online Approximate Model Representation Based on Scale-Normalized and Fronto-Parallel Appearance. Int J Comput Vis 117, 48–69 (2016). https://doi.org/10.1007/s11263-015-0848-3

  • DOI: https://doi.org/10.1007/s11263-015-0848-3
