ABSTRACT
Stereo correspondence algorithms that are fast enough for real-time use require hardware assistance and inevitably trade some matching accuracy for speed. As a result, the cloud of 3D points produced by our previously reported GPU-accelerated implementation of a dynamic programming correspondence algorithm is noisy and contains artifacts that hinder tracking accuracy. We have augmented this implementation with modules for re-projection and filtering. A fast clustering procedure based on a set of simple volume rules identifies candidate objects, an opportunistic tagging system tracks objects through occlusions, and Kalman filtering predicts object positions in the next frame. Together, these steps reduce the effects of dynamic programming streaks in the depth maps. Experiments with synthetic and real-world video sequences confirmed accurate tracking of multiple objects (e.g., humans) in various environments.
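The Kalman prediction step can be illustrated with a minimal sketch. The abstract does not give the state model or noise parameters, so everything below is an assumption made for illustration: a constant-velocity model applied independently to each coordinate of an object's centroid, unit initial covariance, and arbitrary noise values `q` and `r`. The names `AxisKalman` and `predict_next_position` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class AxisKalman:
    """Constant-velocity Kalman filter for one coordinate.

    State is (position, velocity). The symmetric 2x2 covariance is
    stored as its three distinct entries p00, p01, p11.
    All default values are illustrative assumptions.
    """
    pos: float
    vel: float = 0.0
    p00: float = 1.0
    p01: float = 0.0
    p11: float = 1.0
    q: float = 1e-2  # assumed process noise, added to the diagonal
    r: float = 1e-1  # assumed measurement noise variance

    def predict(self, dt: float = 1.0) -> float:
        # State transition: x <- F x with F = [[1, dt], [0, 1]]
        self.pos += dt * self.vel
        # Covariance: P <- F P F^T + q * I
        p00 = self.p00 + dt * (2.0 * self.p01 + dt * self.p11) + self.q
        p01 = self.p01 + dt * self.p11
        p11 = self.p11 + self.q
        self.p00, self.p01, self.p11 = p00, p01, p11
        return self.pos

    def update(self, z: float) -> None:
        # Only the position is measured: H = [1, 0]
        s = self.p00 + self.r          # innovation variance
        k0 = self.p00 / s              # gain for position
        k1 = self.p01 / s              # gain for velocity
        y = z - self.pos               # innovation
        self.pos += k0 * y
        self.vel += k1 * y
        # Covariance: P <- (I - K H) P
        p00 = (1.0 - k0) * self.p00
        p01 = (1.0 - k0) * self.p01
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p11 = p00, p01, p11


def predict_next_position(observations, dt=1.0):
    """Predict an object's centroid in the next frame.

    observations: list of (x, y, z) centroids, one per past frame.
    Returns the predicted (x, y, z) for the frame after the last one.
    """
    filters = [AxisKalman(pos=c) for c in observations[0]]
    for obs in observations[1:]:
        for f, z in zip(filters, obs):
            f.predict(dt)
            f.update(z)
    return tuple(f.predict(dt) for f in filters)
```

For example, `predict_next_position([(float(t), 2.0, 0.5 * t) for t in range(10)])` returns an estimate of where a steadily moving centroid will be in the eleventh frame. In a tracker, a prediction of this kind would typically be used to associate clusters in the next frame with existing tracks and to carry a track through a short occlusion.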
Index Terms
3D object tracking with a high-resolution GPU based real-time stereo