Abstract
We present a visual object detector based on a deep convolutional neural network that quickly outputs bounding box hypotheses without a separate proposal generation stage [1]. We modify the network for better performance, specialize it for a robotic application involving “bird” and “nest” categories (including the creation of a new dataset for the latter), and extend it to enforce temporal continuity for tracking. The system exhibits very competitive detection accuracy and speed, as well as robust, high-speed tracking on several difficult sequences.
Notes
1. Full nest dataset available here: http://nameless.cis.udel.edu/data/nests.
References
Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. arXiv preprint arXiv:1506.02640 (2015)
Treisman, A., Gelade, G.: A feature-integration theory of attention. Cogn. Psychol. 12, 97–136 (1980)
Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: Computer Vision and Pattern Recognition (CVPR) (2005)
Han, F., Shan, Y., Cekander, R., Sawhney, H., Kumar, R.: A two-stage approach to people and vehicle detection with hog-based svm. In: Proceedings of the Performance Metrics for Intelligent Systems Workshop (2006)
Everingham, M., Van Gool, L., Williams, C., Winn, J., Zisserman, A.: The Pascal visual object classes (voc) challenge. Int. J. Comput. Vis. 88, 303–338 (2010)
Sergeant, D., Boyle, R., Forbes, M.: Computer visual tracking of poultry. Comput. Electron. Agric. 21, 1–18 (1998)
Steen, K., Therkildsen, O., Green, O., Karstoft, H.: Detection of bird nests during mechanical weeding by incremental background modeling and visual saliency. Sensors 15(3), 5096–5111 (2015)
Wu, X., Yuan, P., Peng, Q., Ngo, C., He, J.: Detection of bird nests in overhead catenary system images for high-speed rail. Pattern Recogn. 51, 242–254 (2016)
Krizhevsky, A., Sutskever, I., Hinton, G.: Imagenet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems (NIPS), pp. 1097–1105 (2012)
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A.: Going deeper with convolutions. In: Computer Vision and Pattern Recognition (CVPR) (2015)
Sermanet, P., Eigen, D., Zhang, X., Mathieu, M., Fergus, R., LeCun, Y.: Overfeat: Integrated recognition, localization and detection using convolutional networks. arXiv preprint arXiv:1312.6229 (2013)
Uijlings, J., van de Sande, K., Gevers, T., Smeulders, A.: Selective search for object recognition. Int. J. Comput. Vis. 104, 154–171 (2013)
Zitnick, C.L., Dollár, P.: Edge boxes: locating object proposals from edges. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 391–405. Springer, Heidelberg (2014). doi:10.1007/978-3-319-10602-1_26
Benenson, R., Mathias, M., Timofte, R., Van Gool, L.: Pedestrian detection at 100 frames per second. In: Computer Vision and Pattern Recognition (CVPR) (2012)
Mathias, M., Timofte, R., Benenson, R., Van Gool, L.: Traffic sign recognition - how far are we from the solution? In: International Joint Conference on Neural Networks (2013)
Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems (NIPS), pp. 91–99 (2015)
Lenc, K., Vedaldi, A.: R-CNN minus R. arXiv preprint arXiv:1506.06981 (2015)
Wu, Y., Lim, J., Yang, M.: Online object tracking: a benchmark. In: Computer Vision and Pattern Recognition (CVPR) (2013)
Wang, N., Yeung, D.: Learning a deep compact image representation for visual tracking. In: Advances in Neural Information Processing Systems (NIPS) (2013)
Li, H., Li, Y., Porikli, F.: Deeptrack: learning discriminative feature representations by convolutional neural networks for visual tracking. In: British Machine Vision Conference (BMVC) (2014)
Wang, L., Ouyang, W., Wang, X., Lu, H.: Visual tracking with fully convolutional networks. In: International Conference on Computer Vision (ICCV) (2015)
Dugas, C., Bengio, Y., Bélisle, F., Nadeau, C., Garcia, R.: Incorporating second-order functional knowledge for better option pricing. In: Advances in Neural Information Processing Systems (NIPS), pp. 472–478 (2001)
Henriques, J.F., Caseiro, R., Martins, P., Batista, J.: Exploiting the circulant structure of tracking-by-detection with kernels. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, vol. 7575, pp. 702–715. Springer, Heidelberg (2012). doi:10.1007/978-3-642-33765-9_50
Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., Darrell, T.: Caffe: convolutional architecture for fast feature embedding. In: Proceedings of the ACM International Conference on Multimedia, pp. 675–678. ACM (2014)
Pascal voc challenge performance evaluation and download server - detection, 31 January 2016. http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11&compid=4
Huang, L., Yang, Y., Deng, Y., Yu, Y.: Densebox: unifying landmark localization with end to end object detection. arXiv preprint arXiv:1509.04874 (2015)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. arXiv preprint arXiv:1512.03385 (2015)
Zhang, K., Zhang, L., Yang, M.-H.: Real-time compressive tracking. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, vol. 7574, pp. 864–877. Springer, Heidelberg (2012). doi:10.1007/978-3-642-33712-3_62
Zhong, W., Lu, H., Yang, M.: Robust object tracking via sparsity-based collaborative model. In: Computer Vision and Pattern Recognition (CVPR) (2012)
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Wang, Q., Rasmussen, C., Song, C. (2016). Fast, Deep Detection and Tracking of Birds and Nests. In: Bebis, G., et al. Advances in Visual Computing. ISVC 2016. Lecture Notes in Computer Science, vol 10072. Springer, Cham. https://doi.org/10.1007/978-3-319-50835-1_14
DOI: https://doi.org/10.1007/978-3-319-50835-1_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-50834-4
Online ISBN: 978-3-319-50835-1
eBook Packages: Computer Science, Computer Science (R0)