Watch Out: Embedded Video Tracking with BST for Unmanned Aerial Vehicles

Journal of Signal Processing Systems

Abstract

The paper presents the development of a real-time tracking system, named Watch Out, that runs efficiently on an Nvidia Jetson board mounted on a UAV (Unmanned Aerial Vehicle). The approach to long-term video tracking implemented in Watch Out is named Best Structured Tracker (BST): a set of local trackers independently tracks patches of the original target in an online learning manner, an outlier detection procedure filters out the less meaningful ones, and a resampling procedure correctly reinitialises the trackers that have been filtered out. The performance of the tracking algorithm has been verified both on the VOT2016 challenge datasets and in real situations using an Nvidia Jetson board mounted on a drone. Results show that the proposed system can track almost every possible target in real time.
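The filter-and-resample loop described in the abstract can be illustrated with a small sketch. This is not the BST implementation: the abstract does not specify the outlier test or the resampling scheme, so the code below assumes a median-displacement test with a pixel threshold and uniform resampling inside the updated box. The function name `filter_and_resample`, the parameter `k`, and the 1-pixel floor are all hypothetical choices made for illustration.

```python
import random
from statistics import median


def filter_and_resample(prev_pts, new_pts, box, k=3.0, rng=None):
    """One illustrative frame step for a set of local patch trackers.

    prev_pts / new_pts: patch centres (x, y) before and after local tracking.
    box: target bounding box as (x, y, width, height).
    k: assumed outlier threshold, in multiples of the median residual.
    """
    rng = rng or random.Random(0)

    # Displacement reported by each local tracker.
    dx = [n[0] - p[0] for p, n in zip(prev_pts, new_pts)]
    dy = [n[1] - p[1] for p, n in zip(prev_pts, new_pts)]

    # Assumption: the target motion is the median displacement.
    mx, my = median(dx), median(dy)

    # Residual of each tracker with respect to the median motion.
    residuals = [max(abs(a - mx), abs(b - my)) for a, b in zip(dx, dy)]

    # Trackers far from the consensus are treated as outliers.
    # The floor of 1 pixel avoids a zero threshold when most agree exactly.
    threshold = k * max(median(residuals), 1.0)
    inliers = [r <= threshold for r in residuals]

    # Shift the box by the consensus motion.
    x, y, w, h = box
    new_box = (x + mx, y + my, w, h)

    # Reinitialise rejected trackers at random positions inside the new box.
    pts = [
        n if ok else (rng.uniform(new_box[0], new_box[0] + w),
                      rng.uniform(new_box[1], new_box[1] + h))
        for n, ok in zip(new_pts, inliers)
    ]
    return new_box, pts, inliers


prev = [(10, 10), (20, 10), (10, 20), (20, 20), (15, 15)]
new = [(12, 11), (22, 11), (12, 21), (22, 21), (80, 90)]  # last patch drifted
box, pts, inliers = filter_and_resample(prev, new, (5, 5, 20, 20))
print(box, inliers)
```

In this toy run the four consistent patches agree on a shift of (2, 1), the drifting fifth patch is rejected, and it is re-seeded inside the shifted box. A real system would also retrain the appearance model of each reinitialised patch, which is omitted here.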

Author information

Corresponding author

Correspondence to Alfredo Petrosino.

Cite this article

Battistone, F., Petrosino, A. & Santopietro, V. Watch Out: Embedded Video Tracking with BST for Unmanned Aerial Vehicles. J Sign Process Syst 90, 891–900 (2018). https://doi.org/10.1007/s11265-017-1279-x
