Abstract
Deep learning methods have demonstrated encouraging performance on open-air visual object tracking (VOT) benchmarks; however, their strength remains unexplored on underwater video sequences due to the lack of challenging underwater VOT benchmarks. Beyond the usual open-air tracking challenges, videos captured in underwater environments pose additional difficulties such as low visibility, poor video quality, distortions in sharpness and contrast, reflections from suspended particles, and non-uniform lighting. In this work, we propose a new Underwater Tracking Benchmark (UTB180) dataset consisting of 180 sequences to facilitate the development of underwater deep trackers. The sequences in UTB180 are selected from both natural underwater footage and online sources and comprise over 58,000 annotated frames. Video-level attributes are also provided to facilitate the development of robust trackers for specific challenges. We benchmark 15 existing pre-trained state-of-the-art (SOTA) trackers on UTB180 and compare their performance with that on another publicly available underwater benchmark. The trackers consistently perform worse on UTB180, showing that it poses more challenging scenarios. Moreover, we show that fine-tuning five high-quality SOTA trackers on UTB180 still does not sufficiently boost their tracking performance. Our experiments show that the UTB180 sequences pose a major challenge to SOTA trackers compared with their open-air tracking performance. This performance gap reveals the need for a dedicated end-to-end underwater deep tracker that accounts for the inherent properties of underwater environments. We believe the proposed dataset will be of great value to the tracking community in advancing the SOTA in underwater VOT. Our dataset is publicly available on Kaggle.
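The abstract does not detail the benchmark protocol, but the standard VOT evaluation it alludes to scores a tracker by the overlap between its predicted boxes and the ground-truth annotations. A minimal sketch of an IoU-based success metric, with hypothetical helper names (`iou`, `success_rate`) that are not from the paper:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width/height of the intersection rectangle (zero if the boxes are disjoint).
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def success_rate(pred_boxes, gt_boxes, threshold=0.5):
    """Fraction of frames whose predicted box overlaps ground truth above threshold."""
    overlaps = [iou(p, g) for p, g in zip(pred_boxes, gt_boxes)]
    return sum(o >= threshold for o in overlaps) / len(overlaps)

# Toy example: two frames, one accurate prediction and one drifted one.
gt = [(10, 10, 40, 40), (12, 12, 40, 40)]
pred = [(10, 10, 40, 40), (60, 60, 40, 40)]
print(success_rate(pred, gt, threshold=0.5))  # → 0.5
```

Sweeping the threshold from 0 to 1 and plotting the resulting success rates yields the success plot commonly used to rank trackers on benchmarks such as UTB180.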
Acknowledgments
This publication acknowledges the support provided by the Khalifa University of Science and Technology under Faculty Start-Up grant FSU-2022-003, Award No. 8474000401.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Alawode, B. et al. (2023). UTB180: A High-Quality Benchmark for Underwater Tracking. In: Wang, L., Gall, J., Chin, TJ., Sato, I., Chellappa, R. (eds) Computer Vision – ACCV 2022. ACCV 2022. Lecture Notes in Computer Science, vol 13845. Springer, Cham. https://doi.org/10.1007/978-3-031-26348-4_26
DOI: https://doi.org/10.1007/978-3-031-26348-4_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-26347-7
Online ISBN: 978-3-031-26348-4
eBook Packages: Computer Science (R0)