DOI: 10.1145/3573910.3573916
research-article

DSGA: Distractor-Suppressing Graph Attention for Multi-object Tracking

Published: 20 January 2023

ABSTRACT

Multiple object tracking (MOT) methods built on single object tracking (SOT) are of great interest because the localization capability of SOT lets them balance efficiency and performance. However, most SOT methods only distinguish foreground from background, so they are easily misled by similar interfering objects (distractors) during localization. In MOT scenarios there are more distractors and their influence is more severe. We therefore propose Distractor-Suppressing Graph Attention (DSGA), which learns more discriminative attention by reducing the influence of distractors on the learning of attention weight features. To verify its effectiveness, we embed DSGA into the basic MOT framework "SiamMOT", forming DSGA-SiamMOT, and apply it to multiple object tracking. In experiments on the MOT Challenge benchmark with "public detection", DSGA-SiamMOT obtains 66.65% MOTA and 62.2% IDF1 on the MOT17 dataset at 14 fps.
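
For readers unfamiliar with the two scores reported above, the following is a minimal sketch of how MOTA and IDF1 are commonly defined in the MOT literature: MOTA = 1 - (FN + FP + IDSW) / GT, and IDF1 = 2·IDTP / (2·IDTP + IDFP + IDFN). The function names and the counts in the usage note are illustrative assumptions, not values or code from the paper, and the official benchmark evaluation may differ in details such as how detections are matched to ground truth.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """Multiple Object Tracking Accuracy from counts summed over all frames.

    MOTA = 1 - (FN + FP + IDSW) / GT, where GT is the total number of
    ground-truth boxes. The value can be negative for very poor trackers.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_gt


def idf1(idtp, idfp, idfn):
    """Identity F1 score from identity-level true/false positives and negatives.

    IDF1 = 2 * IDTP / (2 * IDTP + IDFP + IDFN).
    Returns 0.0 when all counts are zero (assumed convention for this sketch).
    """
    denom = 2 * idtp + idfp + idfn
    return 2 * idtp / denom if denom else 0.0
```

As a usage illustration with made-up counts, `mota(200, 100, 50, 1000)` evaluates to 0.65 and `idf1(600, 200, 400)` to roughly 0.667.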


Published in

ICRAI '22: Proceedings of the 8th International Conference on Robotics and Artificial Intelligence
November 2022
89 pages
ISBN: 9781450397544
DOI: 10.1145/3573910

Copyright © 2022 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


        Qualifiers

        • research-article
        • Research
        • Refereed limited
