DOI: 10.1145/3474085.3475231
research-article

Auto-MSFNet: Search Multi-scale Fusion Network for Salient Object Detection

Published: 17 October 2021

ABSTRACT

Multi-scale feature fusion plays a critical role in salient object detection. Most existing methods achieve remarkable performance by exploiting various multi-scale feature fusion strategies. However, designing an elegant fusion framework requires expert knowledge and experience and relies heavily on laborious trial and error. In this paper, we propose a multi-scale feature fusion framework based on Neural Architecture Search (NAS), named Auto-MSFNet. First, we design a novel search cell, named FusionCell, that automatically decides how multi-scale features are aggregated. Rather than searching for a single cell that is stacked repeatedly, we allow different FusionCells to flexibly integrate multi-level features. In addition, because features generated by CNNs are naturally organized along spatial and channel dimensions, we propose a new search space that efficiently focuses on the most relevant information. This search space mitigates the incomplete object structures and over-predicted foreground regions caused by progressive fusion. Second, we propose a progressive polishing loss that yields finer boundaries by penalizing the misalignment of salient object boundaries. Extensive experiments on five benchmark datasets demonstrate the effectiveness of the proposed method, which achieves state-of-the-art performance on four evaluation metrics. The code and results of our method are available at https://github.com/OIPLab-DUT/Auto-MSFNet.
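To make the idea of searching over spatial and channel-wise operations concrete, the following is a minimal sketch and not the authors' FusionCell. It assumes a DARTS-style continuous relaxation, in which each candidate operation is weighted by a softmax over architecture parameters; the abstract does not state which search strategy is used. The candidate set, the parameter-free attention gates, and all names (`mixed_op`, `derive_op`, `CANDIDATE_OPS`) are hypothetical and chosen only for illustration. Feature maps are plain nested lists indexed as `[channel][row][column]`.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def identity(x):
    """Pass the feature map through unchanged (returns a copy)."""
    return [[row[:] for row in ch] for ch in x]


def channel_attention(x):
    """Scale each channel by a gate computed from its global average."""
    out = []
    for ch in x:
        count = sum(len(row) for row in ch)
        gate = sigmoid(sum(sum(row) for row in ch) / count)
        out.append([[gate * v for v in row] for row in ch])
    return out


def spatial_attention(x):
    """Scale each location by a gate computed from the mean over channels."""
    c, h, w = len(x), len(x[0]), len(x[0][0])
    gates = [[sigmoid(sum(x[k][i][j] for k in range(c)) / c)
              for j in range(w)] for i in range(h)]
    return [[[gates[i][j] * x[k][i][j] for j in range(w)]
             for i in range(h)] for k in range(c)]


# Hypothetical candidate set; the paper's actual search space may differ.
CANDIDATE_OPS = {
    "identity": identity,
    "channel_attention": channel_attention,
    "spatial_attention": spatial_attention,
}


def mixed_op(x, alphas):
    """Softmax-weighted sum of all candidate operations applied to x."""
    weights = softmax(alphas)
    outputs = [op(x) for op in CANDIDATE_OPS.values()]
    c, h, w = len(x), len(x[0]), len(x[0][0])
    return [[[sum(wt * out[k][i][j] for wt, out in zip(weights, outputs))
              for j in range(w)] for i in range(h)] for k in range(c)]


def derive_op(alphas):
    """After search, keep the candidate with the largest architecture weight."""
    names = list(CANDIDATE_OPS)
    return names[max(range(len(alphas)), key=lambda i: alphas[i])]


if __name__ == "__main__":
    features = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]]
    alphas = [0.1, 2.0, -1.0]  # one architecture parameter per candidate
    print(mixed_op(features, alphas))
    print(derive_op(alphas))
```

In an actual search the architecture parameters would be learned by gradient descent jointly with the network weights, and each fusion position would hold its own parameter vector, so different cells could select different operations.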


Published in

MM '21: Proceedings of the 29th ACM International Conference on Multimedia
October 2021
5796 pages
ISBN: 9781450386517
DOI: 10.1145/3474085

Copyright © 2021 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 995 of 4,171 submissions, 24%
