research-article

Auto-MSFNet: Search Multi-scale Fusion Network for Salient Object Detection

Authors:
Miao Zhang, Dalian University of Technology, Dalian, China
Tingwei Liu, Dalian University of Technology, Dalian, China
Yongri Piao, Dalian University of Technology, Dalian, China
Shunyu Yao, Dalian University of Technology, Dalian, China
Huchuan Lu, Dalian University of Technology & Pengcheng Lab, Dalian, China

MM '21: Proceedings of the 29th ACM International Conference on Multimedia, October 2021, Pages 667–676. https://doi.org/10.1145/3474085.3475231

Published: 17 October 2021

ABSTRACT

Multi-scale feature fusion plays a critical role in salient object detection, and most existing methods achieve strong performance by exploiting some multi-scale feature fusion strategy. However, designing an effective fusion framework requires expert knowledge and experience and relies heavily on laborious trial and error. In this paper, we propose Auto-MSFNet, a multi-scale feature fusion framework based on Neural Architecture Search (NAS). First, we design a novel search cell, named FusionCell, that automatically decides how multi-scale features are aggregated. Rather than searching for a single cell that is stacked repeatedly, we allow different FusionCells to flexibly integrate multi-level features. In addition, because features generated by CNNs have both spatial and channel-wise structure, we propose a new search space that efficiently focuses on the most relevant information. This search space mitigates the incomplete object structures and over-predicted foreground regions caused by progressive fusion. Second, we propose a progressive polishing loss that yields finer boundaries by penalizing misalignment of salient object boundaries. Extensive experiments on five benchmark datasets demonstrate the effectiveness of the proposed method, which achieves state-of-the-art performance on four evaluation metrics. The code and results of our method are available at https://github.com/OIPLab-DUT/Auto-MSFNet.
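
The abstract does not say how the FusionCell search is carried out. As a rough illustration only, the sketch below shows the continuous relaxation commonly used in differentiable architecture search: every candidate fusion operation is applied, the outputs are blended with softmax weights over learnable architecture parameters, and the highest-weighted operation is kept once the search ends. The candidate operations (`add`, `max`, `mul`), the function names, and the use of flat lists in place of feature maps are all assumptions made for this example. None of it is taken from the Auto-MSFNet code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical candidate fusion operations on two equally sized feature vectors.
# These are illustrative choices, not the search space used in the paper.
CANDIDATE_OPS = {
    "add": lambda a, b: [x + y for x, y in zip(a, b)],
    "max": lambda a, b: [max(x, y) for x, y in zip(a, b)],
    "mul": lambda a, b: [x * y for x, y in zip(a, b)],
}


def mixed_fusion(feat_a, feat_b, alphas):
    """Blend the outputs of all candidate ops with softmax(alphas) as weights."""
    weights = softmax(alphas)
    outputs = [op(feat_a, feat_b) for op in CANDIDATE_OPS.values()]
    return [
        sum(w * out[i] for w, out in zip(weights, outputs))
        for i in range(len(feat_a))
    ]


def derive_op(alphas):
    """After the search, keep the op with the largest architecture parameter."""
    names = list(CANDIDATE_OPS)
    best = max(range(len(alphas)), key=lambda i: alphas[i])
    return names[best]


if __name__ == "__main__":
    fused = mixed_fusion([1.0, 2.0], [3.0, 0.5], alphas=[0.0, 0.0, 0.0])
    print(fused)
    print(derive_op([0.1, 2.0, -1.0]))
```

In an actual search the architecture parameters would be optimized by gradient descent together with the network weights. Here they are fixed numbers so that the blending and selection steps can be read in isolation.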

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Image segmentation
  2. Machine learning
    1. Machine learning approaches
      1. Neural networks

Published in
MM '21: Proceedings of the 29th ACM International Conference on Multimedia
October 2021
5796 pages
ISBN:9781450386517
DOI:10.1145/3474085
General Chairs:
Heng Tao Shen, University of Electronic Science and Technology of China, China
Yueting Zhuang, Zhejiang University, China
John R. Smith, IBM, USA
Program Chairs:
Yang Yang, University of Electronic Science and Technology of China, China
Pablo Cesar, CWI & TU Delft, The Netherlands
Florian Metze, FACEBOOK, Inc., USA
Balakrishnan Prabhakaran, University of Texas at Dallas, USA
Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
boundary loss
multi-scale features fusion
neural architecture search
salient object detection
Acceptance Rates
Overall Acceptance Rate: 995 of 4,171 submissions, 24%