DOI:10.1145/3343031.3350931

Instance of Interest Detection

Published: 15 October 2019

Abstract

In this paper, we propose a novel task named Instance of Interest Detection (IOID) to provide instance-level user interest modeling for image semantic description. IOID focuses on extracting the instances that are beneficial for representing image content, whereas related tasks such as saliency analysis, attention models, and instance segmentation extract regions that attract visual attention or belong to a predefined category. To this end, we propose a Cross-influential Network for IOID, which integrates both visual saliency and semantic context. Moreover, we contribute the first dataset for IOID evaluation, which consists of 45,000 images from MSCOCO with manually annotated instances of interest. Our method outperforms the state-of-the-art baselines on this dataset.

    Published In

    MM '19: Proceedings of the 27th ACM International Conference on Multimedia
    October 2019
    2794 pages
    ISBN:9781450368896
    DOI:10.1145/3343031
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. instance extraction
    2. instance of interest
    3. instance of interest annotation
    4. instance of interest detection
    5. interest estimation

    Qualifiers

    • Research-article

    Funding Sources

    • Collaborative Innovation Center of Novel Software Technology and Industrialization
    • National Science Foundation of China

    Conference

    MM '19