research-article

Instance of Interest Detection

Authors:

Gangshan Wu

MM '19: Proceedings of the 27th ACM International Conference on Multimedia

Pages 1997 - 2005

https://doi.org/10.1145/3343031.3350931

Published: 15 October 2019

Abstract

In this paper, we propose a novel task named Instance of Interest Detection (IOID) to provide instance-level user interest modeling for image semantic description. IOID focuses on extracting the instances that are beneficial for representing image content, whereas related tasks such as saliency analysis, attention models, and instance segmentation extract regions that attract visual attention or belong to a predefined category. To this end, we propose a Cross-influential Network for IOID, which integrates both visual saliency and semantic context. Moreover, we contribute the first dataset for IOID evaluation, which consists of 45,000 images from MSCOCO with manually annotated instances of interest. Our method outperforms the state-of-the-art baselines on this dataset.

Index Terms

Instance of Interest Detection
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision


Published In

MM '19: Proceedings of the 27th ACM International Conference on Multimedia

October 2019

2794 pages

ISBN: 9781450368896

DOI: 10.1145/3343031

General Chairs: Laurent Amsaleg (CNRS-IRISA, France), Benoit Huet (EURECOM, France), Martha Larson (Radboud University and TU Delft, Netherlands)

Program Chairs: Guillaume Gravier (CNRS-IRISA, France), Hayley Hung (Delft University of Technology, Netherlands), Chong-Wah Ngo (City University of Hong Kong, Hong Kong), Wei Tsang Ooi (National University of Singapore, Singapore)

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

Collaborative Innovation Center of Novel Software Technology and Industrialization
National Science Foundation of China

Conference

MM '19

Sponsor:

SIGMM

MM '19: The 27th ACM International Conference on Multimedia

October 21 - 25, 2019

Nice, France


Bibliometrics

Total Citations: 0
Total Downloads: 188
Downloads (last 12 months): 6
Downloads (last 6 weeks): 0

Reflects downloads up to 05 Mar 2025
