ABSTRACT
TV audiences often want to quickly browse the scenes in which certain actors appear. Since 2016, the TREC Video Retrieval Evaluation (TRECVID) Instance Search (INS) task has focused on retrieving a target person in a target scene simultaneously. In this paper, we call this task P-S INS (Person-Scene Instance Search). To find P-S instances, most approaches search for the person and the scene separately and then combine the two results directly by addition or multiplication. However, we find that the person and scene INS modules are not always effective at the same time, and in some situations they suppress each other, so aggregating the results shot by shot is not a good choice. Fortunately, the video shots of a TV series are arranged in chronological order. We therefore extend our focus from a time point (a single video shot) to a time slice (multiple consecutive video shots) on the timeline. By detecting salient time slices, we prune the data. By evaluating the importance of each salient time slice, we boost the aggregation results. Extensive experiments on the large-scale TRECVID INS dataset demonstrate the effectiveness of the proposed method.
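The idea of moving from shot-level to slice-level aggregation can be illustrated with a minimal sketch. The abstract does not specify how salient time slices are detected or how their importance is computed, so everything below is an assumption made for illustration: the threshold `thr`, the rule that a slice is a maximal run of consecutive shots in which either score reaches `thr`, the importance measure (strongest person score times strongest scene score in the slice), and the function names `find_slices` and `prune_and_boost`.

```python
from typing import List, Tuple


def find_slices(person: List[float], scene: List[float],
                thr: float) -> List[Tuple[int, int]]:
    """Return maximal runs [start, end) of consecutive shots in which
    the person score or the scene score reaches thr (assumed rule)."""
    slices = []
    start = None
    for i, (p, s) in enumerate(zip(person, scene)):
        active = p >= thr or s >= thr
        if active and start is None:
            start = i                      # a new slice begins
        elif not active and start is not None:
            slices.append((start, i))      # the current slice ends
            start = None
    if start is not None:
        slices.append((start, len(person)))
    return slices


def prune_and_boost(person: List[float], scene: List[float],
                    thr: float = 0.5) -> List[float]:
    """Prune shots outside every slice and boost the rest by slice importance."""
    fused = [0.0] * len(person)            # pruned shots keep a score of 0
    for start, end in find_slices(person, scene, thr):
        # Assumed importance: best person evidence times best scene evidence
        importance = max(person[start:end]) * max(scene[start:end])
        for i in range(start, end):
            # Additive shot-level fusion, scaled by the slice importance
            fused[i] = (person[i] + scene[i]) * importance
    return fused


# Shots in chronological order; the person and scene modules rarely
# fire on the same shot, so shot-level multiplication scores them low.
person_scores = [0.1, 0.9, 0.2, 0.1, 0.1, 0.6]
scene_scores = [0.2, 0.1, 0.8, 0.1, 0.7, 0.1]
fused_scores = prune_and_boost(person_scores, scene_scores)
```

In this toy input, shot 1 has a strong person score and shot 2 a strong scene score. Multiplying the two scores per shot would rate both shots low, whereas treating them as one slice lets the evidence from neighbouring shots reinforce each other.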
Index Terms
- Salient Time Slice Pruning and Boosting for Person-Scene Instance Search in TV Series