Gaze Shifting Kernel: Engineering Perceptually-Aware Features for Scene Categorization

Zhang, Luming; Hong, Richang; Wang, Meng

doi:10.1007/978-3-319-24075-6_25

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9314)

Included in the following conference series:

Pacific Rim Conference on Multimedia

Abstract

In this paper, we propose a novel gaze shifting kernel for scene image categorization, focusing on how humans perceive visually or semantically salient regions in a scene. First, a weakly supervised embedding algorithm projects the local image descriptors (i.e., graphlets) into a pre-specified semantic space, so that each graphlet can be represented by multiple visual features at both the low level and the high level. Because humans typically attend to only a small fraction of the regions in a scene, we propose a sparsity-constrained graphlet ranking algorithm that dynamically integrates the low-level and the high-level visual cues. The top-ranked graphlets are visually or semantically salient according to human perception, and they are linked into a path to simulate human gaze shifting. Finally, we calculate the gaze shifting kernel (GSK) from the paths discovered in a set of images. Experiments on the USC scene and the ZJU aerial image data sets demonstrate the competitiveness of our GSK, as well as the high consistency of the predicted paths with real human gaze shifting paths.
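The abstract does not give the ranking objective or the kernel formula, so the following is only an illustrative sketch of the last two steps (linking top-ranked regions into a path, then comparing paths with a kernel), not the authors' method. It assumes that saliency scores are already available for each graphlet, that a path is simply the k highest-scoring graphlet feature vectors in descending score order, and that two equal-length paths are compared by averaging a Gaussian (RBF) similarity over same-rank steps. The names `build_path`, `path_kernel`, and the parameter `gamma` are hypothetical.

```python
import math


def build_path(features, scores, k):
    """Return the feature vectors of the k highest-scoring regions,
    ordered from most to least salient (ties keep the lower index first).
    ASSUMPTION: scores stand in for the output of a ranking step."""
    if len(features) != len(scores):
        raise ValueError("features and scores must have the same length")
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return [features[i] for i in order[:k]]


def rbf(u, v, gamma=1.0):
    """Gaussian similarity between two feature vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def path_kernel(path_a, path_b, gamma=1.0):
    """Average RBF similarity between same-rank steps of two paths.
    ASSUMPTION: illustrative stand-in for a path kernel, not the GSK itself.
    Paths must have equal length so that steps can be aligned by rank."""
    if len(path_a) != len(path_b):
        raise ValueError("paths must have the same length")
    if not path_a:
        return 0.0
    total = sum(rbf(a, b, gamma) for a, b in zip(path_a, path_b))
    return total / len(path_a)


# Toy usage: four regions with 2-D features and two different score vectors.
features = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
path_a = build_path(features, [0.1, 0.9, 0.4, 0.7], k=3)
path_b = build_path(features, [0.9, 0.1, 0.4, 0.7], k=3)
similarity = path_kernel(path_a, path_b)
```

Fixing the path length k keeps the comparison a sum of standard kernels over aligned steps; a matrix of such values over a set of images could then be passed to any kernel classifier that accepts a precomputed kernel.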

Author information

Authors and Affiliations

School of Computer and Information, Hefei University of Technology, Hefei, China
Luming Zhang, Richang Hong & Meng Wang

Corresponding author

Correspondence to Luming Zhang.

Editor information

Editors and Affiliations

Gwangju Institute of Science and Technology, Gwangju, Korea (Republic of)
Yo-Sung Ho
Chinese Academy of Sciences, Institute of Automation, Beijing, China
Jitao Sang
ICU, IVY Lab, KAIST, Daejeon, Korea (Republic of)
Yong Man Ro
KAIST, Daejeon, Korea (Republic of)
Junmo Kim
College of Computer Science, Zhejiang University, Hangzhou, China
Fei Wu

About this paper

Cite this paper

Zhang, L., Hong, R., Wang, M. (2015). Gaze Shifting Kernel: Engineering Perceptually-Aware Features for Scene Categorization. In: Ho, YS., Sang, J., Ro, Y., Kim, J., Wu, F. (eds) Advances in Multimedia Information Processing -- PCM 2015. PCM 2015. Lecture Notes in Computer Science, vol 9314. Springer, Cham. https://doi.org/10.1007/978-3-319-24075-6_25

DOI: https://doi.org/10.1007/978-3-319-24075-6_25
Published: 22 November 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24074-9
Online ISBN: 978-3-319-24075-6
eBook Packages: Computer Science, Computer Science (R0)
