Visual Cue Cluster Construction via Information Bottleneck Principle and Kernel Density Estimation

Hsu, Winston H.; Chang, Shih-Fu

doi:10.1007/11526346_12

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3568)

Included in the following conference series:

International Conference on Image and Video Retrieval

Abstract

Recent research in video analysis has shown a promising direction in which mid-level features (e.g., people, anchor, indoor) are abstracted from low-level features (e.g., color, texture, motion) and used for discriminative classification of semantic labels. In most systems, however, such mid-level features are selected manually. In this paper, we propose an information-theoretic framework, visual cue cluster construction (VC³), to automatically discover adequate mid-level features. The problem is posed as mutual information maximization, through which optimal cue clusters are discovered to preserve the highest information about the semantic labels. We extend the Information Bottleneck framework to high-dimensional continuous features and further propose a projection method to map each video into probabilistic memberships over all the cue clusters. The biggest advantage of the proposed approach is that it removes the dependence on manually choosing the mid-level features and the large labor cost of annotating the training corpus needed to train a detector for each mid-level feature. The proposed VC³ framework is general and effective, and shows potential for solving other problems in semantic video analysis. When tested on news video story segmentation, the proposed approach achieves a promising performance gain over representations derived from conventional clustering techniques and even over manually selected mid-level features.
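
The mutual-information objective mentioned in the abstract can be illustrated with a small sketch. This is not the authors' VC³ algorithm: it only computes I(C;Y) between a discrete cluster variable C and a label variable Y, and shows that merging clusters with similar label profiles loses less information than merging dissimilar ones, which is the intuition behind Information Bottleneck style clustering. The joint table, the two-label setup, and the function names `mutual_information` and `merge_clusters` are hypothetical and chosen only for illustration.

```python
import math


def mutual_information(joint):
    """Return I(C;Y) in bits for a joint table where joint[c][y] = p(c, y)."""
    p_c = [sum(row) for row in joint]            # marginal over clusters
    p_y = [sum(col) for col in zip(*joint)]      # marginal over labels
    mi = 0.0
    for c, row in enumerate(joint):
        for y, p in enumerate(row):
            if p > 0.0:                          # 0 * log(0) is treated as 0
                mi += p * math.log2(p / (p_c[c] * p_y[y]))
    return mi


def merge_clusters(joint, i, j):
    """Return a new joint table with clusters i and j merged into one row."""
    merged = [a + b for a, b in zip(joint[i], joint[j])]
    rest = [row for k, row in enumerate(joint) if k not in (i, j)]
    return rest + [merged]


# Made-up joint distribution p(c, y): 3 cue clusters x 2 semantic labels
# (for example, story boundary vs. non-boundary). Entries sum to 1.
joint = [
    [0.30, 0.05],  # cluster 0: mostly label 0
    [0.28, 0.07],  # cluster 1: label profile close to cluster 0
    [0.05, 0.25],  # cluster 2: mostly label 1
]

full = mutual_information(joint)
similar = mutual_information(merge_clusters(joint, 0, 1))
dissimilar = mutual_information(merge_clusters(joint, 0, 2))

print(f"I(C;Y), 3 clusters:             {full:.4f} bits")
print(f"after merging clusters 0 and 1: {similar:.4f} bits")
print(f"after merging clusters 0 and 2: {dissimilar:.4f} bits")
```

Merging can never increase I(C;Y), since the merged variable is a deterministic function of the original one. A clustering procedure that seeks to preserve label information would therefore prefer the merge with the smaller loss, here clusters 0 and 1.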

Author information

Authors and Affiliations

Dept. of Electrical Engineering, Columbia University, New York, NY, 10027, USA
Winston H. Hsu & Shih-Fu Chang

Editor information

Editors and Affiliations

Dept. of Computer Science, National University of Singapore, Computing 1, 117590, Singapore
Wee-Kheng Leow
LIACS Media Lab, Leiden University
Michael S. Lew & Erwin M. Bakker
National University of Singapore, 3 Science Dr, 117543, Singapore
Tat-Seng Chua
Microsoft Research Asia, 4F, Sigma Center, No.49, Zhichun Road, 100080, Beijing, P.R.China
Wei-Ying Ma
School of Computing, National University of Singapore, 3 Science Drive 2, 117543, Singapore
Lekha Chaisorn

About this paper

Cite this paper

Hsu, W.H., Chang, SF. (2005). Visual Cue Cluster Construction via Information Bottleneck Principle and Kernel Density Estimation. In: Leow, WK., Lew, M.S., Chua, TS., Ma, WY., Chaisorn, L., Bakker, E.M. (eds) Image and Video Retrieval. CIVR 2005. Lecture Notes in Computer Science, vol 3568. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11526346_12

DOI: https://doi.org/10.1007/11526346_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-27858-0
Online ISBN: 978-3-540-31678-7
eBook Packages: Computer Science, Computer Science (R0)
