TV ad video categorization with probabilistic latent concept learning

Authors:

Jesse S. Jin

MIR '07: Proceedings of the international workshop on Workshop on multimedia information retrieval

Pages 217 - 226

https://doi.org/10.1145/1290082.1290113

Published: 24 September 2007

Abstract

In this paper we present a multi-modal approach to classifying TV ads by the products or services they advertise. We propose a bag-of-words representation and use probabilistic latent semantic analysis (PLSA) to discover latent visual and textual concepts related to ad categories. These multi-modal concepts represent ad categories in the latent semantic space. In particular, we draw on external resources (e.g., a brand list and an encyclopedia) to expand the sparse textual information. Finally, semi-supervised co-training fuses the visual and textual features for ad classification. Our experiments achieved promising results in terms of classification accuracy and scalability to new ad categories. The resulting ad classifiers can be applied to digest ads from TV streams, which helps TV viewers manage ads in a positive manner. The digested ads can serve as video-based alerts for emerging products and services, thereby improving the reachability and focus of TV ads.
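
The PLSA step named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it is a minimal pure-Python EM fit of the standard PLSA model P(w|d) = sum over z of P(w|z) P(z|d), run on a made-up count matrix whose rows stand for ads and whose columns stand for visual or textual words. The function name `plsa`, its parameters, and the toy counts are all assumptions introduced for illustration.

```python
import math
import random


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit PLSA with EM on a document-by-word count matrix (list of lists).

    Returns (p_z_d, p_w_z, log_liks): P(z|d) for each document, P(w|z) for
    each latent concept, and the log-likelihood before each EM update.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalize(row):
        total = sum(row)
        if total <= 0:  # no mass assigned: fall back to a uniform distribution
            return [1.0 / len(row)] * len(row)
        return [v / total for v in row]

    # Random positive initialisation of P(z|d) and P(w|z).
    p_z_d = [normalize([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() + 0.1 for _ in range(n_words)])
             for _ in range(n_topics)]

    log_liks = []
    for _ in range(n_iter):
        acc_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        acc_w_z = [[0.0] * n_words for _ in range(n_topics)]
        log_lik = 0.0
        for d in range(n_docs):
            for w in range(n_words):
                n_dw = counts[d][w]
                if n_dw == 0:
                    continue
                # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
                joint = [p_z_d[d][z] * p_w_z[z][w] for z in range(n_topics)]
                p_w_d = sum(joint)
                log_lik += n_dw * math.log(p_w_d)
                for z in range(n_topics):
                    expected = n_dw * joint[z] / p_w_d
                    acc_z_d[d][z] += expected
                    acc_w_z[z][w] += expected
        # M-step: renormalise the expected counts into probabilities.
        p_z_d = [normalize(row) for row in acc_z_d]
        p_w_z = [normalize(row) for row in acc_w_z]
        log_liks.append(log_lik)
    return p_z_d, p_w_z, log_liks


# Hypothetical toy data: 4 "ads" over a 6-word vocabulary.
toy_counts = [
    [4, 3, 2, 0, 0, 0],
    [3, 5, 1, 0, 0, 0],
    [0, 0, 0, 2, 4, 3],
    [0, 0, 0, 5, 1, 2],
]
p_z_d, p_w_z, log_liks = plsa(toy_counts, n_topics=2)
for d, row in enumerate(p_z_d):
    print(d, [round(p, 3) for p in row])
```

Each row of `p_z_d` is a low-dimensional concept mixture for one ad and could, for example, be passed to a downstream classifier. EM only reaches a local optimum, so the fitted concepts depend on the random seed.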

Index Terms

TV ad video categorization with probabilistic latent concept learning
1. Information systems
  1. Information retrieval
    1. Document representation

Information & Contributors

Information

Published In

MIR '07: Proceedings of the international workshop on Workshop on multimedia information retrieval

September 2007

343 pages

ISBN: 9781595937780

DOI: 10.1145/1290082

General Chairs:
James Z. Wang, The Pennsylvania State University, USA
Nozha Boujemaa, INRIA Rocquencourt, France

Program Chairs:
Alberto Del Bimbo, University of Florence, Italy
Jia Li, The Pennsylvania State University, USA

Copyright © 2007 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Article

Conference

MM07: The 15th ACM International Conference on Multimedia 2007

September 24 - 29, 2007

Augsburg, Bavaria, Germany

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 18
Total Downloads: 565
Downloads (Last 12 months): 2
Downloads (Last 6 weeks): 0

Reflects downloads up to 28 Feb 2025

Citations

Cited By

Gonzalez-Toral S, Espinoza-Mejia M, Palacio-Baus K, Saquicela V (2019). A General Process for the Semantic Annotation and Enrichment of Electronic Program Guides. Knowledge Graphs and Semantic Web, pp. 72-86. Online publication date: 19-May-2019.
https://doi.org/10.1007/978-3-030-21395-4_6
Ortega-León C, Marín-Reyes P, Lorenzo-Navarro J, Castrillón-Santana M, Sánchez-Nielsen E (2019). Video Categorisation Mimicking Text Mining. Advances in Computational Intelligence, pp. 292-301. Online publication date: 16-May-2019.
https://doi.org/10.1007/978-3-030-20518-8_25
Vega F, Medina J, Saquicela V, Palacio-Baus K, Espinoza M (2017). Towards a multi-screen interactive ad delivery platform. 2017 XLIII Latin American Computer Conference (CLEI), pp. 1-10. Online publication date: Sep-2017.
https://doi.org/10.1109/CLEI.2017.8226400
Vega F, Medina J, Mendoza D, Saquicela V, Espinoza M (2017). A robust video identification framework using perceptual image hashing. 2017 XLIII Latin American Computer Conference (CLEI), pp. 1-10. Online publication date: Sep-2017.
https://doi.org/10.1109/CLEI.2017.8226396
Wang J, Xu M, Lu H, Burnett I (2016). ActiveAd. Neurocomputing 185:C, pp. 82-92. Online publication date: 12-Apr-2016.
https://dl.acm.org/doi/10.1016/j.neucom.2015.12.038
Saquicela V, Espinoza-Mejia M, Palacio K, Alban H (2014). Enriching Electronic Program Guides using semantic technologies and external resources. 2014 XL Latin American Computing Conference (CLEI), pp. 1-8. Online publication date: Sep-2014.
https://doi.org/10.1109/CLEI.2014.6965173
Merler M, Huang B, Xie L, Hua G, Natsev A (2012). Semantic Model Vectors for Complex Video Event Recognition. IEEE Transactions on Multimedia 14:1, pp. 88-101. Online publication date: 1-Feb-2012.
https://dl.acm.org/doi/10.1109/TMM.2011.2168948
Kae A, Kan K, Narayanan V, Yankov D (2011). Categorization of display ads using image and landing page features. Proceedings of the Third Workshop on Large Scale Data Mining: Theory and Applications, pp. 1-8. Online publication date: 21-Aug-2011.
https://dl.acm.org/doi/10.1145/2002945.2002946
Gamallo P, Bordag S (2011). Is singular value decomposition useful for word similarity extraction? Language Resources and Evaluation 45:2, pp. 95-119. Online publication date: 1-May-2011.
https://dl.acm.org/doi/10.1007/s10579-010-9129-5
Wang J, Duan L, Wang B, Chen S, Ouyang Y, Liu J, Lu H, Gao W (2009). Linking video ads with product or service information by web search. Proceedings of the 2009 IEEE international conference on Multimedia and Expo, pp. 274-277. Online publication date: 28-Jun-2009.
https://dl.acm.org/doi/10.5555/1698924.1698992