Article (DOI: 10.1145/1290082.1290113)

TV ad video categorization with probabilistic latent concept learning

Published: 24 September 2007

Abstract

In this paper we present a multi-modal approach to classifying TV ads by the products or services they advertise. A bag-of-words representation is used to discover latent visual and textual concepts related to ad categories through probabilistic latent semantic analysis (PLSA), and these multi-modal concepts represent the ad categories in the latent semantic space. Because the textual information is sparse, we draw on external resources (e.g., a brand list and an encyclopedia) to expand it. Finally, semi-supervised co-training fuses the visual and textual features for ad classification. Our experiments show promising results in terms of classification accuracy and scalability to new ad categories. The resulting ad classifiers can be applied to digest ads from TV streams, which helps TV viewers manage ads constructively. The digested ads can serve as video-based alerts for emerging products and services, which can improve the reach and focus of TV ads.
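
The PLSA step named in the abstract can be illustrated with a minimal sketch. It assumes a plain EM fit over a document-by-word count matrix, where each "document" would be an ad and each "word" a visual or textual token. The function name `plsa`, the toy counts, and the choice of two latent concepts are illustrative assumptions, not the authors' implementation or data.

```python
import random


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit PLSA by EM on a document-by-word count matrix (list of lists).

    Returns (p_z_d, p_w_z): per-document concept mixtures P(z|d) and
    per-concept word distributions P(w|z).
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalize(row):
        # Scale a non-negative row to sum to 1; fall back to uniform if empty.
        total = sum(row)
        if total > 0:
            return [v / total for v in row]
        return [1.0 / len(row)] * len(row)

    # Random positive initialization, normalized into distributions.
    p_z_d = [normalize([rng.random() + 1e-3 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() + 1e-3 for _ in range(n_words)])
             for _ in range(n_topics)]

    for _ in range(n_iter):
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: posterior P(z|d,w) is proportional to P(z|d) * P(w|z).
                post = normalize([p_z_d[d][z] * p_w_z[z][w]
                                  for z in range(n_topics)])
                # Accumulate expected counts for the M-step.
                for z in range(n_topics):
                    new_z_d[d][z] += n * post[z]
                    new_w_z[z][w] += n * post[z]
        # M-step: renormalize expected counts into distributions.
        p_z_d = [normalize(row) for row in new_z_d]
        p_w_z = [normalize(row) for row in new_w_z]
    return p_z_d, p_w_z


# Toy bag-of-words counts (hypothetical): 4 ads over a 4-token vocabulary.
toy_counts = [
    [4, 3, 0, 0],
    [5, 2, 0, 0],
    [0, 0, 3, 4],
    [0, 0, 2, 5],
]
doc_concepts, concept_words = plsa(toy_counts, n_topics=2)
for row in doc_concepts:
    print([round(v, 3) for v in row])
```

Each row of `doc_concepts` is a low-dimensional concept mixture for one ad. A classifier could be trained on such vectors in place of the raw counts, which is one plausible reading of representing ad categories in the latent semantic space.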

    Published In

    MIR '07: Proceedings of the international workshop on multimedia information retrieval
    September 2007
    343 pages
    ISBN: 9781595937780
    DOI: 10.1145/1290082

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. TV ads
    2. multimodal analysis
    3. semantics
    4. video categorization

    Qualifiers

    • Article

    Conference

    MM07: The 15th ACM International Conference on Multimedia 2007
    September 24 - 29, 2007
    Augsburg, Bavaria, Germany

    Cited By

    • (2019) A General Process for the Semantic Annotation and Enrichment of Electronic Program Guides. Knowledge Graphs and Semantic Web, pp. 72-86. DOI: 10.1007/978-3-030-21395-4_6. Online publication date: 19-May-2019
    • (2019) Video Categorisation Mimicking Text Mining. Advances in Computational Intelligence, pp. 292-301. DOI: 10.1007/978-3-030-20518-8_25. Online publication date: 16-May-2019
    • (2017) Towards a multi-screen interactive ad delivery platform. 2017 XLIII Latin American Computer Conference (CLEI), pp. 1-10. DOI: 10.1109/CLEI.2017.8226400. Online publication date: Sep-2017
    • (2017) A robust video identification framework using perceptual image hashing. 2017 XLIII Latin American Computer Conference (CLEI), pp. 1-10. DOI: 10.1109/CLEI.2017.8226396. Online publication date: Sep-2017
    • (2016) ActiveAd. Neurocomputing 185:C, pp. 82-92. DOI: 10.1016/j.neucom.2015.12.038. Online publication date: 12-Apr-2016
    • (2014) Enriching Electronic Program Guides using semantic technologies and external resources. 2014 XL Latin American Computing Conference (CLEI), pp. 1-8. DOI: 10.1109/CLEI.2014.6965173. Online publication date: Sep-2014
    • (2012) Semantic Model Vectors for Complex Video Event Recognition. IEEE Transactions on Multimedia 14:1, pp. 88-101. DOI: 10.1109/TMM.2011.2168948. Online publication date: 1-Feb-2012
    • (2011) Categorization of display ads using image and landing page features. Proceedings of the Third Workshop on Large Scale Data Mining: Theory and Applications, pp. 1-8. DOI: 10.1145/2002945.2002946. Online publication date: 21-Aug-2011
    • (2011) Is singular value decomposition useful for word similarity extraction? Language Resources and Evaluation 45:2, pp. 95-119. DOI: 10.1007/s10579-010-9129-5. Online publication date: 1-May-2011
    • (2009) Linking video ads with product or service information by web search. Proceedings of the 2009 IEEE international conference on Multimedia and Expo, pp. 274-277. DOI: 10.5555/1698924.1698992. Online publication date: 28-Jun-2009
