research-article

Large-Scale Semantic Concept Detection Based On Visual Contents

Authors:

Mohamed Hamroun,

Ikram Amous

MoMM2019: Proceedings of the 17th International Conference on Advances in Mobile Computing & Multimedia

Pages 165 - 174

https://doi.org/10.1145/3365921.3365925

Published: 22 February 2020

Abstract

Indexing video by concept is one of the most appropriate solutions to the video retrieval problem. It is based on an association between a concept and its corresponding visual, audio, or textual features. Establishing this association is not a trivial task: it requires knowledge about the concept and its context. In this paper, we investigate a new concept detection approach to improve the performance of content-based multimedia document retrieval systems. To achieve this goal, we tackle the problem from different angles and make four contributions at various stages of the indexing process. We first propose a new weakly supervised, semi-automatic method based on a genetic algorithm to extract and annotate the video shots for the training set. Subsequently, we develop a method to detect the basic concepts. We also address the issue of noise reduction when generating the visual dictionary (BoVS). The contributions are tested and evaluated on a large dataset (TRECVID 2015).
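For illustration only: the abstract does not say how the visual dictionary is built or denoised, so the following is a minimal sketch of a generic bag-of-visual-words pipeline, not the authors' method. It assumes toy 2-D descriptors, plain k-means with a deterministic farthest-first initialization, and a hypothetical pruning rule that drops visual words supported by fewer than `min_count` descriptors as a crude stand-in for noise reduction. All function names are hypothetical.

```python
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vec, centers):
    """Index of the center closest to vec."""
    return min(range(len(centers)), key=lambda i: sq_dist(vec, centers[i]))


def farthest_first_init(descriptors, k):
    """Deterministic initialization: start from the first descriptor, then
    repeatedly add the descriptor farthest from all chosen centers."""
    centers = [list(descriptors[0])]
    while len(centers) < k:
        far = max(descriptors, key=lambda d: min(sq_dist(d, c) for c in centers))
        centers.append(list(far))
    return centers


def kmeans(descriptors, k, iters=10):
    """Plain Lloyd iterations; returns k cluster centers (the visual words)."""
    centers = farthest_first_init(descriptors, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for d in descriptors:
            groups[nearest(d, centers)].append(d)
        for i, g in enumerate(groups):
            if g:  # keep the previous center when a cluster is empty
                centers[i] = [sum(col) / len(g) for col in zip(*g)]
    return centers


def prune_rare_words(descriptors, centers, min_count):
    """Assumed noise-reduction heuristic: drop words with too few members."""
    counts = Counter(nearest(d, centers) for d in descriptors)
    return [c for i, c in enumerate(centers) if counts[i] >= min_count]


def bovw_histogram(descriptors, centers):
    """Normalized histogram of visual-word occurrences for one shot."""
    hist = [0.0] * len(centers)
    for d in descriptors:
        hist[nearest(d, centers)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Toy data: two tight groups of descriptors plus one isolated outlier.
descriptors = [(0, 0), (0, 1), (1, 0), (1, 1),
               (10, 10), (10, 11), (11, 10), (11, 11),
               (50, 50)]
centers = kmeans(descriptors, 3)
vocabulary = prune_rare_words(descriptors, centers, min_count=2)
print(vocabulary)  # [[0.5, 0.5], [10.5, 10.5]]
print(bovw_histogram([(0, 1), (1, 0), (10, 11), (0, 0)], vocabulary))  # [0.75, 0.25]
```

In this toy run the outlier ends up as a one-member visual word and is pruned, so each shot is described by a histogram over the two remaining words. A real system would use high-dimensional local descriptors and a much larger vocabulary.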


Cited By

Hamroun M, Tamine K, Crespin B (2021). Multimodal Video Indexing (MVI): A New Method Based on Machine Learning and Semi-Automatic Annotation on Large Video Collections. International Journal of Image and Graphics, 22:02. Online publication date: 19-Jun-2021.
https://doi.org/10.1142/S021946782250022X

Index Terms

Information systems > Information retrieval > Search engine architectures and scalability

Published In

MoMM2019: Proceedings of the 17th International Conference on Advances in Mobile Computing & Multimedia

December 2019

266 pages

ISBN:9781450371780

DOI:10.1145/3365921

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

In-Cooperation

Johannes Kepler University, Linz, Austria
@WAS: International Organization of Information Integration and Web-based Applications and Services

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

MoMM2019

MoMM2019: The 17th International Conference on Advances in Mobile Computing & Multimedia

December 2 - 4, 2019

Munich, Germany
