Towards an ontology based framework for searching multimedia contents on the web

Shrivastav, Shikhar; Kumar, Sandeep; Kumar, Kuldeep

doi:10.1007/s11042-017-4350-5

Published: 18 January 2017

Volume 76, pages 18657–18686, (2017)

Multimedia Tools and Applications

Abstract

We live in a world with a huge number of consumers and producers of multimedia content. In this sea of information, finding the right content is like finding a needle in a haystack. Rich annotation of multimedia content when it is first uploaded to the Web, together with multiple ways of framing a search query, can help the user in this regard. In addition to annotating multimedia content from the user-provided description, the literature presents various approaches for annotating and indexing multimedia files based on their embedded contents. However, annotating multimedia files from multiple sources simultaneously to generate better annotations needs further exploration. We propose a framework that uses multiple sources of information, such as text, audio, and images. The framework generates annotations from the user-entered description, embedded audio, image analysis, and optical character recognition, and finally by gathering more information from the Web. It offers multiple ways to search for content, including search by image, audio, video, and face, and also provides an improved textual search. A system based on the proposed framework has been implemented and evaluated.
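
As a rough illustration of the multi-source idea described in the abstract, the following minimal Python sketch merges terms from several annotation sources into one record per file and runs a simple textual search over the result. It is not the authors' implementation: the function names (`extract_terms`, `merge_annotations`, `search`), the source labels, the stopword list, and the term-overlap ranking are all assumptions made for illustration, and the speech-recognition, OCR, and image-analysis stages are assumed to have already produced plain text.

```python
from collections import defaultdict

# Hypothetical stopword list; a real system would use proper text analysis.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "and", "to", "is"}


def extract_terms(text):
    """Lowercase, strip punctuation, and drop stopwords (stand-in for real analysis)."""
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return {w for w in words if w and w not in STOPWORDS}


def merge_annotations(sources):
    """Map each term to the sorted list of sources that produced it.

    `sources` maps an assumed source label (e.g. "description", "ocr",
    "speech") to the text obtained from that source.
    """
    merged = defaultdict(set)
    for source_name, text in sources.items():
        for term in extract_terms(text):
            merged[term].add(source_name)
    return {term: sorted(names) for term, names in merged.items()}


def search(index, query):
    """Rank file ids by how many query terms appear in their merged annotation."""
    terms = extract_terms(query)
    scores = {fid: len(terms & set(ann)) for fid, ann in index.items()}
    hits = (fid for fid, score in scores.items() if score > 0)
    return sorted(hits, key=lambda fid: (-scores[fid], fid))


# Example: one video annotated from three assumed sources.
annotation = merge_annotations({
    "description": "Live concert in Madrid",
    "ocr": "MADRID 2016",
    "speech": "welcome to the concert",
})
index = {"video-1": annotation}
matches = search(index, "Madrid concert")
```

Keeping the originating source next to each term is one possible design choice: it would let a search weight a term confirmed by several sources more heavily than one seen in a single source.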

Notes

http://zemanta.github.io/zemapi-java/com/zemanta/api/Zemanta.html
DBpedia, http://wiki.dbpedia.org/
Enrique Iglesias - Finally Found You, https://www.youtube.com/watch?v=f_EiqPp-vBM
http://www.ibm.com/watson/developercloud/alchemy-language/api/v1/
http://www.imdb.com/title/tt0898266

Acknowledgments

The authors are thankful to the editors and anonymous reviewers for their efforts in reviewing the manuscript. A patent has been filed based on this work. We are thankful to IIT Roorkee for providing a healthy research and academic environment.

Author information

Authors and Affiliations

Flipkart Bangalore, Bangalore, India
Shikhar Shrivastav
Department of Computer Science and Engineering, Indian Institute of Technology Roorkee, Roorkee, Uttarakhand, 247667, India
Sandeep Kumar
Computer Science and Information Systems, Birla Institute of Technology and Science Pilani, Pilani, India
Kuldeep Kumar

Corresponding author

Correspondence to Sandeep Kumar.

About this article

Cite this article

Shrivastav, S., Kumar, S. & Kumar, K. Towards an ontology based framework for searching multimedia contents on the web. Multimed Tools Appl 76, 18657–18686 (2017). https://doi.org/10.1007/s11042-017-4350-5

Received: 29 September 2016
Revised: 21 December 2016
Accepted: 03 January 2017
Published: 18 January 2017
Issue Date: September 2017
DOI: https://doi.org/10.1007/s11042-017-4350-5
