Towards an ontology based framework for searching multimedia contents on the web

Multimedia Tools and Applications

Abstract

We live in a world with a huge number of consumers and producers of multimedia content. In this sea of information, finding the right content is like finding a needle in a haystack. Rich annotation of multimedia content at the time of its initial upload to the Web, together with a range of ways to frame a search query, can help the user in this regard. Beyond annotation based on the user-provided description, the literature presents various approaches for annotating and indexing multimedia files based on their embedded contents. However, annotating multimedia files from multiple sources simultaneously to generate better annotations needs further exploration. We propose a framework that uses multiple sources of information, such as text, audio, and images. The framework generates annotations from the user-entered description, embedded audio, image analysis, and optical character recognition, and finally by gathering further information from the Web. It offers multiple search options, including search by image, audio, video, and face, as well as an improved textual search. A system based on the proposed framework has been implemented and evaluated.

Acknowledgments

The authors thank the editors and anonymous reviewers for their efforts in reviewing the manuscript. A patent has been filed based on this work. We are thankful to IIT Roorkee for providing a healthy research and academic environment.

Author information

Corresponding author

Correspondence to Sandeep Kumar.

About this article

Cite this article

Shrivastav, S., Kumar, S. & Kumar, K. Towards an ontology based framework for searching multimedia contents on the web. Multimed Tools Appl 76, 18657–18686 (2017). https://doi.org/10.1007/s11042-017-4350-5

  • DOI: https://doi.org/10.1007/s11042-017-4350-5
