Boosting Retrieval of Digital Spoken Content

Pereira Nunes, Bernardo; Mera, Alexander; Casanova, Marco A.; Kawase, Ricardo

doi:10.1007/978-3-642-37343-5_16

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7828)

Included in the following conference series:

International Conference on Knowledge-Based and Intelligent Information and Engineering Systems

Abstract

Every day, the Internet expands as millions of new multimedia objects are uploaded in the form of audio, video and images. While traditional text-based content is indexed by search engines, this indexing cannot be applied to audio and video objects, leaving a plethora of multimedia content inaccessible to the majority of online users. To address this issue, we introduce a technique for the automatic generation of semantically enhanced descriptions of multimedia content. The objective is to facilitate indexing and retrieval of these objects with the help of traditional search engines. Essentially, the technique automatically generates static Web pages that describe the content of the digital audio and video objects. These descriptions are organized so as to make it easy to locate the corresponding audio and video segments. The technique employs a combination of Web services and also provides description translation and semantic enhancement. A thorough analysis of click data, comparing accesses to the digital content before and after automatic description generation, suggests a significant increase in the number of items retrieved. The benefit is not limited to visibility: by supporting multilingual access, the technique also lowers language barriers.
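
As an illustration only, the idea of a static description page whose entries point back to segments of a recording can be sketched as follows. This is not the authors' implementation: the Segment structure, the function names, and the example URL are hypothetical, and the use of a temporal fragment (#t=<seconds>) to address a moment in the media file is an assumption made for this sketch.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Segment:
    """A transcribed stretch of a recording (hypothetical structure)."""
    start_seconds: int
    text: str


def format_timestamp(seconds: int) -> str:
    """Render a second offset as H:MM:SS."""
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


def render_description_page(title: str, media_url: str, segments: list[Segment]) -> str:
    """Build a static HTML page whose text a crawler can index.

    Each segment links back to the media object with a temporal
    fragment (#t=<seconds>), an assumption of this sketch, so that a
    search hit on the text leads to the matching moment.
    """
    items = []
    for seg in segments:
        link = f"{media_url}#t={seg.start_seconds}"
        items.append(
            f'<li><a href="{escape(link, quote=True)}">'
            f"{format_timestamp(seg.start_seconds)}</a> {escape(seg.text)}</li>"
        )
    body = "\n".join(items)
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n'
        f'<head><meta charset="utf-8"><title>{escape(title)}</title></head>\n'
        f"<body>\n<h1>{escape(title)}</h1>\n<ol>\n{body}\n</ol>\n</body>\n</html>\n"
    )
```

Calling render_description_page("Lecture 1", "https://example.org/lecture1.mp4", [Segment(0, "Welcome"), Segment(3725, "Q & A")]) returns a page in which every list entry is plain, escaped text linked to its offset in the recording. Transcription, translation and semantic annotation, which the abstract attributes to Web services, are deliberately left out of the sketch.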

Author information

Authors and Affiliations

Department of Informatics, PUC-Rio, Rio de Janeiro, RJ, Brazil
Bernardo Pereira Nunes, Alexander Mera & Marco A. Casanova
L3S Research Center, Leibniz University Hannover, Germany
Bernardo Pereira Nunes & Ricardo Kawase

Editor information

Editors and Affiliations

Department of Computational Science and Artificial Intelligence, University of the Basque Country, Manuel Lardizabal 1, 20018, San Sebastian, Spain
Manuel Graña
Vicomtech-IK4, Paseo Mijeletegui, 20009, San Sebastian, Spain
Carlos Toro
KES International, P.O. Box 2115, BN43 9AF, Shoreham-by-sea, UK
Robert J. Howlett
School of Engineering, University of Canberra, Mawson Lakes Campus, ACT 2601, Mawson Lakes, SA, Australia
Lakhmi C. Jain

Cite this paper

Pereira Nunes, B., Mera, A., Casanova, M.A., Kawase, R. (2013). Boosting Retrieval of Digital Spoken Content. In: Graña, M., Toro, C., Howlett, R.J., Jain, L.C. (eds) Knowledge Engineering, Machine Learning and Lattice Computing with Applications. KES 2012. Lecture Notes in Computer Science, vol 7828. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37343-5_16

DOI: https://doi.org/10.1007/978-3-642-37343-5_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37342-8
Online ISBN: 978-3-642-37343-5
eBook Packages: Computer Science, Computer Science (R0)