
The Importance of Visual Context Clues in Multimedia Translation

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 6941)

Abstract

As video-sharing websites such as YouTube proliferate, the ability to rapidly translate video clips into multiple languages has become essential to extending their global reach and impact. Moreover, the ability to provide closed captioning in a variety of languages is paramount to reaching a wider variety of viewers. We investigate the importance of visual context clues by comparing translations made from multimedia clips (which allow translators to make use of visual context clues) with translations made from the corresponding written transcripts alone (which do not). Additionally, we contrast translations produced by crowdsourced workers with those produced by professional translators in terms of cost and quality. Finally, we evaluate several genres of multimedia to examine the effects of visual context clues on each and present the results through heat maps.
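The quality comparison described above can be illustrated with a simple token-overlap score. The Python fragment below is a minimal, hypothetical sketch: it assumes each candidate translation is scored against a professional reference translation with a unigram F-measure, a deliberately simplified stand-in for the kind of automatic evaluation the paper performs (the abstract does not name a metric, and every sentence in the example is invented).

    from collections import Counter

    def f_measure(candidate: str, reference: str) -> float:
        # Unigram precision/recall F1 between candidate and reference tokens.
        cand = Counter(candidate.lower().split())
        ref = Counter(reference.lower().split())
        overlap = sum((cand & ref).values())  # clipped unigram matches
        if overlap == 0:
            return 0.0
        precision = overlap / sum(cand.values())
        recall = overlap / sum(ref.values())
        return 2 * precision * recall / (precision + recall)

    # Toy comparison: a translation made while watching the clip (visual
    # context available) versus one made from the written transcript alone,
    # both scored against a professional reference translation.
    reference = "she points at the red door before leaving the room"
    with_video = "she points to the red door before she leaves the room"
    transcript_only = "she indicates the exit before leaving"

    print("with visual context:", round(f_measure(with_video, reference), 3))
    print("transcript only:    ", round(f_measure(transcript_only, reference), 3))

In practice, automatic metrics such as METEOR build on this kind of overlap count by adding stemming and synonym matching before computing the final score.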




Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Harris, C.G., Xu, T. (2011). The Importance of Visual Context Clues in Multimedia Translation. In: Forner, P., Gonzalo, J., Kekäläinen, J., Lalmas, M., de Rijke, M. (eds) Multilingual and Multimodal Information Access Evaluation. CLEF 2011. Lecture Notes in Computer Science, vol 6941. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23708-9_13


  • DOI: https://doi.org/10.1007/978-3-642-23708-9_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23707-2

  • Online ISBN: 978-3-642-23708-9

  • eBook Packages: Computer Science (R0)
