Abstract
Parallel Text Alignment (PTA) is the problem of automatically aligning content in multiple text documents originating or derived from the same source. For digital library applications, a solution improves multimedia data access in ways that range from facilitating the analysis of multiple English-language translations of classical texts to enabling the on-demand and random comparison of multiple transcriptions derived from a given audio stream or associated with a given stream of video, audio, or images. In this paper we give an efficient algorithm for achieving such an alignment and demonstrate its use with two applications. This result is an application of the new framework of Cross-Modal Information Retrieval recently developed at Dartmouth.
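To make the problem statement concrete, the following is a minimal sketch of aligning two versions of the same text segment by segment. It is not the algorithm of this paper, which the abstract does not describe; it is a generic monotone dynamic-programming alignment. The word-overlap similarity, the function names `jaccard` and `align`, and the `skip_cost` parameter are all assumptions introduced for illustration.

```python
from typing import List, Tuple


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity between two segments, from 0.0 to 1.0."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def align(left: List[str], right: List[str],
          skip_cost: float = 0.4) -> List[Tuple[int, int]]:
    """Illustrative monotone alignment (hypothetical, not the paper's method).

    Pairing left[i] with right[j] costs 1 - similarity; leaving a segment
    unpaired costs skip_cost. Returns the paired indices in order.
    """
    n, m = len(left), len(right)
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    back = [[""] * (m + 1) for _ in range(n + 1)]
    # Borders: every segment seen so far on one side is skipped.
    for i in range(1, n + 1):
        cost[i][0], back[i][0] = i * skip_cost, "up"
    for j in range(1, m + 1):
        cost[0][j], back[0][j] = j * skip_cost, "left"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = cost[i - 1][j - 1] + 1.0 - jaccard(left[i - 1], right[j - 1])
            options = [
                (pair, "pair"),                      # align the two segments
                (cost[i - 1][j] + skip_cost, "up"),  # skip a left segment
                (cost[i][j - 1] + skip_cost, "left"),  # skip a right segment
            ]
            cost[i][j], back[i][j] = min(options)
    # Walk the back-pointers from the end to recover the paired segments.
    pairs: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 or j > 0:
        move = back[i][j]
        if move == "pair":
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif move == "up":
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs
```

With this cost model, two segments are paired only when doing so is cheaper than skipping both, so a segment that appears in only one version is left unaligned rather than forced onto a neighbour. A real system would likely need a richer similarity measure than raw word overlap, especially when comparing translations with different vocabulary.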
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Owen, C.B., Ford, J., Makedon, F., Steinberg, T. (1998). Parallel Text Alignment. In: Nikolaou, C., Stephanidis, C. (eds) Research and Advanced Technology for Digital Libraries. ECDL 1998. Lecture Notes in Computer Science, vol 1513. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49653-X_15
DOI: https://doi.org/10.1007/3-540-49653-X_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65101-7
Online ISBN: 978-3-540-49653-3
eBook Packages: Springer Book Archive