The use of automatic alignment on structured multilingual documents

  • Part III: EP'98
  • Conference paper
Electronic Publishing, Artistic Imaging, and Digital Typography (RIDT 1998)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1375)

Abstract

Originally seen as a problem in the translation of multilingual texts, the alignment of corresponding entities from two versions of a document has become a scientific research topic. In this paper, natural language processing methods are reviewed and an alignment algorithm is presented that takes into account both the linguistic features and the structural data present in modern multilingual documents.

Editor information

Roger D. Hersch, Jacques André, Heather Brown

Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ballim, A., Coray, G., Linden, A., Vanoirbeek, C. (1998). The use of automatic alignment on structured multilingual documents. In: Hersch, R.D., André, J., Brown, H. (eds) Electronic Publishing, Artistic Imaging, and Digital Typography. RIDT 1998. Lecture Notes in Computer Science, vol 1375. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0053292

  • DOI: https://doi.org/10.1007/BFb0053292

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-64298-5

  • Online ISBN: 978-3-540-69718-3

  • eBook Packages: Springer Book Archive
