Improving Translation Quality by Manipulating Sentence Length

  • Conference paper
Machine Translation and the Information Soup (AMTA 1998)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1529)

Abstract

Translation systems tend to have more trouble with long sentences than with short ones for a variety of reasons. When the source and target languages differ rather markedly, as do Japanese and English, this problem is reflected in lower quality output. To improve readability, we experimented with automatically splitting long sentences into shorter ones. This paper outlines the problem, describes the sentence splitting procedure and rules, and provides an evaluation of the results.

Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gerber, L., Hovy, E. (1998). Improving Translation Quality by Manipulating Sentence Length. In: Farwell, D., Gerber, L., Hovy, E. (eds) Machine Translation and the Information Soup. AMTA 1998. Lecture Notes in Computer Science, vol 1529. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49478-2_40

  • DOI: https://doi.org/10.1007/3-540-49478-2_40

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-65259-5

  • Online ISBN: 978-3-540-49478-2

  • eBook Packages: Springer Book Archive
