Some key issues in designing MT systems

Machine Translation

Abstract

Developing a machine translation system (MTS) requires many tradeoffs among the available formalisms and control mechanisms. These tradeoffs involve the generative power of the grammar, the formal linguistic power and efficiency of the parser, manipulation flexibility for knowledge bases, knowledge acquisition, the degree of expressiveness and uniformity of the system, integration of the knowledge sources, and so forth. In this paper we discuss some basic decisions which must be made in constructing a large system. Our experience with an operational English-Chinese MTS, ArchTran, is presented to illustrate decision making related to procedural tradeoffs.




Additional information

We would like to thank the Behavior Tech Computer Corp. for its full financial support of the research described in this report, and the members of the MT research team at the corporation's R&D center for their technical support. Several referees for this journal also offered helpful suggestions.


About this article

Cite this article

Su, KY., Chang, JS. Some key issues in designing MT systems. Mach Translat 5, 265–300 (1990). https://doi.org/10.1007/BF00376644


  • DOI: https://doi.org/10.1007/BF00376644
