Abstract
Developing a machine translation system (MTS) requires many tradeoffs among the available formalisms and control mechanisms. These tradeoffs involve the generative power of the grammar, the formal linguistic power and efficiency of the parser, the flexibility with which knowledge bases can be manipulated, knowledge acquisition, the expressiveness and uniformity of the system, the integration of knowledge sources, and so forth. In this paper we discuss some basic decisions that must be made in constructing a large system. Our experience with an operational English-Chinese MTS, ArchTran, is presented to illustrate decision making related to procedural tradeoffs.
Additional information
We would like to thank the Behavior Tech Computer Corp. for its full financial support of the research described in this report, and the members of the MT research team at the corporation's R&D center for their technical support. Several referees for this journal also offered helpful suggestions.
Cite this article
Su, KY., Chang, JS. Some key issues in designing MT systems. Mach Translat 5, 265–300 (1990). https://doi.org/10.1007/BF00376644