Efficient Part-of-Speech Tagging with a Min-Max Modular Neural-Network Model

Applied Intelligence

Abstract

This paper presents a part-of-speech tagging method based on a min-max modular neural-network model. The method has three main steps. First, a large-scale tagging problem is decomposed into a number of smaller and simpler subproblems according to the class relations in a given training corpus. Second, all of the subproblems are learned in parallel by smaller network modules. Finally, following two simple module combination laws, all of the trained network modules are integrated into a modular parallel tagging system that produces solutions to the original tagging problem. The proposed method has several advantages over existing tagging systems based on multilayer perceptrons: (1) training times can be drastically reduced and the desired learning accuracy can be easily achieved; (2) the method can scale up to larger tagging problems; and (3) the tagging system responds quickly and lends itself to hardware implementation. To demonstrate the effectiveness of the proposed method, we perform simulations on two corpora in different languages, a Thai corpus and a Chinese corpus, which have 29,028 and 45,595 ambiguous words, respectively. We also compare our method with several existing tagging models, including hidden Markov models, multilayer perceptrons, and neuro-taggers. The results show that both the learning accuracy and the generalization performance of the proposed tagging model are better than those of statistical models and multilayer perceptrons, and are comparable to those of the most successful tagging models.
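The decompose, train, and combine steps can be illustrated on a toy two-class problem. The sketch below is not the paper's tagger. It assumes that the two combination laws are a minimization over modules that share the same positive subset and a maximization over the resulting groups, as the model's name suggests. Each module here is a hand-built linear threshold unit standing in for a trained network module, and the names `train_module`, `build_min_max`, and `xor_classifier` are hypothetical.

```python
# Toy sketch of min-max modular combination on the XOR problem.
# Assumption: modules sharing a positive subset are combined with MIN,
# and the resulting groups are combined with MAX.

def train_module(pos, neg):
    """Build one module for a single (positive, negative) point pair.

    Stand-in for a small trained network: a linear threshold unit whose
    boundary is the perpendicular bisector of the two points.
    """
    w = [p - n for p, n in zip(pos, neg)]
    mid = [(p + n) / 2 for p, n in zip(pos, neg)]

    def module(x):
        score = sum(wi * (xi - mi) for wi, xi, mi in zip(w, x, mid))
        return 1.0 if score > 0 else 0.0

    return module


def build_min_max(pos_subsets, neg_subsets):
    """Train one module per (positive subset, negative subset) pair."""
    grid = [[train_module(p, n) for n in neg_subsets] for p in pos_subsets]

    def classify(x):
        # MIN across each row (same positive subset), then MAX over rows.
        return max(min(m(x) for m in row) for row in grid)

    return classify


# XOR: each subset holds a single point, so every subproblem is
# linearly separable even though the full problem is not.
positives = [(0, 1), (1, 0)]
negatives = [(0, 0), (1, 1)]
xor_classifier = build_min_max(positives, negatives)

for point in positives + negatives:
    print(point, xor_classifier(point))
```

Because every module sees only one positive and one negative subset, the modules are independent of one another and can be trained in parallel; only the cheap MIN and MAX units tie them together at prediction time.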


Author information

Correspondence to Bao-Liang Lu.

About this article

Cite this article

Lu, BL., Ma, Q., Ichikawa, M. et al. Efficient Part-of-Speech Tagging with a Min-Max Modular Neural-Network Model. Applied Intelligence 19, 65–81 (2003). https://doi.org/10.1023/A:1023868723792

  • DOI: https://doi.org/10.1023/A:1023868723792
