Abstract
This paper describes the Alibaba NLP team’s system for the NLPCC 2018 shared task on Chinese Grammatical Error Correction (GEC). Learners of Chinese as a Second Language (CSL) can use the system to correct grammatical errors in texts they have written. We propose a method that combines statistical and neural models for the GEC task. The method consists of two modules: a correction module and a combination module. In the correction module, two statistical models and one neural model generate correction candidates for each input sentence. The two statistical models are a rule-based model and a statistical machine translation (SMT)-based model; the neural model is a neural machine translation (NMT)-based model. The combination module works hierarchically. At the lower level, we train several models with different configurations and combine them. At the higher level, we combine the two statistical models and the neural model. Our system ranked second on the official leaderboard released by the task organizers.
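The two-level combination can be illustrated with a minimal sketch. It is not the system's implementation: the abstract does not say how candidates are combined, so the sketch assumes that each level simply keeps the candidate with the highest score under some scoring function, and that the lower level operates within one model type. The names `correct`, `combine_low_level`, `combine_high_level`, `toy_score`, and the toy English "models" are all hypothetical and chosen only for readability.

```python
from typing import Callable, Dict, List

Model = Callable[[str], str]     # maps an input sentence to one correction candidate
Scorer = Callable[[str], float]  # higher score = more fluent (assumed, not from the paper)


def combine_low_level(candidates: List[str], score: Scorer) -> str:
    """Lower level: keep one candidate among several configurations of one model type."""
    return max(candidates, key=score)


def combine_high_level(outputs: Dict[str, str], score: Scorer) -> str:
    """Higher level: keep one candidate among the rule-based, SMT and NMT outputs."""
    return max(outputs.values(), key=score)


def correct(sentence: str, families: Dict[str, List[Model]], score: Scorer) -> str:
    """Run every model, combine per model type, then combine across types."""
    per_family = {
        name: combine_low_level([model(sentence) for model in models], score)
        for name, models in families.items()
    }
    return combine_high_level(per_family, score)


# Toy stand-ins for trained models (hypothetical; real models would be learned).
FAMILIES: Dict[str, List[Model]] = {
    "rule": [lambda s: s.replace("a apple", "an apple")],
    "smt": [lambda s: s.replace("I has", "I have"), lambda s: s],
    "nmt": [lambda s: s.replace("I has", "I have").replace("a apple", "an apple")],
}

# Toy scorer: penalise each known-bad pattern still present in the sentence.
BAD_PATTERNS = ["I has", "a apple"]


def toy_score(sentence: str) -> float:
    return -float(sum(pattern in sentence for pattern in BAD_PATTERNS))


if __name__ == "__main__":
    print(correct("I has a apple", FAMILIES, toy_score))
```

In practice the scorer would be replaced by whatever selection or merging criterion the system uses; the sketch only shows how candidates flow through the two levels.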
J. Zhou—Work done during an internship at Alibaba Group
J. Zhou and C. Li—Equal Contribution.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhou, J., Li, C., Liu, H., Bao, Z., Xu, G., Li, L. (2018). Chinese Grammatical Error Correction Using Statistical and Neural Models. In: Zhang, M., Ng, V., Zhao, D., Li, S., Zan, H. (eds) Natural Language Processing and Chinese Computing. NLPCC 2018. Lecture Notes in Computer Science, vol 11109. Springer, Cham. https://doi.org/10.1007/978-3-319-99501-4_10
DOI: https://doi.org/10.1007/978-3-319-99501-4_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99500-7
Online ISBN: 978-3-319-99501-4
eBook Packages: Computer Science (R0)