DOI: 10.1145/3488933.3489034
research-article

Hybrid Chinese Grammar Error Checking Model Based on Transformer

Published: 25 February 2022

ABSTRACT

Chinese grammar error correction (CGEC) is a basic application of natural language processing that detects and corrects various grammatical errors in text. In this paper, we propose a hybrid Chinese grammar error checking model based on the Transformer. First, the model introduces the BERT pre-trained model to process text sequences at the Chinese character level. Next, we integrate pointwise mutual information (PMI) so that the model can handle semantic collocation information between Chinese characters. We benchmark our model against a baseline model on the CGED_2018 dataset and the Corrupt dataset. The proposed method outperforms the baseline model and obtains state-of-the-art results on the CGED_2018 dataset.
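
The abstract does not state how PMI is computed or how it is fed into the model, so the following is only a minimal sketch of the standard definition, PMI(a, b) = log2(p(a, b) / (p(a) p(b))), applied to adjacent Chinese characters. The function name `char_bigram_pmi` and the toy corpus are hypothetical and chosen for illustration; they are not taken from the paper.

```python
import math
from collections import Counter


def char_bigram_pmi(sentences):
    """Return {(a, b): PMI in bits} for adjacent character pairs.

    Assumption: probabilities are plain relative frequencies estimated
    from `sentences`, with no smoothing. This is the textbook PMI
    definition, not necessarily the variant used by the authors.
    """
    unigrams = Counter()
    bigrams = Counter()
    for s in sentences:
        unigrams.update(s)             # count single characters
        bigrams.update(zip(s, s[1:]))  # count adjacent character pairs
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    pmi = {}
    for (a, b), count in bigrams.items():
        p_ab = count / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        pmi[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return pmi


# Hypothetical toy corpus, for illustration only.
corpus = ["我喜欢学习", "我喜欢跑步", "他学习中文"]
scores = char_bigram_pmi(corpus)
```

In this toy corpus the pair 喜欢 (which forms a word) scores higher than 欢学 (which merely spans a word boundary), which is the kind of collocation signal the abstract refers to. Unsmoothed PMI is known to overrate rare pairs, so a practical system would likely need frequency thresholds or smoothing.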


          • Published in

            AIPR '21: Proceedings of the 2021 4th International Conference on Artificial Intelligence and Pattern Recognition
            September 2021
            715 pages
            ISBN:9781450384087
            DOI:10.1145/3488933

            Copyright © 2021 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Qualifiers

            • research-article
            • Research
            • Refereed limited