Hybrid Chinese Grammar Error Checking Model Based on Transformer

Authors:
Nawei Zhong, School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, China
Xiaoge Li, School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, China
Long Qin, School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, China

AIPR '21: Proceedings of the 2021 4th International Conference on Artificial Intelligence and Pattern Recognition, September 2021, Pages 574–579, https://doi.org/10.1145/3488933.3489034

Published: 25 February 2022

ABSTRACT

Chinese grammar error correction (CGEC) is a basic application of natural language processing. It detects and corrects various grammatical errors in text. In this paper, we propose a hybrid Chinese grammar error checking model based on the Transformer. First, the model introduces the BERT pre-trained model to process text sequences at the Chinese character level. Next, we integrate pointwise mutual information (PMI) so that the model can handle the semantic collocation information between Chinese characters. We compare our model against a baseline model on the CGED_2018 dataset and the Corrupt dataset. The proposed method outperforms the baseline model and obtains state-of-the-art results on the CGED_2018 dataset.
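As a rough illustration of the PMI signal mentioned in the abstract, the sketch below computes the standard pointwise mutual information, PMI(x, y) = log2(p(x, y) / (p(x) p(y))), for adjacent character pairs in a toy corpus. The abstract does not say how PMI is estimated or how it is fed into the Transformer, so the function name `char_pmi`, the use of adjacent-pair counts, and the example sentences are all assumptions made for illustration only.

```python
import math
from collections import Counter


def char_pmi(corpus):
    """Return {(a, b): PMI} for adjacent character pairs (hypothetical helper).

    Probabilities are plain relative frequencies over the given sentences;
    this is an illustrative estimate, not the paper's procedure.
    """
    unigrams = Counter()
    bigrams = Counter()
    for sentence in corpus:
        unigrams.update(sentence)                    # single-character counts
        bigrams.update(zip(sentence, sentence[1:]))  # adjacent-pair counts

    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    pmi = {}
    for (a, b), count in bigrams.items():
        p_ab = count / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        pmi[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return pmi


if __name__ == "__main__":
    # Toy sentences, invented for this example.
    scores = char_pmi(["我喜欢语法", "我喜欢中文", "他喜欢中文"])
    for pair, value in sorted(scores.items(), key=lambda kv: -kv[1]):
        print("".join(pair), round(value, 3))
```

Pairs that co-occur more often than their individual frequencies would predict receive a higher score, which is the kind of collocation information between characters that the abstract refers to.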

Index Terms

Hybrid Chinese Grammar Error Checking Model Based on Transformer
1. Applied computing
  1. Arts and humanities
    1. Language translation
2. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
  2. Machine learning
    1. Learning paradigms
    2. Learning settings

Index terms have been assigned to the content through auto-classification.

Published in

AIPR '21: Proceedings of the 2021 4th International Conference on Artificial Intelligence and Pattern Recognition
September 2021
715 pages
ISBN:9781450384087
DOI:10.1145/3488933

Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 25 February 2022
Author Tags
Bert
Chinese Correct Grammatical Errors
PMI
Transformer
Qualifiers
- research-article
- Research
- Refereed limited