DOI: 10.1145/2661829.2661942
research-article

Correct Me If I'm Wrong: Fixing Grammatical Errors by Preposition Ranking

Published: 03 November 2014

Abstract

The detection and correction of grammatical errors remain very hard problems for modern error-correction systems. For example, the top-performing systems at the CoNLL-2013 preposition correction challenge achieved an F1 score of only 17%. In this paper, we propose and extensively evaluate a series of approaches for correcting prepositions that analyze a large body of high-quality textual content to capture language usage. Leveraging n-gram statistics, association measures, and machine learning techniques, our system learns which words or phrases govern the usage of a specific preposition. Our approach relies heavily on n-gram statistics generated from very large textual corpora. In particular, one of our key features is the use of n-gram association measures (e.g., Pointwise Mutual Information) between words and prepositions to generate better aggregated preposition rankings for the individual n-grams. We evaluate the effectiveness of our approach using cross-validation with different feature combinations and on two test collections created from a set of English language exams and StackExchange forums. We also compare against state-of-the-art supervised methods. Experimental results on the CoNLL-2013 test collection show that our approach to preposition correction achieves an F1 score of about 30%, an absolute improvement of 13 percentage points over the best-performing approach at that challenge.
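To make the idea of ranking prepositions by an association measure concrete, here is a minimal sketch in Python. It is not the authors' implementation: the co-occurrence counts are invented toy values standing in for corpus n-gram statistics, and the names `PAIR_COUNTS`, `pmi`, and `rank_prepositions` are hypothetical. It only shows how Pointwise Mutual Information between a context word and each candidate preposition could produce a ranking.

```python
import math
from collections import Counter

# Hypothetical toy counts (assumption): (context word, preposition) -> count.
# A real system would derive these from very large n-gram corpora.
PAIR_COUNTS = Counter({
    ("depend", "on"): 90,
    ("depend", "in"): 5,
    ("depend", "at"): 5,
    ("arrive", "at"): 60,
    ("arrive", "in"): 35,
    ("arrive", "on"): 5,
})


def pmi(word, prep, pair_counts):
    """Pointwise mutual information: log2( p(word, prep) / (p(word) * p(prep)) )."""
    total = sum(pair_counts.values())
    joint = pair_counts[(word, prep)]
    if joint == 0:
        # Unseen pairs get the lowest possible score.
        return float("-inf")
    word_total = sum(c for (w, _), c in pair_counts.items() if w == word)
    prep_total = sum(c for (_, p), c in pair_counts.items() if p == prep)
    # Probabilities are counts / total, so the ratio simplifies to this form.
    return math.log2(joint * total / (word_total * prep_total))


def rank_prepositions(word, candidates, pair_counts):
    """Order candidate prepositions from most to least associated with `word`."""
    return sorted(candidates, key=lambda p: pmi(word, p, pair_counts), reverse=True)


if __name__ == "__main__":
    for verb in ("depend", "arrive"):
        print(verb, rank_prepositions(verb, ["in", "on", "at"], PAIR_COUNTS))
```

With these toy counts, a writer's "depend in" would be flagged because "on" ranks above "in" for the context word "depend". The abstract describes aggregating such rankings over several n-grams and combining them with other features in a learned model; that step is omitted here.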


Cited By

  • (2024) Development of an automatic grammar checker for Yorùbá word processing using Government and Binding Theory. Expert Systems with Applications: An International Journal, 236:C. DOI: 10.1016/j.eswa.2023.121351. Online publication date: 1-Feb-2024
  • (2019) WordPrep: Word-based Preposition Prediction Tool. 2019 IEEE International Conference on Big Data (Big Data), pages 2169-2176. DOI: 10.1109/BigData47090.2019.9005608. Online publication date: Dec-2019

Published In

CIKM '14: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management
November 2014
2152 pages
ISBN: 9781450325981
DOI: 10.1145/2661829
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. n-gram statistics
  2. pointwise mutual information
  3. preposition correction
  4. supervised learning

Qualifiers

  • Research-article

Conference

CIKM '14
Acceptance Rates

CIKM '14 Paper Acceptance Rate: 175 of 838 submissions, 21%
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
