A Fast Algorithm for Words Reordering Based on Language Model

Athanaselis, Theologos; Bakamidis, Stelios; Dologlou, Ioannis

doi:10.1007/11840930_98

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4132)

Included in the following conference series:

International Conference on Artificial Neural Networks

Abstract

What appears to hold in all languages is that words cannot be randomly ordered in sentences; they must be arranged in certain ways, both globally and locally. Scrambling the words of a sentence renders it meaningless. Although manually collected grammatical rules can boost the performance of a grammar checker in diagnosing word order errors, repairing those errors remains very difficult. This work proposes a method for repairing word order errors in English sentences by reordering the words of a sentence and choosing the version that maximizes the number of trigram hits according to a language model. The novelty of the method lies in a permutation filtering approach that reduces the search space of candidate reorderings. The filter is based on bigram probabilities, and the search space is further reduced by applying a threshold to those probabilities. The experimental results show that more than 95% of the test sentences can be repaired using this technique. The comparative advantage of the method is that it is not restricted to a specific set of words and avoids the laborious and costly process of collecting word order errors to create error patterns. Unlike most approaches, the proposed method is applicable to any language, since a language model can be computed for any language. A parser or tagger is not necessary.
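
The procedure outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy corpus, the function names (`train`, `bigram_prob`, `reorder`), the unsmoothed maximum-likelihood bigram estimate, and the choice to prune partial permutations as soon as one adjacent bigram falls at or below the threshold are all assumptions made for illustration. Candidate orderings that survive the bigram filter are scored by the number of their trigrams found in the model, and the highest-scoring one is returned.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def train(corpus):
    """Build unigram counts, bigram counts, and the set of seen trigrams."""
    unigrams, bigrams, trigrams = Counter(), Counter(), set()
    for sentence in corpus:
        tokens = sentence.split()
        unigrams.update(tokens)
        bigrams.update(ngrams(tokens, 2))
        trigrams.update(ngrams(tokens, 3))
    return unigrams, bigrams, trigrams


def bigram_prob(prev, word, unigrams, bigrams):
    """Unsmoothed estimate of P(word | prev); 0.0 if prev was never seen."""
    if unigrams[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / unigrams[prev]


def reorder(words, model, threshold=0.0):
    """Return (best_ordering, trigram_hits), or (None, -1) if every
    permutation is rejected by the bigram filter."""
    unigrams, bigrams, trigrams = model
    best, best_hits = None, -1

    def extend(prefix, remaining):
        nonlocal best, best_hits
        if not remaining:
            # Complete permutation: count trigrams present in the model.
            hits = sum(t in trigrams for t in ngrams(prefix, 3))
            if hits > best_hits:
                best, best_hits = list(prefix), hits
            return
        for i, word in enumerate(remaining):
            # Filter: drop this branch if the adjacent bigram is too unlikely.
            if prefix and bigram_prob(prefix[-1], word, unigrams, bigrams) <= threshold:
                continue
            extend(prefix + [word], remaining[:i] + remaining[i + 1:])

    extend([], list(words))
    return best, best_hits


# Hypothetical toy corpus standing in for a real language model.
CORPUS = [
    "the dog chased the cat",
    "the cat sat on the mat",
    "a dog sat on the mat",
]

model = train(CORPUS)
ordering, hits = reorder("chased dog the cat the".split(), model)
print(" ".join(ordering), hits)  # prints: the dog chased the cat 3
```

Pruning on prefixes means most of the n! permutations are never generated in full, which is the point of the filtering step. A practical system would presumably use a smoothed language model trained on a large corpus rather than raw counts.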

Author information

Authors and Affiliations

Institute for Language and Speech Processing, Artemidos 6 and Epidavrou, GR 15125, Maroussi, Greece
Theologos Athanaselis, Stelios Bakamidis & Ioannis Dologlou

Editor information

Editors and Affiliations

School of Electrical and Computer Engineering, Image, Video and Multimedia Systems Laboratory, National Technical University of Athens, 157 80, Zographou, GR, Greece
Stefanos Kollias
Department of Electrical and Computer Engineering, National Technical University of Athens, 15780, Zographou, Greece
Andreas Stafylopatis
Department of Informatics, Nicolaus Copernicus University, Toruń, Poland
Włodzisław Duch
Adaptive Informatics Research Centre, Helsinki University of Technology, P.O. Box 5400, 02015, HUT, Finland
Erkki Oja

About this paper

Cite this paper

Athanaselis, T., Bakamidis, S., Dologlou, I. (2006). A Fast Algorithm for Words Reordering Based on Language Model. In: Kollias, S., Stafylopatis, A., Duch, W., Oja, E. (eds) Artificial Neural Networks – ICANN 2006. ICANN 2006. Lecture Notes in Computer Science, vol 4132. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11840930_98

DOI: https://doi.org/10.1007/11840930_98
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-38871-5
Online ISBN: 978-3-540-38873-9
eBook Packages: Computer Science, Computer Science (R0)
