DOI: 10.1145/3639233.3639236
research-article

Spelling Check with Sparse Distributed Representations Learning

Published: 05 March 2024

ABSTRACT

This study focuses on improving sequence learning with a brain-inspired method that follows Hawkins's approach. The method can recognize existing sequences, learn new ones, and operate in a fault-tolerant manner. It was evaluated on a spelling-check task in which incorrect words in the standard TREC-5 Confusion Track dataset were corrected automatically. The new method was compared with other techniques, including Levenshtein distance, pyspellchecker, LSTM, and Elmosclstm (Semantically Conditioned LSTM and Elmo Transformer), the state-of-the-art technique. At the word level, the brain-inspired method reached a highest accuracy of 79.35%, surpassing Elmosclstm's 74.41%. At the sentence level, it achieved 90.75% accuracy, outperforming Elmosclstm's 72.18%.
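
For orientation, the simplest baseline named in the abstract, Levenshtein distance, can be sketched as a nearest-word lookup scored by word-level accuracy. This is a minimal illustration only, not the authors' brain-inspired method or their evaluation code. The vocabulary, the test words, and the helper names `correct` and `word_accuracy` are hypothetical, and ties are assumed to be broken by vocabulary order.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion, and substitution."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def correct(word: str, vocabulary: list[str]) -> str:
    """Return the vocabulary entry closest to `word` (first one wins on ties)."""
    return min(vocabulary, key=lambda v: levenshtein(word, v))


def word_accuracy(noisy: list[str], gold: list[str], vocabulary: list[str]) -> float:
    """Fraction of noisy words whose correction equals the gold word."""
    hits = sum(correct(n, vocabulary) == g for n, g in zip(noisy, gold))
    return hits / len(gold)


# Hypothetical toy data; the paper evaluates on the TREC-5 Confusion Track.
VOCAB = ["spelling", "check", "sparse", "memory"]
NOISY = ["chek", "sparze", "memry"]
GOLD = ["check", "sparse", "memory"]

if __name__ == "__main__":
    print(correct("spelxing", VOCAB))
    print(word_accuracy(NOISY, GOLD, VOCAB))
```

A lookup of this kind ignores context entirely, which is one reason context-aware models such as the LSTM-based systems listed in the abstract are used as stronger points of comparison.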

References

  1. M. Bruno and S. Mário. 2004. Spelling correction for search engine queries. In Proceedings of Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science), volume 3230, pages 372–383.
  2. V. I. Levenshtein. 1965. Binary codes capable of correcting deletions, insertions, and reversals. In Proceedings of Soviet Physics. Doklady, volume 10, pages 707–710.
  3. F. J. Damerau. 1964. A technique for computer detection and correction of spelling errors. In Proceedings of Communications of the ACM, volume 7, pages 171–176.
  4. H. Shang and T. H. Merrett. 1996. Tries for approximate string matching. In Proceedings of IEEE Transactions on Knowledge and Data Engineering, volume 8, number 4, pages 540–547, August.
  5. T. M. Miangah. 2014. FarsiSpell: A spell-checking system for Persian using a large monolingual corpus. In Proceedings of Literary and Linguistic Computing, volume 29, number 1, pages 56–73.
  6. E. Ukkonen. 1985. Algorithms for Approximate String Matching. In Proceedings of Information and Control, volume 64, numbers 1-3, pages 100–118.
  7. Peter Norvig. How to Write a Spelling Corrector. [Online]. Available: https://norvig.com/spell-correct.html.
  8. T. Barrus. 2018. Pyspellchecker-Pure python spell checker based on work by Peter Norvig. [Online]. Available: https://pypi.org/project/pyspellchecker.
  9. Sanat Sharma, Josep Valls-Vargas, Tracy Holloway King, Francois Guerin, and Chirag Arora. 2023. Contextual Multilingual Spellchecker for User Queries. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '23). Association for Computing Machinery, New York, NY, USA, 3395–3399.
  10. A. C. Kinaci. 2018. Spelling correction using recurrent neural networks and character level N-gram. In Proceedings of the 2018 International Conference on Artificial Intelligence and Data Processing (IDAP), pages 1–4, Malatya, Turkey.
  11. S. Sooraj, K. Manjusha, M. Kumar, and K. Soman. 2018. Deep learning based spell checker for Malayalam language. In Journal of Intelligent and Fuzzy Systems, volume 34, pages 1427–1434.
  12. I. Sutskever, O. Vinyals, and Q. V. Le. 2014. Sequence to Sequence Learning with Neural Networks. In Advances in Neural Information Processing Systems, volume 4.
  13. Vishal Kakkar, Chinmay Sharma, Madhura Pande, and Surender Kumar. 2023. Search Query Spell Correction with Weak Supervision in E-commerce. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 5: Industry Track), pages 687–694, Toronto, Canada. Association for Computational Linguistics.
  14. Ubrangala, Dayananda & Sharma, Juhi & Kondapalli, Ravi & R, Kiran & Agarwala, Amit & Boué, Laurent. 2023. Domain specificity and data efficiency in typo tolerant spell checkers: the case of search in online marketplaces.
  15. Xiangci Li, Hairong Liu, and Liang Huang. 2020. Context-aware Stand-alone Neural Spelling Correction. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 407–414, Online.
  16. S. M. Jayanthi, D. Pruthi, and G. Neubig. 2020. NeuSpell: A neural spelling correction toolkit. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 158–164.
  17. Thasayu Soisoonthorn, Herwig Unger, and Maleerat Maliyaem. 2023. Spelling Check: A New Cognition-Inspired Sequence Learning Memory. In Journal of Advances in Information Technology, volume 14, number 3, pages 399–410.
  18. X. Chen, W. Wang, and W. Li. 2012. An overview of Hierarchical Temporal Memory: A new neocortex algorithm. In Proceedings of the 2012 International Conference on Modelling, Identification and Control, pages 1004–1010, Wuhan, China.
  19. H. Jeff and B. Sandra. 2004. On intelligence. Times Books.
  20. A. Subutai and H. Jeff. 2015. Properties of sparse distributed representations and their application to hierarchical temporal memory. arXiv.
  21. P. B. Kantor and E. M. Voorhees. 2000. The TREC-5 confusion track: Comparing retrieval methods for scanned text. In Proceedings of Information Retrieval, volume 2(2/3), pages 165–176.

    • Published in

      NLPIR '23: Proceedings of the 2023 7th International Conference on Natural Language Processing and Information Retrieval
      December 2023
      336 pages
      ISBN: 9798400709227
      DOI: 10.1145/3639233

      Copyright © 2023 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • research-article
      • Research
      • Refereed limited