DOI: 10.1145/1390749.1390752
research-article

How to cope with questions typed by dyslexic users

Published: 24 July 2008

ABSTRACT

In this paper we propose a way to cope with questions typed by dyslexic users. Such questions are usually a deformation of the intended query that classical spellcheckers cannot correct. We first propose a new model for statistical question answering systems, based on a probabilistic information retrieval model and a combination of results. This model accepts a query made of multiple weighted terms as input. We also introduce a phonology-based approach at the sentence level to derive possible intended terms from typed questions. This approach uses the finite-state machine framework to go from phonetic hypotheses and spellchecker proposals to hypothesized sentences by means of a language model. The final weighted queries are obtained by computing posterior probabilities. They are evaluated with new density and appearance rating measures, which adapt recall and precision to non-binary data.
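The step from hypothesized sentences to a weighted multi-term query can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes each hypothesized sentence comes with a log-score (for instance from a language model), normalizes those scores into posterior probabilities, and gives each term the summed posterior of the hypotheses that contain it. The function name `term_posteriors`, the example sentences, and the scores are all hypothetical.

```python
import math
from collections import defaultdict


def term_posteriors(hypotheses):
    """Turn scored sentence hypotheses into a weighted term query.

    hypotheses: list of (sentence, log_score) pairs. The log-scores are
    assumed to come from some upstream model; they are illustrative here.
    Returns a dict mapping each term to the total posterior probability
    of the hypotheses in which it appears.
    """
    # Subtract the maximum log-score before exponentiating, for stability.
    max_log = max(score for _, score in hypotheses)
    exp_scores = [math.exp(score - max_log) for _, score in hypotheses]
    total = sum(exp_scores)

    weights = defaultdict(float)
    for (sentence, _), exp_score in zip(hypotheses, exp_scores):
        posterior = exp_score / total
        # Count each term once per hypothesis, even if it is repeated.
        for term in set(sentence.split()):
            weights[term] += posterior
    return dict(weights)


# Hypothetical rewrites of one typed question, with made-up scores.
hypotheses = [
    ("who wrote hamlet", math.log(0.6)),
    ("who wrote omelet", math.log(0.3)),
    ("how wrote hamlet", math.log(0.1)),
]
weighted_query = term_posteriors(hypotheses)
```

In this toy case a term shared by every hypothesis receives a weight of 1, while a term that appears only in a low-scoring hypothesis receives a small weight, so a retrieval model that accepts weighted terms can still use it without letting it dominate.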


          Published in

          AND '08: Proceedings of the second workshop on Analytics for noisy unstructured text data
          July 2008
          130 pages
          ISBN: 9781605581965
          DOI: 10.1145/1390749

          Copyright © 2008 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Acceptance Rates

          Overall acceptance rate: 15 of 22 submissions, 68%
