research-article

How to cope with questions typed by dyslexic users

Authors:
Laurianne Sitbon
University of Avignon, Avignon, France

Patrice Bellot
University of Avignon, Avignon, France

AND '08: Proceedings of the second workshop on Analytics for noisy unstructured text data, July 2008, Pages 1–8, https://doi.org/10.1145/1390749.1390752

Published: 24 July 2008

ABSTRACT

In this paper we propose a way to cope with questions typed by dyslexic users, which are usually deformations of the intended query that classical spellcheckers cannot correct. We first propose a new model for statistical question answering systems, based on a probabilistic information retrieval model and a combination of results. This model accepts a query made of multiple weighted terms as input. We also introduce a phonology-based approach at the sentence level to derive possible intended terms from typed questions. This approach uses the finite-state machine framework to go from phonetic hypotheses and spellchecker proposals to hypothesized sentences by means of a language model. The final weighted queries are obtained by computing posterior probabilities. They are evaluated with new density and appearance rating measures, which adapt recall and precision to non-binary data.
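
As an illustration only, the step from hypothesized sentences to a weighted query could be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes each hypothesized sentence comes with a log score (for instance from a language model), normalizes those scores into posterior probabilities, and weights each term by the summed posterior of the hypotheses that contain it. The function name `weighted_query`, the example sentences, and the scores are all hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical n-best list: (hypothesized sentence, log score).
# The sentences and scores are invented for illustration.
EXAMPLE_HYPOTHESES = [
    ("where is paris", math.log(0.6)),
    ("were is paris", math.log(0.3)),
    ("where is parris", math.log(0.1)),
]


def weighted_query(hypotheses):
    """Turn scored sentence hypotheses into a {term: weight} query.

    Assumption: the posterior of a hypothesis is its exponentiated
    log score divided by the sum over all hypotheses, and a term's
    weight is the total posterior of the hypotheses containing it.
    """
    if not hypotheses:
        return {}

    # Subtract the maximum log score before exponentiating to avoid
    # numerical underflow on very negative scores.
    max_log = max(score for _, score in hypotheses)
    exp_scores = [math.exp(score - max_log) for _, score in hypotheses]
    total = sum(exp_scores)

    weights = defaultdict(float)
    for (sentence, _), exp_score in zip(hypotheses, exp_scores):
        posterior = exp_score / total
        # Count each term once per hypothesis, even if it is repeated.
        for term in set(sentence.lower().split()):
            weights[term] += posterior
    return dict(weights)


if __name__ == "__main__":
    for term, weight in sorted(weighted_query(EXAMPLE_HYPOTHESES).items()):
        print(f"{term}\t{weight:.2f}")
```

In this sketch, a term that appears in every hypothesis gets a weight close to 1, while a term that appears only in an unlikely hypothesis gets a small weight, so a retrieval model that accepts weighted terms can use all candidates instead of committing to a single correction.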

Published in
AND '08: Proceedings of the second workshop on Analytics for noisy unstructured text data
July 2008
130 pages
ISBN: 9781605581965
DOI: 10.1145/1390749
Conference Chairs:
Daniel Lopresti, Lehigh University
Shourya Roy, IBM India Research Lab
Klaus Schulz, University of Munich
L. Venkata Subramaniam, India Research Lab
Copyright © 2008 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
dyslexic users
question-answering
rewriting
robust probabilistic model
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 15 of 22 submissions, 68%