Abstract
SMS text messaging is one of the most popular data applications on mobile phones. Beyond personal communication, text messaging can also be used for purposes such as bill payment, banking, and inquiries. However, these messages are extremely noisy and contain misspellings, abbreviations, transliterations, etc. With this in mind, FIRE 2011 introduced a new retrieval task, SMS-based FAQ retrieval, in English, Hindi and Malayalam. Both within-language and cross-language tasks were designed for this retrieval problem. As solutions, we propose various data-driven retrieval techniques that include noise reduction in the SMS queries and the FAQ corpora. Overall, we find that our methods work well for the retrieval experiments in the different languages. For English, the use of Google Spelling Suggestions and term expansion strategies improves retrieval scores. For the Hindi and Malayalam retrieval experiments, we find that translating the queries and corpus to English increases retrieval accuracy.
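As a rough illustration of what noise reduction in an SMS query could look like, the sketch below maps each noisy token to its closest term in a vocabulary drawn from the FAQ corpus. It is not the method used in the paper: the function name `normalize_sms`, the `cutoff` value and the sample vocabulary are assumptions made for illustration, and string similarity from Python's standard `difflib` module stands in for the spelling-suggestion and term-expansion resources mentioned in the abstract.

```python
import difflib


def normalize_sms(query, vocabulary, cutoff=0.6):
    """Replace each noisy SMS token with its closest vocabulary term.

    Hypothetical helper: `vocabulary` is assumed to be the set of terms
    that occur in the FAQ corpus, and `cutoff` is an arbitrary similarity
    threshold, not a value taken from the paper.
    """
    vocab = sorted({word.lower() for word in vocabulary})
    cleaned = []
    for token in query.lower().split():
        if token in vocab:
            # Token is already a known FAQ term; keep it unchanged.
            cleaned.append(token)
            continue
        # Pick the single most similar vocabulary term, if any clears the cutoff.
        matches = difflib.get_close_matches(token, vocab, n=1, cutoff=cutoff)
        cleaned.append(matches[0] if matches else token)
    return " ".join(cleaned)


if __name__ == "__main__":
    faq_terms = ["how", "to", "check", "account", "balance"]
    print(normalize_sms("hw to chek balance", faq_terms))
```

Tokens with no sufficiently similar vocabulary term are left as they are, so the cleaned query can still be passed to a retrieval engine without losing out-of-vocabulary words.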
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bhattacharya, S., Tran, H., Srinivasan, P. (2013). Data-Driven Methods for SMS-Based FAQ Retrieval. In: Majumder, P., Mitra, M., Bhattacharyya, P., Subramaniam, L.V., Contractor, D., Rosso, P. (eds) Multilingual Information Access in South Asian Languages. Lecture Notes in Computer Science, vol 7536. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40087-2_11
DOI: https://doi.org/10.1007/978-3-642-40087-2_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40086-5
Online ISBN: 978-3-642-40087-2
eBook Packages: Computer Science, Computer Science (R0)