Skip to main content

Data-Driven Methods for SMS-Based FAQ Retrieval

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 7536))

Abstract

SMS text messaging is one of the most popular data applications on mobile phones these days. Other than personal communication, text messaging can also be used for various purposes like bill payment, banking, inquiry, etc. However these messages are extremely noisy and contain misspellings, abbreviations, transliterations, etc. Keeping this in mind, FIRE 2011 introduced a new retrieval task called SMS-based FAQ retrieval in English, Hindi and Malayalam. Within-language and cross-language tasks were designed for this retrieval problem. As solutions we propose various data-driven retrieval techniques that includes noise reduction in the SMS queries and the FAQ corpora. Overall, we find that our methods work well for the retrieval experiments in the different languages. For English, the use of Google Spelling Suggestions and term expansion strategies improve retrieval scores. For Hindi and Malayalam retrieval experiments, we find that translation of queries and corpus to English increases retrieval accuracy.

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD   54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Caraway, B.: Online labour markets: an inquiry into odesk providers. Work Organisation, Labour and Globalisation (2010)

    Google Scholar 

  2. Contractor, D., Kothari, G., Faruquie, T.A., Subramaniam, L.V., Negi, S.: Handling noisy queries in cross language faq retrieval. In: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, EMNLP 2010, pp. 87–96. Association for Computational Linguistics, Stroudsburg (2010)

    Google Scholar 

  3. Fox, E.A., Shaw, J.A.: Combination of Multiple Searches, vol. 500-215, pp. 243–252. National Institute for Standards and Technology, NIST Special Publication 500215 (1994)

    Google Scholar 

  4. Howe, J.: The rise of crowdsourcing. Wired Magazine 14(14), 1–5 (2006)

    Google Scholar 

  5. Kothari, G., Negi, S., Faruquie, T.A., Chakaravarthy, V.T., Subramaniam, L.V.: Sms based interface for faq retrieval. In: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, ACL 2009, vol. 2, pp. 852–860. Association for Computational Linguistics, Stroudsburg (2009)

    Google Scholar 

  6. Strohman, T., Metzler, D., Turtle, H., Croft, W.B.: Indri: a language-model based search engine for complex queries. In: Proceedings of the International Conference on Intelligence Analysis (2005)

    Google Scholar 

  7. Whitelaw, C., Hutchinson, B., Chung, G.Y., Ellis, G.: Using the web for language independent spellchecking and autocorrection. In: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, EMNLP 2009, vol. 2, pp. 890–899. Association for Computational Linguistics, Stroudsburg (2009)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bhattacharya, S., Tran, H., Srinivasan, P. (2013). Data-Driven Methods for SMS-Based FAQ Retrieval. In: Majumder, P., Mitra, M., Bhattacharyya, P., Subramaniam, L.V., Contractor, D., Rosso, P. (eds) Multilingual Information Access in South Asian Languages. Lecture Notes in Computer Science, vol 7536. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40087-2_11

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-40087-2_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-40086-5

  • Online ISBN: 978-3-642-40087-2

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics