DOI: 10.1145/3366424.3382731
research-article

Domain-specific Noisy Query Correction using Linguistic Network Community Detection

Published: 20 April 2020

Abstract

Noisy queries make it hard to retrieve relevant search results. Query correction matters more as hand-held devices and channels such as SMS and tweets are increasingly used to search for and access information. The task is harder for domain-specific search engines, whose query logs may be significantly smaller than those of general-purpose search engines. In this paper, we propose using community detection, a technique from social network analysis, to correct the spelling of noisy queries such as SMS messages. We focus on identifying, for a set of incoming noisy queries, the relevant questions from a set of Frequently Asked Questions (FAQs) across different domains. Experimental validation shows that the proposed CD-Speller method performs significantly better than Hunspell, a popular, industry-strength spelling correction tool.
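
The abstract does not say how the linguistic network is built or which community detection algorithm is used, so the following is only a minimal sketch of the general idea, not the CD-Speller method itself. It assumes that words are nodes, that edges link words whose `difflib` string similarity reaches a hypothetical threshold of 0.6, and that a simple label-propagation pass stands in for the community detection step. The function names (`build_graph`, `label_propagation`, `correct`) and the toy vocabulary are illustrative assumptions.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def build_graph(words, threshold=0.6):
    """Link two words when their string similarity reaches the threshold.

    The similarity measure and the threshold are assumptions for this sketch.
    """
    graph = {w: {} for w in words}
    for a, b in combinations(sorted(words), 2):
        sim = SequenceMatcher(None, a, b).ratio()
        if sim >= threshold:
            graph[a][b] = sim
            graph[b][a] = sim
    return graph


def label_propagation(graph, max_rounds=20):
    """Assign community labels by weighted label propagation.

    Each node repeatedly adopts the label with the largest total edge
    weight among its neighbours; ties go to the alphabetically smallest
    label so that the result is deterministic.
    """
    labels = {node: node for node in graph}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(graph):
            if not graph[node]:
                continue  # isolated node keeps its own label
            votes = defaultdict(float)
            for neighbour, weight in graph[node].items():
                votes[labels[neighbour]] += weight
            best = min(votes, key=lambda lab: (-votes[lab], lab))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels


def correct(noisy_tokens, vocabulary, threshold=0.6):
    """Map each noisy token to the most similar vocabulary word in its community."""
    graph = build_graph(set(noisy_tokens) | set(vocabulary), threshold)
    labels = label_propagation(graph)
    corrections = {}
    for token in noisy_tokens:
        if token in vocabulary:
            corrections[token] = token
            continue
        candidates = [v for v in vocabulary if labels[v] == labels[token]]
        if candidates:
            corrections[token] = max(
                candidates, key=lambda v: graph[token].get(v, 0.0)
            )
    return corrections


if __name__ == "__main__":
    vocabulary = ["account", "balance", "transfer"]  # toy domain vocabulary
    noisy = ["acnt", "accnt", "blnce", "balnce", "trnsfr"]  # SMS-style tokens
    print(correct(noisy, vocabulary))
```

In this sketch a token that shares no community with any vocabulary word is left uncorrected, which seems a reasonable default for a small domain-specific vocabulary.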



        Published In

        WWW '20: Companion Proceedings of the Web Conference 2020
        April 2020
        854 pages
        ISBN:9781450370240
        DOI:10.1145/3366424

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Author Tags

        1. Community detection
        2. Linguistic Networks
        3. Natural language processing
        4. Noisy query processing
        5. Social network analysis

        Qualifiers

        • Research-article
        • Research
        • Refereed limited

        Conference

        WWW '20
        WWW '20: The Web Conference 2020
        April 20 - 24, 2020
        Taipei, Taiwan

        Acceptance Rates

        Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
