DOI: 10.1145/3209280.3229111
short-paper

Query Expansion in Enterprise Search

Published: 28 August 2018

ABSTRACT

Although web search remains an active research area, interest in enterprise search has not kept up with the information requirements of the contemporary workforce. To address this gap, this research aims to develop, implement, and study the query expansion techniques most effective at improving relevancy in enterprise search. The case-study instrument was a custom Apache Solr-based search application deployed at a medium-sized manufacturing company. It was hypothesized that a composition of techniques tailored to enterprise content and information needs would increase relevancy evaluation scores. Query expansion techniques leveraging entity recognition, alphanumeric term identification, and intent classification were implemented and studied using real enterprise content and query logs. They were evaluated against a set of test queries derived from relevance survey results, using standard relevancy metrics such as normalized discounted cumulative gain (nDCG). Each of these modules produced meaningful and statistically significant improvements in relevancy.
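Since the abstract names nDCG as an evaluation metric, a minimal sketch of how such a score can be computed may help. The abstract does not say which nDCG formulation was used, so this sketch assumes the common linear-gain, log2-discount form. The graded relevance labels and the names `baseline` and `expanded` are hypothetical illustrations, not data from the study.

```python
import math
from typing import Optional, Sequence


def dcg(gains: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k results.

    Assumes linear gain and a log2(rank + 1) discount, with ranks starting at 1.
    """
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains[:k], start=1))


def ndcg(ranked_gains: Sequence[float], k: int,
         judged_gains: Optional[Sequence[float]] = None) -> float:
    """Normalized DCG: DCG of the ranking divided by DCG of the ideal ordering.

    judged_gains holds the labels of every judged document for the query;
    if omitted, the ideal ordering is built from the ranked list itself.
    """
    pool = ranked_gains if judged_gains is None else judged_gains
    ideal = dcg(sorted(pool, reverse=True), k)
    return dcg(ranked_gains, k) / ideal if ideal > 0 else 0.0


# Hypothetical graded labels (0 = irrelevant, 3 = highly relevant) for the
# top four results of one test query, before and after query expansion.
baseline = [1, 0, 3, 2]
expanded = [3, 2, 1, 0]

print(f"baseline nDCG@4: {ndcg(baseline, 4):.3f}")
print(f"expanded nDCG@4: {ndcg(expanded, 4):.3f}")
```

In this toy case the expanded ranking places the most relevant documents first and therefore scores higher. A study like the one described would average such per-query scores over the whole test set before comparing configurations.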


Published in

DocEng '18: Proceedings of the ACM Symposium on Document Engineering 2018
August 2018
311 pages
ISBN: 9781450357692
DOI: 10.1145/3209280

Copyright © 2018 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Qualifiers

• short-paper
• Research
• Refereed limited

Acceptance Rates

Overall Acceptance Rate: 178 of 537 submissions, 33%