Towards a Novel Association Measure via Web Search Results Mining

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2009)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5476)

Abstract

A web-based association measure evaluates the semantic similarity between two queries (i.e., words or entities) by leveraging the results returned by search engines. Existing web-relevance similarity measures usually treat all search results for a query as a single coarse-grained topic: they concatenate the results into one document per query and measure the similarity between the resulting term vectors. This paper proposes a novel association measure, WSRCM, based on web search results clustering and matching, which evaluates the semantic similarity between two queries at a fine-grained level. WSRCM first discovers the subtopics in the search results for each query and then measures the consistency between the two queries' sets of subtopics. Each subtopic is expected to describe a unique facet of its query, and two queries that share more subtopics are deemed more semantically related. Experimental results demonstrate the encouraging performance of the proposed measure.
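
The contrast between the coarse-grained baseline and a subtopic-level comparison can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the clustering algorithm, the cluster similarity, or the normalization, so everything below is an assumption. The function names (`term_vector`, `coarse_similarity`, `subtopic_similarity`) are hypothetical, the clusters are assumed to be supplied already grouped, subtopics are compared by cosine similarity of bag-of-words vectors, and the "consistency" between subtopic sets is taken to be the best one-to-one matching, found here by brute force and divided by the larger number of clusters.

```python
from collections import Counter
from itertools import permutations
from math import sqrt


def term_vector(texts):
    """Bag-of-words term-frequency vector for a list of snippets."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts


def cosine(u, v):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def coarse_similarity(snippets_a, snippets_b):
    """Baseline: treat all results of a query as one concatenated document."""
    return cosine(term_vector(snippets_a), term_vector(snippets_b))


def subtopic_similarity(clusters_a, clusters_b):
    """Fine-grained score: best one-to-one matching between subtopic clusters.

    Assumptions (not from the paper): cosine similarity between clusters,
    brute-force search over matchings (only practical for a few clusters),
    and normalization by the larger number of clusters.
    """
    if not clusters_a or not clusters_b:
        return 0.0
    small, large = sorted((clusters_a, clusters_b), key=len)
    vec_small = [term_vector(c) for c in small]
    vec_large = [term_vector(c) for c in large]
    best = 0.0
    # Each permutation assigns every cluster in `small` a distinct cluster in `large`.
    for perm in permutations(range(len(vec_large)), len(vec_small)):
        total = sum(cosine(vec_small[i], vec_large[j]) for i, j in enumerate(perm))
        best = max(best, total)
    return best / len(large)


# Hypothetical, hand-made clusters standing in for clustered search results.
clusters_q1 = [["luxury car dealer", "sports car price"], ["big cat jungle habitat"]]
clusters_q2 = [["big cat jungle predator"], ["army tank history"]]

flat_q1 = [s for cluster in clusters_q1 for s in cluster]
flat_q2 = [s for cluster in clusters_q2 for s in cluster]

print("coarse:", coarse_similarity(flat_q1, flat_q2))
print("subtopic:", subtopic_similarity(clusters_q1, clusters_q2))
```

For more than a handful of subtopics, the brute-force search would need to be replaced by a polynomial-time assignment solver; the matching objective stays the same.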

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wan, X., Xiao, J. (2009). Towards a Novel Association Measure via Web Search Results Mining. In: Theeramunkong, T., Kijsirikul, B., Cercone, N., Ho, TB. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2009. Lecture Notes in Computer Science, vol 5476. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01307-2_83

  • DOI: https://doi.org/10.1007/978-3-642-01307-2_83

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-01306-5

  • Online ISBN: 978-3-642-01307-2

  • eBook Packages: Computer Science, Computer Science (R0)
