DOI: 10.1145/1498759.1498815
research-article

Measuring the similarity between implicit semantic relations using web search engines

Published: 09 February 2009

Abstract

Measuring the similarity between implicit semantic relations is an important task in information retrieval and natural language processing. For example, suppose you know an entity-pair (e.g., Google, YouTube) between which a particular relation holds (e.g., acquisition), and you want to retrieve other entity-pairs for which the same relation holds (e.g., Yahoo, Inktomi). Existing keyword-based search engines cannot be applied directly in this case, because the goal of keyword-based search is to retrieve documents that are relevant to the words used in the query, not necessarily to the relations implied by a pair of words. Accurate measurement of relational similarity is also an important step in numerous natural language processing tasks, such as identifying word analogies and classifying noun-modifier pairs. We propose a method that uses Web search engines to efficiently compute the relational similarity between two pairs of words. Our method consists of three components: (1) representing the various semantic relations that exist between a pair of words using automatically extracted lexical patterns, (2) clustering the extracted lexical patterns to identify the different semantic relations implied by them, and (3) measuring the similarity between different semantic relations using an inter-cluster correlation matrix. We propose a pattern extraction algorithm that extracts a large number of lexical patterns expressing numerous semantic relations. We then present an efficient algorithm for clustering the extracted lexical patterns. Finally, we measure the relational similarity between word-pairs using inter-cluster correlation. We evaluate the proposed method in a relation classification task. Experimental results on a dataset covering multiple relation types show a statistically significant improvement over the current state-of-the-art relational similarity measures.
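As a rough illustration of the first component only, the sketch below extracts lexical patterns from text snippets and compares two word-pairs by the cosine of their pattern-count vectors. It is not the authors' algorithm: the clustering step and the inter-cluster correlation matrix are omitted, and plain cosine similarity is swapped in for them. The function names, the `max_len` parameter, and the toy snippets are all assumptions made for this example; in the setting the abstract describes, the snippets would come from a Web search engine.

```python
import math
import re
from collections import Counter


def extract_patterns(snippet, first, second, max_len=7):
    """Count token n-grams in one snippet that contain both placeholders.

    The two words of the pair are replaced by the placeholders X and Y,
    so the same pattern can match different word-pairs.
    """
    text = snippet.lower()
    text = text.replace(first.lower(), " X ").replace(second.lower(), " Y ")
    tokens = re.findall(r"[A-Za-z0-9]+", text)
    patterns = Counter()
    for start in range(len(tokens)):
        for n in range(2, max_len + 1):
            gram = tokens[start:start + n]
            if len(gram) == n and "X" in gram and "Y" in gram:
                patterns[" ".join(gram)] += 1
    return patterns


def pattern_vector(snippets, first, second):
    """Sum the pattern counts over all snippets retrieved for one word-pair."""
    total = Counter()
    for snippet in snippets:
        total.update(extract_patterns(snippet, first, second))
    return total


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[pattern] for pattern, count in u.items() if pattern in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical snippets written for this example, not real search results.
acquisition_a = pattern_vector(
    ["Google acquired YouTube last year.",
     "Google has completed the purchase of YouTube."],
    "Google", "YouTube")
acquisition_b = pattern_vector(
    ["Yahoo acquired Inktomi.",
     "Yahoo has completed the purchase of Inktomi."],
    "Yahoo", "Inktomi")
hypernym = pattern_vector(["An ostrich is a large bird."], "ostrich", "bird")

print(cosine(acquisition_a, acquisition_b))  # the two acquisition pairs share patterns
print(cosine(acquisition_a, hypernym))       # no shared patterns
```

Comparing raw pattern counts treats "X acquired Y" and "X has completed the purchase of Y" as unrelated features, even though both express acquisition. Grouping such patterns is what the clustering component in the abstract is for.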

    Published In

    WSDM '09: Proceedings of the Second ACM International Conference on Web Search and Data Mining
    February 2009
    314 pages
    ISBN:9781605583907
    DOI:10.1145/1498759

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. relational similarity measures
    2. web mining

    Qualifiers

    • Research-article

    Conference

    WSDM'09

    Acceptance Rates

    Overall Acceptance Rate 498 of 2,863 submissions, 17%

    Bibliometrics & Citations

    Article Metrics

    • Downloads (last 12 months): 3
    • Downloads (last 6 weeks): 1
    Reflects downloads up to 08 Feb 2025

    Cited By

    • (2022) An Application of Knowledge Graph for Enterprise Risk Prediction. Proceedings of the 12th International Conference on Computer Engineering and Networks, pages 1029-1038. DOI: 10.1007/978-981-19-6901-0_106. Online publication date: 20-Oct-2022
    • (2018) Semantic Similarity between Web Documents Using Ontology. Journal of The Institution of Engineers (India): Series B, 99:3, pages 293-300. DOI: 10.1007/s40031-018-0321-0. Online publication date: 13-Mar-2018
    • (2016) Benchmarking Semantic Capabilities of Analogy Querying Algorithms. Database Systems for Advanced Applications, pages 463-478. DOI: 10.1007/978-3-319-32025-0_29. Online publication date: 25-Mar-2016
    • (2015) A Framework for Collocation Error Correction in Web Pages and Text Documents. ACM SIGKDD Explorations Newsletter, 17:1, pages 14-23. DOI: 10.1145/2830544.2830548. Online publication date: 29-Sep-2015
    • (2013) A Web Based Method for Measuring Semantic Relatedness between Words. Applied Mechanics and Materials, 347-350, pages 783-787. DOI: 10.4028/www.scientific.net/AMM.347-350.783. Online publication date: Aug-2013
    • (2012) Measuring the Degree of Synonymy between Words Using Relational Similarity between Word Pairs as a Proxy. IEICE Transactions on Information and Systems, E95.D:8, pages 2116-2123. DOI: 10.1587/transinf.E95.D.2116. Online publication date: 2012
    • (2012) Cross-Language Latent Relational Search between Japanese and English Languages Using a Web Corpus. ACM Transactions on Asian Language Information Processing, 11:3, pages 1-33. DOI: 10.1145/2334801.2334805. Online publication date: 1-Sep-2012
    • (2012) Smart combination of web measures for solving semantic similarity problems. Online Information Review, 36:5, pages 724-738. DOI: 10.1108/14684521211276000. Online publication date: 21-Sep-2012
    • (2012) Chinese Latent Relational Search Based on Relational Similarity. Data and Knowledge Engineering, pages 115-127. DOI: 10.1007/978-3-642-34679-8_12. Online publication date: 2012
    • (2011) A Supervised Classification Approach for Measuring Relational Similarity between Word Pairs. IEICE Transactions on Information and Systems, E94-D:11, pages 2227-2233. DOI: 10.1587/transinf.E94.D.2227. Online publication date: 2011
