DOI: 10.1145/3022198.3026357
poster

Possible Confounds in Word-based Semantic Similarity Test Data

Published: 25 February 2017

ABSTRACT

Semantic similarity and semantic relatedness are features of natural language that contribute to the challenge machines face when analyzing text. Although semantic relatedness remains a complex challenge, only a few ground-truth data sets exist. We argue that the available corpora used to evaluate the performance of natural language tools do not capture all elements of the phenomenon. We present a set of simple interventions that illustrate that 1) framing effects influence similarity perception, 2) the distribution of similarity judgments across multiple users is important, and 3) semantic relatedness is asymmetric.
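To make points 2) and 3) concrete, the following minimal Python sketch shows one way such effects could be checked in a rating data set. It is an illustration only, not the authors' procedure: the word pairs, the 0 to 10 scale, the rating values, and the helper names `summarize` and `asymmetry` are all hypothetical and invented for this example.

```python
from statistics import mean, pstdev

# Hypothetical ratings on a 0-10 scale, invented purely for illustration.
# Each key is (first word shown, second word shown); each value holds the
# scores given by four imaginary raters.
ratings = {
    ("tiger", "cat"): [8, 9, 7, 8],
    ("cat", "tiger"): [5, 6, 4, 5],
    ("bank", "money"): [9, 2, 9, 2],
}


def summarize(scores):
    """Return (mean, population standard deviation) of one pair's scores."""
    return mean(scores), pstdev(scores)


def asymmetry(data, a, b):
    """Mean rating of (a, b) minus mean rating of (b, a)."""
    return mean(data[(a, b)]) - mean(data[(b, a)])


for pair, scores in ratings.items():
    m, sd = summarize(scores)
    # A large spread means the mean alone hides rater disagreement.
    print(pair, f"mean={m:.2f}", f"sd={sd:.2f}")

# A non-zero value means the presentation order changed the judgment.
print("asymmetry tiger/cat:", asymmetry(ratings, "tiger", "cat"))
```

In this toy data the pair ("bank", "money") has a middling mean but a large spread, which a single averaged score would conceal, and the two orderings of "tiger" and "cat" receive different mean ratings, which a symmetric gold standard could not represent.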


      • Published in

        CSCW '17 Companion: Companion of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing
        February 2017
        472 pages
        ISBN:9781450346887
        DOI:10.1145/3022198

        Copyright © 2017 Owner/Author

        Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Acceptance Rates

        CSCW '17 Companion paper acceptance rate: 183 of 530 submissions, 35%
        Overall acceptance rate: 2,235 of 8,521 submissions, 26%
