DOI: 10.1145/1571941.1572124
poster

Selecting hierarchical clustering cut points for web person-name disambiguation

Published: 19 July 2009

Abstract

Hierarchical clustering is often used to cluster person-names referring to the same entities. Since the correct number of clusters for a given person-name is not known a priori, some way of deciding where to cut the resulting dendrogram to balance risks of over- or under-clustering is needed. This paper reports on experiments in which outcome-specific and result-set measures are used to learn a global similarity threshold. Results on the Web People Search (WePS)-2 task indicate that approximately 85% of the optimal F1 measure can be achieved on held-out data.
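The idea of cutting a dendrogram at one learned global threshold can be sketched in a few lines of Python. This is a minimal illustration, not the authors' system: the Jaccard similarity over term sets, the average-link merge rule, the pairwise F1 score, and the grid of candidate thresholds are all assumptions made for the example, since the abstract does not state the features, linkage, or exact measures used. The function names (`cluster`, `pairwise_f1`, `learn_threshold`, `fraction_of_optimal`) are hypothetical.

```python
from itertools import combinations

def jaccard(a, b):
    """Similarity of two pages given as sets of terms (assumed representation)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster(docs, threshold):
    """Average-link agglomerative clustering stopped at a similarity threshold.

    Stopping once the most similar pair of clusters falls below `threshold`
    corresponds to cutting the dendrogram at that similarity level.
    """
    clusters = [[i] for i in range(len(docs))]

    def link(c1, c2):
        total = sum(jaccard(docs[i], docs[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        a, b = max(combinations(range(len(clusters)), 2),
                   key=lambda p: link(clusters[p[0]], clusters[p[1]]))
        if link(clusters[a], clusters[b]) < threshold:
            break
        merged = clusters.pop(b)      # a < b, so index a is still valid
        clusters[a].extend(merged)
    return clusters

def pairwise_f1(predicted, gold):
    """F1 over same-cluster page pairs (a stand-in for the task's own measure)."""
    def pairs(cs):
        return {p for c in cs for p in combinations(sorted(c), 2)}
    p, g = pairs(predicted), pairs(gold)
    if not p and not g:
        return 1.0
    tp = len(p & g)
    if tp == 0:
        return 0.0
    precision, recall = tp / len(p), tp / len(g)
    return 2 * precision * recall / (precision + recall)

def mean_f1(names, threshold):
    """Average F1 over a list of (docs, gold_clusters) pairs, one per person name."""
    return sum(pairwise_f1(cluster(d, threshold), g) for d, g in names) / len(names)

def learn_threshold(training, candidates):
    """Pick the single global threshold that maximizes mean F1 on training names."""
    return max(candidates, key=lambda t: mean_f1(training, t))

def fraction_of_optimal(held_out, learned, candidates):
    """F1 at the learned threshold relative to the best candidate on held-out names."""
    optimal = max(mean_f1(held_out, t) for t in candidates)
    return mean_f1(held_out, learned) / optimal if optimal else 0.0

# Toy data: four pages for one ambiguous name, two underlying people.
docs = [{"chef", "paris", "menu"}, {"chef", "paris", "wine"},
        {"physics", "cern", "quark"}, {"physics", "cern", "boson"}]
gold = [[0, 1], [2, 3]]
best = learn_threshold([(docs, gold)], [0.0, 0.3, 0.6])
```

On this toy input a threshold of 0.0 merges everything (under-splitting), 0.6 merges nothing (over-splitting), and 0.3 recovers the two people, which is the trade-off the abstract describes. In a real experiment `fraction_of_optimal` would be computed on names not used to learn the threshold.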

Published In

SIGIR '09: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
July 2009
896 pages
ISBN:9781605584836
DOI:10.1145/1571941

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. clustering
  2. person-name disambiguation

Qualifiers

  • Poster

Conference

SIGIR '09
Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Cited By

  • (2019) A Graph-based Approach to Person Name Disambiguation in Web. ACM Transactions on Management Information Systems 10:2 (1-25). DOI: 10.1145/3314949. Online publication date: 17-May-2019
  • (2018) Partial Matching of Facial Expression Sequence Using Over-Complete Transition Dictionary for Emotion Recognition. IEEE Transactions on Affective Computing 7:4 (389-408). DOI: 10.1109/TAFFC.2015.2496320. Online publication date: 12-Dec-2018
  • (2015) Dynamic author name disambiguation for growing digital libraries. Information Retrieval Journal 18:5 (379-412). DOI: 10.1007/s10791-015-9261-3. Online publication date: 21-Jul-2015
  • (2015) Web Person Disambiguation Using Hierarchical Co-reference Model. Computational Linguistics and Intelligent Text Processing (279-291). DOI: 10.1007/978-3-319-18111-0_22. Online publication date: 2015
  • (2014) Adaptive Centroid-Based Clustering Algorithm for Text Document Data. 2014 Sixth International Symposium on Parallel Architectures, Algorithms and Programming (63-68). DOI: 10.1109/PAAP.2014.13. Online publication date: Jul-2014
  • (2014) Local age group modeling in unconstrained face images for facial age classification. 2014 IEEE International Conference on Image Processing (ICIP) (1395-1399). DOI: 10.1109/ICIP.2014.7025279. Online publication date: Oct-2014
  • (2013) A Three-Stage Clustering Framework Based on Multiple Feature Combination for Chinese Person Name Disambiguation. Proceedings of the 2013 International Conference on Information Science and Cloud Computing Companion (103-109). DOI: 10.1109/ISCC-C.2013.33. Online publication date: 7-Dec-2013
  • (2011) Document clustering with universum. Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval (873-882). DOI: 10.1145/2009916.2010033. Online publication date: 24-Jul-2011
  • (2010) Exploring personal name disambiguation from name understanding. 2010 4th International Universal Communication Symposium (345-349). DOI: 10.1109/IUCS.2010.5666185. Online publication date: Oct-2010
