poster

A unified framework for name disambiguation

Authors:
Jie Tang, Tsinghua University, Beijing, China
Jing Zhang, Tsinghua University, Beijing, China
Duo Zhang, University of Illinois at Urbana Champaign, Urbana, USA
Juanzi Li, Tsinghua University, Beijing, China

WWW '08: Proceedings of the 17th international conference on World Wide Web, April 2008, Pages 1205–1206. https://doi.org/10.1145/1367497.1367728

Published: 21 April 2008

ABSTRACT

Name ambiguity has long been a challenging problem. In this paper, we aim to investigate the problem thoroughly. Specifically, we formalize name disambiguation in a unified framework that incorporates both attributes and relationships into a probabilistic model. We explore a dynamic approach for automatically estimating the number of persons K, and we employ an adaptive distance measure to estimate the distance between objects. Experimental results show that the proposed framework significantly outperforms the baseline method.
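
The abstract does not spell out the model, so the following is only a rough illustration of the general idea, not the authors' method. It is a minimal Python sketch that mixes an attribute distance (title words) with a relationship distance (coauthors) and lets a stopping threshold decide the number of persons K, using plain average-link agglomerative clustering in place of the paper's probabilistic model. The field names `title_words` and `coauthors`, the weights, the threshold, and the sample records are all hypothetical.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b|; two empty sets count as maximally distant."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return 1.0 - len(a & b) / len(a | b)


def paper_distance(p, q, w_attr=0.5, w_rel=0.5):
    """Weighted mix of attribute and relationship distance (weights are assumed)."""
    d_attr = jaccard_distance(p["title_words"], q["title_words"])
    d_rel = jaccard_distance(p["coauthors"], q["coauthors"])
    return w_attr * d_attr + w_rel * d_rel


def cluster_papers(papers, threshold=0.8):
    """Average-link agglomerative clustering; K is whatever remains at the stop."""
    clusters = [[i] for i in range(len(papers))]

    def avg_link(c1, c2):
        total = sum(paper_distance(papers[i], papers[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the closest pair of clusters (x < y by construction).
        (x, y), d = min(
            (((x, y), avg_link(clusters[x], clusters[y]))
             for x, y in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if d > threshold:
            break  # remaining clusters are too far apart to merge
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Hypothetical records that all carry the same ambiguous author name.
papers = [
    {"title_words": {"name", "disambiguation", "framework"},
     "coauthors": {"coauthor_a", "coauthor_b"}},
    {"title_words": {"probabilistic", "name", "disambiguation"},
     "coauthors": {"coauthor_a"}},
    {"title_words": {"protein", "folding", "simulation"},
     "coauthors": {"coauthor_c"}},
    {"title_words": {"protein", "structure", "simulation"},
     "coauthors": {"coauthor_c", "coauthor_d"}},
]

clusters = cluster_papers(papers)
print("estimated K:", len(clusters))
print("clusters:", clusters)
```

In this sketch the fixed weights stand in for the adaptive distance measure mentioned in the abstract; an adaptive variant would learn or update `w_attr` and `w_rel` from the data rather than hard-coding them.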


Index Terms

A unified framework for name disambiguation
1. Computing methodologies
  1. Machine learning
2. Information systems
  1. Information retrieval
  2. Information systems applications

Published in
WWW '08: Proceedings of the 17th international conference on World Wide Web
April 2008
1326 pages
ISBN:9781605580852
DOI:10.1145/1367497
General Chairs:
Jinpeng Huai, Beihang University, China
Robin Chen, AT&T Labs, USA
Hsiao-Wuen Hon, Microsoft Research Asia, China
Yunhao Liu, HK University of Science and Technology, Hong Kong
Program Chairs:
Wei-Ying Ma, Microsoft Research Asia, China
Andrew Tomkins, Yahoo! Research, USA
Xiaodong Zhang, The Ohio State University, USA
Copyright © 2008 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
digital library
name disambiguation
probabilistic model
Qualifiers
- poster
Acceptance Rates
Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%