A Grid and Cloud Based System for Data Grouping Computation and Online Service


Wing-Ning Li, Donald Hayes, Jonathan Baran, Cameron Porter, Tom Schweiger
Copyright: © 2011 | Volume: 3 | Issue: 4 | Pages: 14
ISSN: 1938-0259 | EISSN: 1938-0267 | EISBN13: 9781613507247 | DOI: 10.4018/jghpc.2011100104

MLA

Li, Wing-Ning, et al. "A Grid and Cloud Based System for Data Grouping Computation and Online Service." IJGHPC vol.3, no.4 2011: pp.39-52. http://doi.org/10.4018/jghpc.2011100104

APA

Li, W., Hayes, D., Baran, J., Porter, C., & Schweiger, T. (2011). A Grid and Cloud Based System for Data Grouping Computation and Online Service. International Journal of Grid and High Performance Computing (IJGHPC), 3(4), 39-52. http://doi.org/10.4018/jghpc.2011100104

Chicago

Li, Wing-Ning, et al. "A Grid and Cloud Based System for Data Grouping Computation and Online Service," International Journal of Grid and High Performance Computing (IJGHPC) 3, no.4: 39-52. http://doi.org/10.4018/jghpc.2011100104


Abstract

Record linkage deals with finding records that identify the same real-world entity, such as an individual or a business, in a given file or set of files. The record linkage problem is also referred to as the entity resolution or record recognition problem. To locate the records that identify the same real-world entity, pairwise record analyses must, in principle, be performed among all records. The analytical operations between two records range from comparing corresponding fields to enhancing records through large knowledge bases and querying large databases. Hence, these operations are complex and time-consuming. To reduce the number of pairwise record comparisons, blocking techniques are introduced to partition the records into blocks, after which the records in each block are analyzed against one another. One effective blocking method is the closure approach, in which a “related” equivalence relation is used to partition the records into equivalence classes. This paper introduces the closure problem and describes the design and implementation of a parallel and distributed closure prototype system running in an enterprise grid.
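
The closure idea in the abstract can be illustrated with a small, single-machine sketch. It is not the authors' parallel and distributed system, and the abstract does not define the "related" relation; here it is assumed, purely for illustration, that two records are related when they share a non-empty value in any of the chosen key fields. The function names, field names, and sample records are all hypothetical. A union-find structure takes the transitive closure of that rule, so each resulting block is one equivalence class, and pairwise analysis is then limited to records within the same block.

```python
from collections import defaultdict
from itertools import combinations


def closure_blocks(records, keys):
    """Partition record ids into equivalence classes (blocks).

    Assumption (not from the paper): two records are "related" if they
    share a non-empty value in any field listed in `keys`. The blocks
    are the transitive closure of that rule.
    """
    parent = {rid: rid for rid in records}

    def find(x):
        # Walk up to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Remember the first record seen for each (field, value) token and
    # merge every later record carrying the same token into its class.
    first_seen = {}
    for rid, rec in records.items():
        for key in keys:
            value = rec.get(key)
            if not value:
                continue
            token = (key, value)
            if token in first_seen:
                union(first_seen[token], rid)
            else:
                first_seen[token] = rid

    blocks = defaultdict(list)
    for rid in records:
        blocks[find(rid)].append(rid)
    return sorted(sorted(block) for block in blocks.values())


def candidate_pairs(blocks):
    """Pairs that still need detailed analysis: only within a block."""
    return [pair for block in blocks for pair in combinations(block, 2)]


# Hypothetical sample data.
records = {
    1: {"phone": "555-0101", "email": "a@example.com"},
    2: {"phone": "555-0101", "email": "b@example.com"},
    3: {"phone": "555-0199", "email": "b@example.com"},
    4: {"phone": "555-0142", "email": "c@example.com"},
}
blocks = closure_blocks(records, keys=("phone", "email"))
pairs = candidate_pairs(blocks)
print(blocks)  # [[1, 2, 3], [4]]
print(pairs)   # [(1, 2), (1, 3), (2, 3)]
```

In this toy input, records 1 and 3 share no field directly but land in the same block through record 2, which is what makes the relation an equivalence relation. Comparing all four records against each other would take six pairwise analyses; after blocking, three remain.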
