Record Matching

  • Reference work entry
Encyclopedia of Database Systems

Recommended Reading

  1. Arasu A., Chaudhuri S., and Kaushik R. Transformation-based framework for record matching. In Proc. 24th Int. Conf. on Data Engineering, 2008, pp. 40–49.

  2. Arasu A., Ganti V., and Kaushik R. Efficient exact set-similarity joins. In Proc. 32nd Int. Conf. on Very Large Data Bases, 2006, pp. 918–929.

  3. Bilenko M. and Mooney R.J. Adaptive duplicate detection using learnable string similarity measures. In Proc. 10th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, 2003, pp. 39–48.

  4. Chaudhuri S., Chen B.C., Ganti V., and Kaushik R. Example-driven design of efficient record matching queries. In Proc. 33rd Int. Conf. on Very Large Data Bases, 2007, pp. 327–338.

  5. Chaudhuri S., Ganjam K., Ganti V., and Motwani R. Robust and efficient fuzzy match for online data cleaning. In Proc. ACM SIGMOD Int. Conf. on Management of Data, 2003, pp. 313–324.

  6. Chaudhuri S., Ganti V., and Kaushik R. A primitive operator for similarity joins in data cleaning. In Proc. 22nd Int. Conf. on Data Engineering, 2006.

  7. Cochinwala M., Kurien V., Lalk G., and Shasha D. Efficient data reconciliation. Inf. Sci., 137(1–4):1–15, 2001.

  8. Cohen W.W. Data integration using similarity joins and a word-based information representation language. ACM Trans. Inform. Syst., 18(3):288–321, 2000.

  9. Elmagarmid A.K., Ipeirotis P.G., and Verykios V.S. Duplicate record detection: a survey. IEEE Trans. Knowl. Data Eng., 19(1):1–16, 2007.

  10. Fellegi I.P. and Sunter A.B. A theory for record linkage. J. Am. Stat. Assoc., 64(328):1183–1210, 1969.

  11. Hernandez M. and Stolfo S. The merge/purge problem for large databases. In Proc. ACM SIGMOD Int. Conf. on Management of Data, 1995, pp. 127–138.

  12. Jaro M.A. Unimatch: A Record Linkage System: User’s Manual. Tech. rep., US Bureau of the Census, Washington DC, 1976.

  13. Jaro M.A. Advances in record-linkage methodology as applied to matching the 1985 census of Tampa, Florida. J. Am. Stat. Assoc., 84(406):414–420, 1989.

  14. Koudas N., Sarawagi S., and Srivastava D. Record linkage: similarity measures and algorithms. In Proc. ACM SIGMOD Int. Conf. on Management of Data, 2006, pp. 802–803.

  15. McCallum A., Nigam K., and Ungar L.H. Efficient clustering of high-dimensional data sets with application to reference matching. In Proc. 6th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, 2000, pp. 169–178.

  16. Newcombe H.B., Kennedy J.M., Axford S.J., and James A.P. Automatic linkage of vital records. Science, 130:954–959, 1959.

  17. Sarawagi S. and Bhamidipaty A. Interactive deduplication using active learning. In Proc. 8th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, 2002, pp. 269–278.

  18. Sarawagi S. and Kirpal A. Efficient set joins on similarity predicates. In Proc. ACM SIGMOD Int. Conf. on Management of Data, 2004, pp. 743–754.

  19. Torra V. and Domingo-Ferrer J. Record linkage methods for multidatabase data mining. In Information Fusion in Data Mining, V. Torra (ed.), Springer, 2003, pp. 101–132.

  20. Winkler W. Improved Decision Rules in the Fellegi-Sunter Model of Record Linkage. Tech. rep., Statistical Research Division, US Bureau of the Census, Washington DC, 1993.

  21. Winkler W. The state of record linkage and current research problems. Tech. rep., Statistical Research Division, US Bureau of the Census, Washington DC, 1999.

Copyright information

© 2009 Springer Science+Business Media, LLC

About this entry

Cite this entry

Arasu, A., Domingo-Ferrer, J. (2009). Record Matching. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-39940-9_594
