Abstract
Multiple standards and encodings for country names, together with multiple renderings of the names themselves, cause interoperability problems that affect both human and automated processing. This paper describes an automated method for aligning pairs of country code sets by examining the string similarity between the names of the countries in each set.
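The general idea of aligning two code sets by name similarity can be illustrated with a minimal sketch. This is not the paper's method: the abstract does not say which similarity measure or matching strategy is used, so the sketch assumes the ratio from Python's standard-library `difflib.SequenceMatcher`, a greedy best-match per code, and a cutoff of 0.6. The function names, the threshold, and the sample code sets are hypothetical and serve only as illustration.

```python
from difflib import SequenceMatcher
from typing import Dict, Optional


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not affect the score."""
    return " ".join(name.lower().split())


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two normalized names (assumed measure)."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def align_code_sets(
    left: Dict[str, str],
    right: Dict[str, str],
    threshold: float = 0.6,  # assumed cutoff, not taken from the paper
) -> Dict[str, Optional[str]]:
    """Map each code in `left` to the code in `right` with the most similar name.

    A code whose best score falls below `threshold` maps to None.
    """
    alignment: Dict[str, Optional[str]] = {}
    for left_code, left_name in left.items():
        best_code, best_score = None, 0.0
        for right_code, right_name in right.items():
            score = similarity(left_name, right_name)
            if score > best_score:
                best_code, best_score = right_code, score
        alignment[left_code] = best_code if best_score >= threshold else None
    return alignment


# Illustrative (hypothetical) code sets: code -> country name.
set_a = {"DE": "Germany", "FR": "France", "US": "United States of America", "XX": "Atlantis"}
set_b = {"GM": "Germany", "FR": "France", "US": "United States"}

for code_a, code_b in align_code_sets(set_a, set_b).items():
    print(code_a, "->", code_b)
```

A greedy best match keeps the sketch short, but it allows two codes in one set to map to the same code in the other; a one-to-one assignment step would be needed where that matters.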
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Richardson, G. (2010). Automated Country Name Disambiguation for Code Set Alignment. In: Lalmas, M., Jose, J., Rauber, A., Sebastiani, F., Frommholz, I. (eds) Research and Advanced Technology for Digital Libraries. ECDL 2010. Lecture Notes in Computer Science, vol 6273. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15464-5_66
DOI: https://doi.org/10.1007/978-3-642-15464-5_66
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15463-8
Online ISBN: 978-3-642-15464-5
eBook Packages: Computer Science, Computer Science (R0)