Abstract
Entity matching identifies records that represent the same real-world entity. As e-commerce develops rapidly, online products are growing explosively in both number and variety. Applying entity matching to e-commerce data to find records that represent the same product makes it convenient for customers to compare prices. This paper proposes an entity matching system for e-commerce data, called EPEMS. Compared with existing systems, EPEMS improves on an existing sorted neighborhood blocking method, which reduces the number of record comparisons. In addition, the similarity of product pictures is used to improve matching results.
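For readers unfamiliar with sorted neighborhood blocking, the following is a minimal sketch of the classic fixed-window variant, not the improved method used in EPEMS, whose details the abstract does not give. The function name `sorted_neighborhood_pairs`, the `title_key` sorting key, and the sample product titles are all hypothetical and chosen only for illustration.

```python
def sorted_neighborhood_pairs(records, key, window=3):
    """Return candidate pairs (as index tuples) from a fixed-size sliding window.

    Records are sorted by a blocking key; each record is paired only with the
    next (window - 1) records in sorted order, instead of with every other record.
    """
    order = sorted(range(len(records)), key=lambda i: key(records[i]))
    pairs = set()
    for pos, i in enumerate(order):
        for j in order[pos + 1 : pos + window]:
            pairs.add((min(i, j), max(i, j)))
    return pairs


def title_key(record):
    # Hypothetical blocking key: lowercased title with whitespace removed.
    return "".join(record["title"].lower().split())


# Hypothetical sample data, for illustration only.
products = [
    {"title": "Apple iPhone 6 16GB"},
    {"title": "apple iphone 6 16 gb"},
    {"title": "Samsung Galaxy S5"},
    {"title": "samsung galaxy s5 black"},
    {"title": "Nokia Lumia 930"},
]

candidates = sorted_neighborhood_pairs(products, title_key, window=2)
all_pairs = len(products) * (len(products) - 1) // 2
print(f"{len(candidates)} candidate pairs instead of {all_pairs}")
```

Only the candidate pairs would then be passed to a (more expensive) similarity function. A larger window tends to catch more true matches at the cost of more comparisons, which is the trade-off that adaptive-window variants try to balance.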
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Gao, L. et al. (2015). EPEMS: An Entity Matching System for E-Commerce Products. In: Cheng, R., Cui, B., Zhang, Z., Cai, R., Xu, J. (eds) Web Technologies and Applications. APWeb 2015. Lecture Notes in Computer Science, vol 9313. Springer, Cham. https://doi.org/10.1007/978-3-319-25255-1_74
DOI: https://doi.org/10.1007/978-3-319-25255-1_74
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25254-4
Online ISBN: 978-3-319-25255-1
eBook Packages: Computer Science (R0)