Abstract
Entity Resolution (ER) is the task of finding references that refer to the same entity across different data sources. Cleaning a data warehouse and applying ER to it is computationally demanding, particularly for large data sets that change dynamically. A query-driven approach, which analyses only a small subset of the entire data set and integrates the results in real time, is therefore highly beneficial. Here, we present an interactive tool, called HiDER, which allows for query-driven ER in large collections of uncertain, dynamic historical data. The input data includes civil registers, such as birth, marriage and death certificates, in the form of structured data, and notarial acts, such as estate tax and property transfers, in the form of free text. The outputs are family networks and event timelines, visualized in an integrated way. HiDER is being used and tested at the BHIC center (Brabant Historical Information Center, https://www.bhic.nl); despite the uncertainties of the BHIC input data, the extracted entities have high certainty and are enriched by extra information.
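The query-driven idea can be illustrated with a minimal sketch. This is not HiDER's actual algorithm, which the abstract does not specify: the record fields, the function names, the 0.8 threshold and the use of plain string similarity as the matching criterion are all assumptions made for illustration. The point shown is only that ER runs on the small subset of records relevant to a query rather than on the whole collection.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (a stand-in matcher)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def select_candidates(records, query_name, threshold=0.8):
    """Keep only the records whose name is close to the query name."""
    return [r for r in records if similarity(r["name"], query_name) >= threshold]


def resolve(candidates, threshold=0.8):
    """Greedily group candidates whose names are similar to a cluster member."""
    clusters = []
    for rec in candidates:
        for cluster in clusters:
            if any(similarity(rec["name"], other["name"]) >= threshold
                   for other in cluster):
                cluster.append(rec)
                break
        else:
            # No existing cluster matched, so start a new one.
            clusters.append([rec])
    return clusters


def query_driven_er(records, query_name, threshold=0.8):
    """Run ER only on the subset of records relevant to the query."""
    return resolve(select_candidates(records, query_name, threshold), threshold)


# Hypothetical records, loosely modelled on civil-register certificates.
records = [
    {"id": 1, "name": "Johannes van den Berg", "event": "birth", "year": 1850},
    {"id": 2, "name": "Johannes van den Bergh", "event": "marriage", "year": 1875},
    {"id": 3, "name": "Maria Jansen", "event": "birth", "year": 1852},
    {"id": 4, "name": "Johanes van den Berg", "event": "death", "year": 1910},
]

clusters = query_driven_er(records, "Johannes van den Berg")
for cluster in clusters:
    print([(r["id"], r["event"], r["year"]) for r in cluster])
```

In this sketch the unrelated record is never compared against the others, which is where the saving over a full warehouse-wide ER pass would come from. A realistic system would also need to weigh dates, places and family relations rather than names alone.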
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Ranjbar-Sahraei, B., Efremova, J., Rahmani, H., Calders, T., Tuyls, K., Weiss, G. (2015). HiDER: Query-Driven Entity Resolution for Historical Data. In: Bifet, A., et al. Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2015. Lecture Notes in Computer Science, vol 9286. Springer, Cham. https://doi.org/10.1007/978-3-319-23461-8_30
DOI: https://doi.org/10.1007/978-3-319-23461-8_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23460-1
Online ISBN: 978-3-319-23461-8
eBook Packages: Computer Science, Computer Science (R0)