Abstract
Data integration systems aim to facilitate the management of heterogeneous data sources. When huge amounts of data have to be integrated, resorting to human validation is not possible. However, fully automatic integration methods may give rise to decision errors and to approximate results. Hence, such systems need explanation modules to enhance user confidence in the integrated data. In this paper, we focus on reference reconciliation methods, which compare data descriptions to decide whether they refer to the same real-world entity. Numerical reference reconciliation methods that are global and ontology-driven exploit semantic knowledge to model the dependencies between similarities and to propagate them to other references. In order to explain the similarity scores and the reconciliation decisions obtained by such methods, we have developed an explanation model based on Coloured Petri Nets, which provides graphical and comprehensive explanations to the user. This model makes it possible to show the relevance of a decision and to diagnose possible anomalies in the domain knowledge or in the similarity measures that are used.
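To make the idea of propagating similarities between dependent reference pairs more concrete, the following is a minimal sketch and not the method of the chapter. It assumes a hypothetical function `propagate`, an illustrative mixing parameter `weight`, and a simple averaging rule; the reference identifiers and scores are invented for the example.

```python
def propagate(base, deps, weight=0.5, tol=1e-6, max_iter=100):
    """Iterate similarity scores of reference pairs to a fixed point.

    base: dict mapping a reference pair to its attribute-level similarity in [0, 1].
    deps: dict mapping a pair to the list of pairs whose similarity influences it.
    weight: share of the score taken from dependent pairs (an assumption of this sketch).
    """
    scores = dict(base)
    for _ in range(max_iter):
        new_scores = {}
        for pair, attribute_sim in base.items():
            neighbours = deps.get(pair, [])
            if neighbours:
                # Mix the pair's own similarity with the mean similarity
                # of the pairs it depends on.
                mean = sum(scores[n] for n in neighbours) / len(neighbours)
                new_scores[pair] = min(1.0, (1 - weight) * attribute_sim + weight * mean)
            else:
                new_scores[pair] = attribute_sim
        delta = max(abs(new_scores[p] - scores[p]) for p in base)
        scores = new_scores
        if delta < tol:
            break  # scores no longer change: fixed point reached
    return scores


# Hypothetical example: two museum descriptions located in two city descriptions.
base = {("museum1", "museum2"): 0.6, ("city1", "city2"): 0.9}
deps = {("museum1", "museum2"): [("city1", "city2")]}
scores = propagate(base, deps)
```

In this toy setting the high similarity of the two city references raises the score of the museum pair above its attribute-level value. An explanation module would have to expose exactly this kind of dependency to the user.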
© 2012 Springer Berlin Heidelberg
About this chapter
Cite this chapter
Gahbiche, S., Pernelle, N., Saïs, F. (2012). Explaining Reference Reconciliation Decisions: A Coloured Petri Nets Based Approach. In: Guillet, F., Ritschard, G., Zighed, D. (eds) Advances in Knowledge Discovery and Management. Studies in Computational Intelligence, vol 398. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25838-1_4
DOI: https://doi.org/10.1007/978-3-642-25838-1_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25837-4
Online ISBN: 978-3-642-25838-1
eBook Packages: Engineering, Engineering (R0)