Abstract
Schema mapping, which provides users with a unified view over multiple sources, is essential for managing schema heterogeneity among those sources. Schema mapping can be performed using either a machine learning approach or a knowledge engineering approach. The machine learning approach needs a training data set to build models, but training data are usually very difficult to obtain for large datasets. In addition, it is very difficult to modify a learned model using human knowledge. The knowledge engineering approach encodes human knowledge directly, so a knowledge base can be constructed with limited data, but it requires time-consuming knowledge acquisition. This research proposes an incremental schema mapping method that employs Ripple-Down Rules (RDR) with censored production rules (CPR). Our experimental results show that the RDR approach achieves performance comparable to that of the machine learning approaches, and that the RDR knowledge base can be expanded incrementally as the number of classified cases increases.
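The abstract does not give the rule representation, so the following is only a minimal sketch of how a single-classification RDR tree with an "unless" (censor) clause might be evaluated and extended one case at a time. The feature names (`name_sim`, `same_type`, `instance_sim`), the thresholds, and the `Rule`, `classify`, and `add_rule` helpers are all hypothetical illustrations, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Case = dict  # hypothetical: features describing one candidate attribute pair


@dataclass
class Rule:
    condition: Callable[[Case], bool]
    conclusion: str
    censor: Optional[Callable[[Case], bool]] = None  # assumed "UNLESS" clause
    except_rule: Optional["Rule"] = None  # tried next when this rule fires
    else_rule: Optional["Rule"] = None    # tried next when it does not fire

    def fires(self, case: Case) -> bool:
        # A rule fires when its condition holds and its censor (if any) does not.
        if not self.condition(case):
            return False
        return not (self.censor is not None and self.censor(case))


def classify(root: Rule, case: Case, default: str = "no-match") -> str:
    # Standard RDR evaluation: the last rule that fired gives the conclusion.
    conclusion, rule = default, root
    while rule is not None:
        if rule.fires(case):
            conclusion, rule = rule.conclusion, rule.except_rule
        else:
            rule = rule.else_rule
    return conclusion


def add_rule(root: Rule, case: Case, new_rule: Rule) -> None:
    # Attach new_rule at the point where evaluation of `case` stopped.
    rule = root
    while True:
        branch = "except_rule" if rule.fires(case) else "else_rule"
        nxt = getattr(rule, branch)
        if nxt is None:
            setattr(rule, branch, new_rule)
            return
        rule = nxt


# Hypothetical initial rule: similar names imply a match unless the types differ.
root = Rule(
    condition=lambda c: c.get("name_sim", 0.0) >= 0.8,
    conclusion="match",
    censor=lambda c: not c.get("same_type", True),
)

case_a = {"name_sim": 0.9, "same_type": True}
case_b = {"name_sim": 0.9, "same_type": False}
case_c = {"name_sim": 0.5, "same_type": True, "instance_sim": 0.95}

before = classify(root, case_c)

# An expert judges case_c misclassified and adds a rule for it incrementally.
add_rule(root, case_c, Rule(
    condition=lambda c: c.get("instance_sim", 0.0) >= 0.9,
    conclusion="match",
))
after = classify(root, case_c)
```

In this sketch the existing rule is never edited; a correction is attached only on the path that the misclassified case followed, which is one way to read the abstract's claim that the knowledge base grows incrementally as more cases are classified.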
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Anam, S., Kim, Y.S., Liu, Q. (2014). Incremental Schema Mapping. In: Kim, Y.S., Kang, B.H., Richards, D. (eds) Knowledge Management and Acquisition for Smart Systems and Services. PKAW 2014. Lecture Notes in Computer Science, vol 8863. Springer, Cham. https://doi.org/10.1007/978-3-319-13332-4_7
DOI: https://doi.org/10.1007/978-3-319-13332-4_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13331-7
Online ISBN: 978-3-319-13332-4
eBook Packages: Computer Science (R0)