Abstract
Relational classifiers use relations between objects to predict class values. In some cases the relations are given explicitly. In other cases the dataset contains only implicit relations, e.g. relations hidden in noisy attribute values. Before relational classifiers can be applied to such data, the relations have to be extracted. Manual extraction by a domain expert is expensive and time-consuming. In this paper we show how the extraction of relations from datasets with noisy attribute values can be learned. Our method, LRE, uses a regression model to learn and predict weighted binary relations. First, we show that LRE is able to extract both equivalence relations and non-constrained relations. Second, we show that relational classifiers using relations automatically extracted by LRE achieve classification quality comparable to that of classifiers using manually labeled relations.
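The idea of learning weighted binary relations with a regression model can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the single token-overlap feature, the ordinary least-squares fit, the toy training pairs, the threshold, and the names `jaccard`, `fit_line`, `relation_weight` and `extract_relations`. The regression model and features actually used by LRE may differ.

```python
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Token-set overlap of two noisy attribute values (illustrative feature)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def fit_line(xs, ys):
    """Closed-form least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def relation_weight(model, a: str, b: str) -> float:
    """Predicted relation weight for a pair, clipped to [0, 1]."""
    intercept, slope = model
    return min(1.0, max(0.0, intercept + slope * jaccard(a, b)))


def extract_relations(model, records, threshold=0.5):
    """Return (i, j, weight) for every pair whose weight reaches the threshold."""
    relations = []
    for (i, a), (j, b) in combinations(enumerate(records), 2):
        w = relation_weight(model, a, b)
        if w >= threshold:
            relations.append((i, j, w))
    return relations


# Hypothetical labeled pairs: 1.0 = related, 0.0 = unrelated.
TRAINING_PAIRS = [
    ("john smith data mining", "j smith data mining", 1.0),
    ("learning relational trees", "learning relational probability trees", 1.0),
    ("a theory for record linkage", "theory for record linkage", 1.0),
    ("john smith data mining", "theory for record linkage", 0.0),
    ("learning relational trees", "link based text classification", 0.0),
    ("theory for record linkage", "adaptive duplicate detection", 0.0),
]

model = fit_line(
    [jaccard(a, b) for a, b, _ in TRAINING_PAIRS],
    [label for _, _, label in TRAINING_PAIRS],
)

records = [
    "adaptive duplicate detection",
    "adaptive duplicate detection methods",
    "string distance metrics",
]
relations = extract_relations(model, records)
```

The extracted `(i, j, weight)` triples could then be passed to a relational classifier as weighted links. A realistic setup would use several similarity features per attribute and a stronger regression model than a one-variable line.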
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rendle, S., Preisach, C., Schmidt-Thieme, L. (2009). Learning to Extract Relations for Relational Classification. In: Theeramunkong, T., Kijsirikul, B., Cercone, N., Ho, TB. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2009. Lecture Notes in Computer Science, vol 5476. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01307-2_114
DOI: https://doi.org/10.1007/978-3-642-01307-2_114
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01306-5
Online ISBN: 978-3-642-01307-2
eBook Packages: Computer Science, Computer Science (R0)