Decision Tree Learner in the Presence of Domain Knowledge

Vieira, João; Antunes, Cláudia

doi:10.1007/978-3-662-45495-4_4

Part of the book series: Communications in Computer and Information Science (CCIS, volume 480)

Included in the following conference series:

Chinese Semantic Web and Web Science Conference

Abstract

In the era of the semantic web and big data, the need for machine learning algorithms able to exploit domain ontologies is undeniable. In the past, two divergent research lines were followed, but with background knowledge represented through domain ontologies, it is now possible to develop new ontology-driven learning algorithms. In this paper, we propose a method that adds domain knowledge, represented in OWL 2, to a purely statistical decision tree learner. The new algorithm tries to find the best attributes to test in the decision tree, considering both existing attributes and new ones that can be inferred from the ontology. By exploring the set of axioms in the ontology, the algorithm is able to determine at run time the best level of abstraction for each attribute, infer new attributes, and decide which ones to use in the tree. Our experimental results show that our method produces smaller and more accurate trees even on data sets where all features are concrete, and especially on those where some features are only specified at higher levels of abstraction. We also show that our method performs substantially better than traditional decision tree classifiers when only a small number of labeled instances are available.
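
The idea of choosing a level of abstraction for an attribute can be illustrated with a small sketch. This is not the algorithm from the paper, which works on OWL 2 axioms; it is a minimal example under simplifying assumptions: the ontology is reduced to a hypothetical child-to-parent dictionary (`parent_of`), each candidate level is scored with the gain ratio used by C4.5, and all function names and the toy data are invented for illustration.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((c / total) * log2(c / total) for c in Counter(items).values())


def gain_ratio(values, labels):
    """Information gain of splitting on `values`, divided by the split entropy."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = entropy(values)
    # A split with a single branch carries no information.
    return gain / split_info if split_info > 0 else 0.0


def generalize(value, parent_of, steps):
    """Climb `steps` levels up the taxonomy; the root maps to itself."""
    for _ in range(steps):
        value = parent_of.get(value, value)
    return value


def best_abstraction_level(values, labels, parent_of, max_steps):
    """Return (level, score) for the level with the highest gain ratio.

    Level 0 is the attribute as recorded; ties keep the more specific level.
    """
    best_level, best_score = 0, gain_ratio(values, labels)
    for steps in range(1, max_steps + 1):
        lifted = [generalize(v, parent_of, steps) for v in values]
        score = gain_ratio(lifted, labels)
        if score > best_score:
            best_level, best_score = steps, score
    return best_level, best_score


# Hypothetical taxonomy (assumption): concrete shades -> colour family -> root.
parent_of = {
    "scarlet": "red", "crimson": "red",
    "navy": "blue", "sky": "blue",
    "red": "colour", "blue": "colour",
}
values = ["scarlet", "crimson", "navy", "sky"] * 2
labels = ["poisonous", "poisonous", "edible", "edible"] * 2

level, score = best_abstraction_level(values, labels, parent_of, max_steps=2)
print(level, score)
```

In this toy data the colour family separates the classes as well as the concrete shade does, with half as many branches, so the intermediate level gets the highest gain ratio. A reasoner-backed implementation would obtain the generalizations from the ontology rather than from a hand-written dictionary.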

Author information

Authors and Affiliations

Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal
João Vieira & Cláudia Antunes

Corresponding author

Correspondence to Cláudia Antunes.

Editor information

Editors and Affiliations

Peking University, Beijing, China
Dongyan Zhao
Guangdong University of Foreign Studies, Guangzhou, China
Jianfeng Du
East China University, Shanghai, China
Haofen Wang
Southeast University, Nanjing, China
Peng Wang
Wuhan University, Wuhan, China
Donghong Ji
The University of Aberdeen, Aberdeen, United Kingdom
Jeff Z. Pan

About this paper

Cite this paper

Vieira, J., Antunes, C. (2014). Decision Tree Learner in the Presence of Domain Knowledge. In: Zhao, D., Du, J., Wang, H., Wang, P., Ji, D., Pan, J. (eds) The Semantic Web and Web Science. CSWS 2014. Communications in Computer and Information Science, vol 480. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45495-4_4

DOI: https://doi.org/10.1007/978-3-662-45495-4_4
Published: 18 November 2014
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-45494-7
Online ISBN: 978-3-662-45495-4
eBook Packages: Computer Science, Computer Science (R0)