Using Local Rules in Random Forests of Decision Trees

  • Conference paper
  • Conference: Future Data and Security Engineering (FDSE 2015)
  • Part of the book series: Lecture Notes in Computer Science (LNCS, volume 9446)

Abstract

We propose to use local labeling rules in random forests of decision trees to classify data more effectively. The classical random forest algorithm labels data by a majority vote at the terminal nodes of its decision trees, which can degrade classification performance. Our investigation replaces these majority-vote rules with local ones, namely support vector machines, to improve the prediction accuracy of decision forests. Numerical results on 8 datasets from the UCI repository and 2 handwritten letter recognition benchmarks show that our proposal is more accurate than the classical random forest algorithm.
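
To make the abstract's idea concrete, here is a minimal sketch (not the paper's implementation) of a bagged ensemble of decision trees whose terminal nodes carry a local rule: a support vector machine when the node still contains several classes, the node's class label otherwise. It assumes scikit-learn and NumPy are available; the class name LocalRuleForest, its parameters, and the choice of a linear-kernel SVC are illustrative only.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


class LocalRuleForest:
    """Bagged decision trees whose terminal nodes use a local rule
    (a linear SVM for impure leaves, the leaf's class label otherwise)."""

    def __init__(self, n_trees=50, leaf_size=20, random_state=0):
        self.n_trees = n_trees
        self.leaf_size = leaf_size            # minimum examples per terminal node
        self.rng = np.random.RandomState(random_state)
        self.ensemble = []                    # list of (tree, {leaf_id: rule})

    def fit(self, X, y):
        # Assumes X is a NumPy array and y holds integer class labels 0..C-1.
        n = len(X)
        for _ in range(self.n_trees):
            boot = self.rng.randint(0, n, n)  # bootstrap sample, as in bagging
            tree = DecisionTreeClassifier(max_features="sqrt",
                                          min_samples_leaf=self.leaf_size,
                                          random_state=self.rng.randint(1 << 30))
            tree.fit(X[boot], y[boot])
            leaves = tree.apply(X[boot])      # terminal node reached by each example
            rules = {}
            for leaf in np.unique(leaves):
                idx = np.where(leaves == leaf)[0]
                yl = y[boot][idx]
                if len(np.unique(yl)) > 1:    # impure leaf: fit a local SVM
                    rules[leaf] = SVC(kernel="linear").fit(X[boot][idx], yl)
                else:                         # pure leaf: keep its single class label
                    rules[leaf] = int(yl[0])
            self.ensemble.append((tree, rules))
        return self

    def predict(self, X):
        votes = np.zeros((len(X), self.n_trees), dtype=int)
        for t, (tree, rules) in enumerate(self.ensemble):
            for i, leaf in enumerate(tree.apply(X)):
                rule = rules[leaf]
                votes[i, t] = rule.predict(X[i:i + 1])[0] if isinstance(rule, SVC) else rule
        return np.array([np.bincount(row).argmax() for row in votes])
```

Each tree still contributes one vote per test example, but the vote comes from the local rule stored at the terminal node the example reaches rather than from that node's majority class; the forest then aggregates the per-tree votes as usual.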

Notes

  1. Two classifiers are diverse if they make different errors on new datapoints [16].

  2. We varied the number of decision trees from 20 to 500 to find the best experimental results.

  3. The time complexity of learning an SVM model is in the order of \(n^2\), where \(n\) is the number of training examples [38] (see the cost comparison sketched below).

References

  1. Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.: Classification and Regression Trees. Wadsworth International, Belmont (1984)

  2. Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo (1993)

  3. Wu, X., Kumar, V., Quinlan, J.R., Ghosh, J., Yang, Q., Motoda, H., McLachlan, G.J., Ng, A., Liu, B., Yu, P.S., Zhou, Z.H., Steinbach, M., Hand, D.J., Steinberg, D.: Top 10 algorithms in data mining. Knowl. Inf. Syst. 14(1), 1–37 (2007)

  4. Rokach, L., Maimon, O.Z.: Data Mining with Decision Trees: Theory and Applications, vol. 69. World Scientific Pub Co Inc, Singapore (2008)

  5. Cutler, A., Cutler, D.R., Stevens, J.R.: Tree-based methods. In: Li, X., Xu, R. (eds.) High-Dimensional Data Analysis in Cancer Research. Applied Bioinformatics and Biostatistics in Cancer Research, pp. 1–19. Springer, New York (2009)

  6. Berry, M.J., Linoff, G.: Data Mining Techniques: For Marketing, Sales, and Customer Support. Wiley, New York (2011)

  7. Vapnik, V.: The Nature of Statistical Learning Theory. Springer, New York (1995)

  8. Michie, D., Spiegelhalter, D.J., Taylor, C.C. (eds.): Machine Learning, Neural and Statistical Classification. Ellis Horwood, New York (1994)

  9. Bishop, C.M.: Pattern Recognition and Machine Learning. Information Science and Statistics. Springer, New York (2006)

  10. Valiant, L.: A theory of the learnable. Commun. ACM 27(11), 1134–1142 (1984)

  11. Kearns, M., Valiant, L.G.: Learning boolean formulae or finite automata is as hard as factoring. Technical report TR 14–88, Harvard University Aiken Computation Laboratory (1988)

  12. Kearns, M., Valiant, L.: Cryptographic limitations on learning boolean formulae and finite automata. J. ACM 41(1), 67–95 (1994)

  13. Geman, S., Bienenstock, E., Doursat, R.: Neural networks and the bias/variance dilemma. Neural Comput. 4(1), 1–58 (1992)

  14. Breiman, L.: Bagging predictors. Mach. Learn. 24(2), 123–140 (1996)

  15. Domingos, P.: A unified bias-variance decomposition. In: Proceedings of 17th International Conference on Machine Learning, pp. 231–238. Morgan Kaufmann, Stanford, CA (2000)

  16. Dietterich, T.G.: Ensemble methods in machine learning. In: Kittler, J., Roli, F. (eds.) MCS 2000. LNCS, vol. 1857, pp. 1–15. Springer, Heidelberg (2000)

  17. Freund, Y., Schapire, R.: A decision-theoretic generalization of on-line learning and an application to boosting. In: Computational Learning Theory: Proceedings of the Second European Conference, pp. 23–37 (1995)

  18. Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001)

  19. Asuncion, A., Newman, D.: UCI repository of machine learning databases (2007)

  20. van der Maaten, L.: A new benchmark dataset for handwritten character recognition (2009). http://homepage.tudelft.nl/19j49/Publications_files/characters.zip

  21. LeCun, Y., Boser, B., Denker, J., Henderson, D., Howard, R., Hubbard, W., Jackel, L.: Backpropagation applied to handwritten zip code recognition. Neural Comput. 1(4), 541–551 (1989)

  22. Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer Series in Statistics. Springer, New York (2009)

  23. Breiman, L.: Arcing classifiers. Ann. Stat. 26(3), 801–849 (1998)

  24. Dietterich, T.G.: An experimental comparison of three methods for constructing ensembles of decision trees: bagging, boosting, and randomization. Mach. Learn. 40(2), 139–157 (2000)

  25. Ho, T.K.: Random decision forests. In: Proceedings of the Third International Conference on Document Analysis and Recognition, pp. 278–282 (1995)

  26. Amit, Y., Geman, D.: Shape quantization and recognition with randomized trees. Neural Comput. 9(7), 1545–1588 (1997)

  27. Vapnik, V.: Principles of risk minimization for learning theory. In: Moody, J.E., Hanson, S.J., Lippmann, R.P. (eds.) Advances in Neural Information Processing Systems, vol. 4, pp. 831–838. Morgan Kaufmann, San Mateo (1991)

  28. Cristianini, N., Shawe-Taylor, J.: An Introduction to Support Vector Machines: And Other Kernel-based Learning Methods. Cambridge University Press, New York (2000)

  29. Weston, J., Watkins, C.: Support vector machines for multi-class pattern recognition. In: Proceedings of the Seventh European Symposium on Artificial Neural Networks, pp. 219–224 (1999)

  30. Guermeur, Y.: Svm multiclasses, théorie et applications (2007)

  31. Kreßel, U.: Pairwise classification and support vector machines. In: Smola, A., et al. (eds.) Advances in Kernel Methods: Support Vector Learning, pp. 255–268. MIT Press, Cambridge (1999)

  32. Platt, J., Cristianini, N., Shawe-Taylor, J.: Large margin dags for multiclass classification. In: Solla, S.A., Leen, T.K., Müller, K. (eds.) Advances in Neural Information Processing Systems, vol. 12, pp. 547–553. MIT Press, Cambridge (2000)

  33. Vural, V., Dy, J.: A hierarchical method for multi-class support vector machines. In: Proceedings of the Twenty-First International Conference on Machine Learning, pp. 831–838 (2004)

  34. Benabdeslem, K., Bennani, Y.: Dendogram-based svm for multi-class classification. J. Comput. Inf. Technol. 14(4), 283–289 (2006)

  35. Fan, R.E., Chang, K.W., Hsieh, C.J., Wang, X.R., Lin, C.J.: LIBLINEAR: a library for large linear classification. J. Mach. Learn. Res. 9(4), 1871–1874 (2008)

  36. Chang, C.C., Lin, C.J.: LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2(27), 1–27 (2011)

  37. Vapnik, V., Bottou, L.: Local algorithms for pattern recognition and dependencies estimation. Neural Comput. 5(6), 893–909 (1993)

  38. Platt, J.: Fast training of support vector machines using sequential minimal optimization. In: Schölkopf, B., Burges, C., Smola, A. (eds.) Advances in Kernel Methods Support Vector Learning, pp. 185–208. MIT Press, Cambridge (1999)

  39. Lin, C.: A practical guide to support vector classification (2003)

  40. Murthy, S., Kasif, S., Salzberg, S., Beigel, R.: OC1: randomized induction of oblique decision trees. In: Proceedings of the Eleventh National Conference on Artificial Intelligence, pp. 322–327 (1993)

  41. Wu, W., Bennett, K., Cristianini, N., Shawe-Taylor, J.: Large margin trees for induction and transduction. In: Proceedings of the Sixth International Conference on Machine Learning, pp. 474–483 (1999)

  42. Rokach, L., Maimon, O.: Top-down induction of decision trees classifiers - a survey. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 35(4), 476–487 (2005)

  43. Bennett, K.P., Mangasarian, O.L.: Multicategory discrimination via linear programming. Optim. Meth. Softw. 3, 27–39 (1994)

  44. Loh, W.Y., Vanichsetakul, N.: Tree-structured classification via generalized discriminant analysis (with discussion). J. Am. Stat. Assoc. 83, 715–728 (1988)

  45. Yildiz, O., Alpaydin, E.: Linear discriminant trees. Int. J. Pattern Recogn. Artif. Intell. 19(3), 323–353 (2005)

  46. Cutler, A., Guohua, Z.: PERT - perfect random tree ensembles. Comput. Sci. Stat. 33, 490–497 (2001)

  47. Geurts, P., Ernst, D., Wehenkel, L.: Extremely randomized trees. Mach. Learn. 63(1), 3–42 (2006)

  48. Do, T.-N., Lenca, P., Lallich, S., Pham, N.-K.: Classifying very-high-dimensional data with random forests of oblique decision trees. In: Guillet, F., Ritschard, G., Zighed, D.A., Briand, H. (eds.) Advances in Knowledge Discovery and Management. SCI, vol. 292, pp. 39–55. Springer, Heidelberg (2010)

  49. Robnik-Sikonja, M.: Improving random forests. In: Proceedings of the Fifth European Conference on Machine Learning, pp. 359–370 (2004)

  50. Friedman, J.H., Kohavi, R., Yun, Y.: Lazy decision trees. In: Proceedings of the Thirteenth National Conference on Artificial Intelligence and Eighth Innovative Applications of Artificial Intelligence Conference, AAAI 1996, IAAI 1996, vol. 1, pp. 717–724, Portland, Oregon, 4–8 Aug 1996

  51. Kohavi, R., Kunz, C.: Option decision trees with majority votes. In: Proceedings of the Fourteenth International Conference on Machine Learning (ICML 1997), pp. 161–169, Nashville, Tennessee, USA, 8–12 Jul 1997

  52. Marcellin, S., Zighed, D., Ritschard, G.: An asymmetric entropy measure for decision trees. In: IPMU 2006, Paris, France, pp. 1292–1299 (2006)

  53. Lenca, P., Lallich, S., Do, T.-N., Pham, N.-K.: A comparison of different off-centered entropies to deal with class imbalance for decision trees. In: Washio, T., Suzuki, E., Ting, K.M., Inokuchi, A. (eds.) PAKDD 2008. LNCS (LNAI), vol. 5012, pp. 634–643. Springer, Heidelberg (2008)

  54. Do, T., Lenca, P., Lallich, S.: Enhancing network intrusion classification through the kolmogorov-smirnov splitting criterion. In: ICTACS 2010, pp. 50–61, Vietnam (2010)

  55. Kohavi, R.: Scaling up the accuracy of naive-bayes classifiers: a decision-tree hybrid. In: Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96), pp. 202–207, Portland, Oregon, USA (1996)

  56. Seewald, A.K., Petrak, J., Widmer, G.: Hybrid decision tree learners with alternative leaf classifiers: an empirical study. In: International Florida Artificial Intelligence Research Society Conference, pp. 407–411 (2000)

  57. Pham, N.K., Do, T.N., Lenca, P., Lallich, S.: Using local node information in decision trees: coupling a local decision rule with an off-centered entropy. In: International Conference on Data Mining, pp. 117–123, Las Vegas, Nevada, USA, CSREA Press (2008)

  58. Chang, F., Guo, C.Y., Lin, X.R., Lu, C.J.: Tree decomposition for large-scale SVM problems. J. Mach. Learn. Res. 11, 2935–2972 (2010)

  59. Ritschard, G., Marcellin, S., Zighed, D.A.: Arbre de décision pour données déséquilibrées : sur la complémentarité de l’intensité d’implication et de l’entropie décentrée. In: Analyse Statistique Implicative - Une méthode d’analyse de données pour la recherche de causalités, pp. 207–222 (2009)

  60. Lerman, I., Gras, R., Rostam, H.: Elaboration et évaluation d’un indice d’implication pour données binaires. Math. Sci. Hum. 74, 5–35 (1981)

  61. Geurts, P., Wehenkel, L., d’Alché Buc, F.: Kernelizing the output of tree-based methods. In: Cohen, W.W., Moore, A., (eds.) Proceedings of the Twenty-Third International Conference (ICML 2006), Pittsburgh, Pennsylvania, USA, 25–29 Jun 2006, ACM International Conference Proceeding Series, vol. 148, pp. 345–352, ACM (2006)

  62. Jacobs, R.A., Jordan, M.I., Nowlan, S.J., Hinton, G.E.: Adaptive mixtures of local experts. Neural Comput. 3(1), 79–87 (1991)

  63. Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the em algorithm. J. Roy. Stat. Soc. B 39(1), 1–38 (1977)

  64. Bottou, L., Vapnik, V.: Local learning algorithms. Neural Comput. 4(6), 888–900 (1992)

  65. Vincent, P., Bengio, Y.: K-local hyperplane and convex distance nearest neighbor algorithms. In: Dietterich, T.G., Becker, S., Ghahramani, Z. (eds.) Advances in Neural Information Processing Systems, vol. 14, pp. 985–992. The MIT Press, Cambridge (2001)

  66. Boley, D., Cao, D.: Training support vector machines using adaptive clustering. In: Berry, M.W., Dayal, U., Kamath, C., Skillicorn, D.B. (eds.) Proceedings of the Fourth SIAM International Conference on Data Mining, pp. 126–137, Lake Buena Vista, Florida, USA, 22–24 Apr 2004, SIAM (2004)

  67. Zhang, H., Berg, A., Maire, M., Malik, J.: SVM-KNN: discriminative nearest neighbor classification for visual category recognition. In: 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 2126–2136 (2006)

  68. Yang, T., Kecman, V.: Adaptive local hyperplane classification. Neurocomputing 71(13–15), 3001–3004 (2008)

  69. Segata, N., Blanzieri, E.: Fast and scalable local kernel machines. J. Mach. Learn. Res. 11, 1883–1926 (2010)

  70. Cheng, H., Tan, P.N., Jin, R.: Efficient algorithm for localized support vector machine. IEEE Trans. Knowl. Data Eng. 22(4), 537–549 (2010)

  71. Kecman, V., Brooks, J.: Locally linear support vector machines and other local models. In: The 2010 International Joint Conference on Neural Networks (IJCNN), pp. 1–6 (2010)

  72. Ladicky, L., Torr, P.H.S.: Locally linear support vector machines. In: Getoor, L., Scheffer, T., (eds.) Proceedings of the 28th International Conference on Machine Learning, ICML 2011, pp. 985–992, Bellevue, Washington, USA, Jun 28–Jul 2 2011, Omnipress (2011)

  73. Gu, Q., Han, J.: Clustered support vector machines. In: Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics, AISTATS 2013, Scottsdale, AZ, USA, Apr 29–May 1 2013, JMLR Proceedings, vol. 31, pp. 307–315 (2013)

Author information

Correspondence to Thanh-Nghi Do.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Do, TN. (2015). Using Local Rules in Random Forests of Decision Trees. In: Dang, T., Wagner, R., Küng, J., Thoai, N., Takizawa, M., Neuhold, E. (eds) Future Data and Security Engineering. FDSE 2015. Lecture Notes in Computer Science, vol. 9446. Springer, Cham. https://doi.org/10.1007/978-3-319-26135-5_3

  • DOI: https://doi.org/10.1007/978-3-319-26135-5_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-26134-8

  • Online ISBN: 978-3-319-26135-5
