
Case study of inaccuracies in the granulation of decision trees


Abstract

Cybernetics studies information processes in the context of interaction with physical systems. Because such information is sometimes vague and exhibits complex interactions, it can only be discerned using approximate representations. Machine learning provides solutions that build approximate models of information, and decision trees are one of its main tools. However, decision trees are susceptible to information overload and can become overly complex when large amounts of data are fed into them. Granulation of the decision tree remedies this problem by retaining only the essential structure of the tree, although this simplification can decrease its utility. To evaluate the relationship between granulation and decision tree complexity, data uncertainty and prediction accuracy, the deficiencies received by nursing homes during annual inspections were taken as a case study. Using rough sets, three forms of granulation were performed: (1) attribute grouping, (2) removing insignificant attributes and (3) removing uncertain records. Attribute grouping significantly reduces tree complexity without any strong effect on data consistency or accuracy. On the other hand, removing insignificant attributes decreases data consistency and tree complexity while increasing the prediction error. Finally, decreasing the uncertainty of the dataset increases accuracy and has no impact on tree complexity.
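As an illustration of the three granulation steps named above, the sketch below applies them to a toy table of hypothetical inspection records. It is not the implementation used in the paper: the attribute names (staffing, hygiene, deficiency), the binning threshold, and the use of pandas and scikit-learn are assumptions made for the example; attribute insignificance is approximated by a simple rough-set dependency check (comparing positive-region sizes), and tree complexity is measured by the number of tree nodes.

```python
# Minimal sketch (not the paper's code) of the three granulation steps,
# applied to a toy table of hypothetical inspection records.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Toy dataset: two condition attributes and one decision attribute.
# Attribute names and values are invented for illustration only.
df = pd.DataFrame({
    "staffing":   [1, 2, 2, 3, 3, 4, 4, 5, 3, 1],
    "hygiene":    [2, 2, 3, 3, 4, 4, 5, 5, 3, 1],
    "deficiency": ["high", "high", "high", "low", "low",
                   "low", "low", "low", "high", "high"],
})
conds, dec = ["staffing", "hygiene"], "deficiency"

# (1) Attribute grouping: coarsen attribute values into larger granules,
# here by binning the 1-5 scale into {"low", "high"}.
grouped = df.copy()
for col in conds:
    grouped[col] = grouped[col].apply(lambda v: "low" if v < 3 else "high")

# Size of the rough-set positive region: number of records whose
# condition-attribute combination maps to a single decision value.
def positive_region_size(data, condition_cols):
    consistent = data.groupby(condition_cols)[dec].nunique() == 1
    return int(data.groupby(condition_cols).size()[consistent].sum())

# (2) Removing insignificant attributes: an attribute is dispensable if
# dropping it leaves the positive region unchanged.
def insignificant_attributes(data, condition_cols):
    full = positive_region_size(data, condition_cols)
    return [c for c in condition_cols
            if positive_region_size(data, [x for x in condition_cols if x != c]) == full]

# (3) Removing uncertain records: drop boundary-region records, i.e. records
# whose condition attributes also occur with a different decision.
def remove_uncertain(data, condition_cols):
    return data[data.groupby(condition_cols)[dec].transform("nunique") == 1]

# Fit a decision tree and report its size and resubstitution accuracy.
def evaluate(data, condition_cols):
    X, y = pd.get_dummies(data[condition_cols]), data[dec]
    clf = DecisionTreeClassifier(random_state=0).fit(X, y)
    return clf.tree_.node_count, clf.score(X, y)

print("raw data:     nodes=%d acc=%.2f" % evaluate(df, conds))
print("grouped:      nodes=%d acc=%.2f" % evaluate(grouped, conds))
print("certain only: nodes=%d acc=%.2f" % evaluate(remove_uncertain(df, conds), conds))
print("dispensable attributes:", insignificant_attributes(df, conds))  # may be empty for this toy table
```

Comparing the node counts and accuracies before and after each step gives a rough, small-scale analogue of the trade-offs the case study examines; the figures from such a toy table are not expected to reproduce the paper's results.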



Author information


Correspondence to Andrzej Bargiela.

About this article

Cite this article

Badr, S., Bargiela, A. Case study of inaccuracies in the granulation of decision trees. Soft Comput 15, 1129–1136 (2011). https://doi.org/10.1007/s00500-010-0587-x
