A Novel Prototype Decision Tree Method Using Sampling Strategy

  • Conference paper
Computational Science and Its Applications -- ICCSA 2015 (ICCSA 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9155)


Abstract

Data mining is a popular knowledge discovery technique, and decision trees are among its simple and powerful decision-making models. One limitation of decision trees lies in the data sources they handle: if the data given as input to a decision tree are imbalanced, the efficiency of the tree drops drastically. To address this, we propose a novel method based on a sampling strategy: a decision tree structure that mimics human learning by balancing the data source to some extent. Extensive experiments, using the C4.5 decision tree as the base classifier, show that the performance measures of our method are comparable to those of state-of-the-art methods.
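The abstract does not specify the sampling strategy, so the following is only a minimal sketch of one common way to balance a data source "to some extent" before tree induction: random undersampling of the majority class down to a target minority-to-majority ratio. The function name `rebalance`, the `ratio` parameter, and the choice of undersampling are assumptions made for illustration and are not taken from the paper.

```python
import random
from collections import Counter


def rebalance(rows, labels, ratio=0.5, seed=0):
    """Randomly undersample the majority class (hypothetical helper).

    Keeps every non-majority example and at most
    int(minority_size / ratio) majority examples, so that
    minority/majority is at least `ratio` afterwards.
    Assumes 0 < ratio <= 1.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    majority = max(counts, key=counts.get)
    minority_size = min(counts.values())

    # Largest majority size that still satisfies the target ratio.
    keep = min(counts[majority], int(minority_size / ratio))

    majority_idx = [i for i, y in enumerate(labels) if y == majority]
    kept = set(rng.sample(majority_idx, keep))

    # Preserve the original order of the surviving examples.
    idx = [i for i, y in enumerate(labels) if y != majority or i in kept]
    return [rows[i] for i in idx], [labels[i] for i in idx]


# Illustrative data: 90 majority examples and 10 minority examples.
rows = list(range(100))
labels = ["neg"] * 90 + ["pos"] * 10
balanced_rows, balanced_labels = rebalance(rows, labels, ratio=0.5)
```

The partially balanced output could then be passed to any decision tree learner, such as the C4.5 base classifier mentioned in the abstract. A ratio below 1 leaves some imbalance in place, which is one possible reading of balancing "to some extent".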



Author information

Corresponding author

Correspondence to Tai-hoon Kim.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Battula, B.P., Bhattacharyya, D., Prasad, C.V.P.R., Kim, Th. (2015). A Novel Prototype Decision Tree Method Using Sampling Strategy. In: Gervasi, O., et al. Computational Science and Its Applications -- ICCSA 2015. ICCSA 2015. Lecture Notes in Computer Science, vol 9155. Springer, Cham. https://doi.org/10.1007/978-3-319-21404-7_7

  • DOI: https://doi.org/10.1007/978-3-319-21404-7_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-21403-0

  • Online ISBN: 978-3-319-21404-7

  • eBook Packages: Computer Science, Computer Science (R0)
