Data Reduction Algorithm for Machine Learning and Data Mining

  • Conference paper
New Frontiers in Applied Artificial Intelligence (IEA/AIE 2008)

Abstract

The paper proposes an approach to data reduction, a procedure of vital importance to machine learning and data mining. The data reduction problem is solved with an agent-based population learning algorithm. The approach reduces the original dataset in two dimensions: selection of reference instances and removal of irrelevant attributes. A computational experiment was carried out to validate the approach, and the paper concludes with a presentation and discussion of its results.
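To make the idea of reducing a dataset "in two dimensions" concrete, the following is a minimal sketch, not the authors' agent-based population learning algorithm. It assumes a simplified stand-in: a small population of candidate solutions, each a pair (selected instance indices, selected attribute indices), improved by random single-flip mutations and scored by 1-nearest-neighbour accuracy minus a small size penalty. All names (`nn_accuracy`, `mutate`, `reduce_data`), the penalty weight, and the toy data are hypothetical.

```python
import random

def nn_accuracy(X, y, inst, attrs):
    """1-NN accuracy over all rows of X, using only reference rows `inst`
    and columns `attrs`; the query row itself is never its own reference."""
    correct = 0
    for i, row in enumerate(X):
        refs = [j for j in inst if j != i]
        if not refs:
            continue  # no usable reference: counted as a miss
        nearest = min(refs, key=lambda j: sum((row[a] - X[j][a]) ** 2 for a in attrs))
        correct += y[nearest] == y[i]
    return correct / len(X)

def mutate(inst, attrs, n_rows, n_cols, rng):
    """Flip membership of one random instance or one random attribute."""
    inst, attrs = set(inst), set(attrs)
    target, pool = (inst, n_rows) if rng.random() < 0.5 else (attrs, n_cols)
    k = rng.randrange(pool)
    if k in target and len(target) > 1:
        target.remove(k)
    else:
        target.add(k)
    return inst, attrs

def reduce_data(X, y, pop_size=8, steps=200, penalty=0.01, seed=0):
    """Population-based search for a reduced (instances, attributes) pair.
    `penalty` is an assumed weight that mildly favours smaller selections."""
    rng = random.Random(seed)
    n_rows, n_cols = len(X), len(X[0])

    def fitness(inst, attrs):
        size = len(inst) / n_rows + len(attrs) / n_cols
        return nn_accuracy(X, y, inst, attrs) - penalty * size

    def random_solution():
        inst = set(rng.sample(range(n_rows), max(2, n_rows // 2)))
        attrs = set(rng.sample(range(n_cols), max(1, n_cols // 2)))
        return inst, attrs

    population = [random_solution() for _ in range(pop_size)]
    scores = [fitness(i, a) for i, a in population]
    for _ in range(steps):
        p = rng.randrange(pop_size)
        cand = mutate(*population[p], n_rows, n_cols, rng)
        s = fitness(*cand)
        if s >= scores[p]:  # keep the candidate if it is not worse
            population[p], scores[p] = cand, s
    best = max(range(pop_size), key=scores.__getitem__)
    return population[best], scores[best]

# Toy data (assumed): attribute 0 separates the classes, attribute 1 is noise.
_rng = random.Random(1)
X = [(_rng.gauss(c, 0.3), _rng.random()) for c in (0, 3) for _ in range(10)]
y = [0] * 10 + [1] * 10
(selected_instances, selected_attributes), score = reduce_data(X, y)
print(sorted(selected_instances), sorted(selected_attributes), round(score, 3))
```

The sketch evaluates each candidate with the classifier itself (a wrapper-style evaluation), which is one common design choice for joint instance and attribute selection; the paper's own evaluation procedure is not described in the abstract.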

Editor information

Ngoc Thanh Nguyen, Leszek Borzemski, Adam Grzech, Moonis Ali

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Czarnowski, I., Jȩdrzejowicz, P. (2008). Data Reduction Algorithm for Machine Learning and Data Mining. In: Nguyen, N.T., Borzemski, L., Grzech, A., Ali, M. (eds) New Frontiers in Applied Artificial Intelligence. IEA/AIE 2008. Lecture Notes in Computer Science, vol 5027. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69052-8_29

  • DOI: https://doi.org/10.1007/978-3-540-69052-8_29

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-69045-0

  • Online ISBN: 978-3-540-69052-8

  • eBook Packages: Computer Science, Computer Science (R0)
