Beyond Homemade Artificial Data Sets

  • Conference paper
Hybrid Artificial Intelligence Systems (HAIS 2009)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5572)

Abstract

One of the most important challenges in supervised learning is how to evaluate the quality of the models evolved by different machine learning techniques. Up to now, we have relied on measures obtained by running the methods on a wide test bed composed of real-world problems. Nevertheless, the unknown inherent characteristics of these problems and the bias of learners may lead to inconclusive results. This paper discusses the need to work under a controlled scenario and advocates artificial data set generation. It provides a list of ingredients and some ideas about how to guide such generation, and presents promising results of an evolutionary multi-objective approach that incorporates data complexity estimates.
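To make the idea of a data complexity estimate on a controlled, artificial data set concrete, the following is a minimal sketch and not the method used in the paper. It assumes a single numeric feature, two Gaussian classes, and Fisher's discriminant ratio as the complexity estimate; the paper may rely on other measures. The names `make_two_class_data` and `fisher_ratio` are hypothetical and introduced only for illustration.

```python
import random
import statistics


def make_two_class_data(n_per_class, separation, seed=0):
    """Hypothetical generator: two 1-D Gaussian classes with unit variance
    whose means differ by `separation` (an assumption of this sketch)."""
    rng = random.Random(seed)
    class0 = [rng.gauss(0.0, 1.0) for _ in range(n_per_class)]
    class1 = [rng.gauss(separation, 1.0) for _ in range(n_per_class)]
    return class0, class1


def fisher_ratio(class0, class1):
    """Fisher's discriminant ratio for one feature:
    (mu0 - mu1)^2 / (var0 + var1). Larger values mean less class overlap."""
    mu0, mu1 = statistics.fmean(class0), statistics.fmean(class1)
    var0, var1 = statistics.pvariance(class0), statistics.pvariance(class1)
    return (mu0 - mu1) ** 2 / (var0 + var1)


# Two artificial problems that differ only in how far apart the classes are.
easy = fisher_ratio(*make_two_class_data(500, separation=4.0))
hard = fisher_ratio(*make_two_class_data(500, separation=0.5))
print(f"well separated: {easy:.2f}, overlapping: {hard:.2f}")
```

Because the generator's parameters are known, the estimate can be checked against the intended difficulty of each problem, which is not possible with real-world data whose characteristics are unknown. In a multi-objective setting, an estimate of this kind could serve as one of several objectives that guide the generation.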

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Macià, N., Orriols-Puig, A., Bernadó-Mansilla, E. (2009). Beyond Homemade Artificial Data Sets. In: Corchado, E., Wu, X., Oja, E., Herrero, Á., Baruque, B. (eds) Hybrid Artificial Intelligence Systems. HAIS 2009. Lecture Notes in Computer Science, vol 5572. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02319-4_73

  • DOI: https://doi.org/10.1007/978-3-642-02319-4_73

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-02318-7

  • Online ISBN: 978-3-642-02319-4

  • eBook Packages: Computer Science, Computer Science (R0)
