High Dimensional Visual Data Classification

  • Conference paper
Pixelization Paradigm (VIEW 2006)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4370)

Abstract

We present new visual data mining algorithms for interactive decision tree construction on large datasets. The amount of data stored worldwide is constantly increasing, yet current visual data mining and visualization methods, including pixelization methods, have well-known limits on the number of items and dimensions they can handle. One way to push these limits is to work on a higher-level representation of the data, for example a symbolic data representation. Our new interactive decision tree construction algorithms handle interval and taxonomical data. Because they operate on this higher-level representation rather than on the original data, they can deal with potentially very large datasets. Interactive algorithms are examples of a new data mining approach that aims to involve the user more intensively in the process. This user-centered approach has two main advantages: the user has more confidence in the resulting model and understands it better, having taken part in its construction, and the process can draw on human pattern recognition capabilities. We present some results obtained on very large datasets.
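As a rough illustration of the higher-level representation mentioned in the abstract, the sketch below summarizes raw numeric records as interval-valued descriptions, one `(min, max)` interval per variable and per group. It is not the paper's algorithm: the function name `to_intervals`, the grouping by a label column, and the sample data are all assumptions made for this example. It only shows how many rows can collapse into a few interval-valued items before any tree is built.

```python
from collections import defaultdict


def to_intervals(rows, key_index):
    """Summarize raw rows as interval-valued descriptions (hypothetical helper).

    rows: iterable of tuples; the field at key_index is a group label and
          every other field is numeric.
    Returns a dict mapping each label to a list of (min, max) intervals,
    one interval per numeric field, in field order.
    """
    groups = defaultdict(list)
    for row in rows:
        label = row[key_index]
        values = [v for i, v in enumerate(row) if i != key_index]
        groups[label].append(values)

    summary = {}
    for label, members in groups.items():
        # zip(*members) transposes the rows of a group into columns.
        summary[label] = [(min(col), max(col)) for col in zip(*members)]
    return summary


# Illustrative data only: five raw records collapse into two symbolic items.
raw = [
    ("a", 1.0, 10.0),
    ("a", 3.0, 12.0),
    ("b", 2.0, 20.0),
    ("a", 2.0, 11.0),
    ("b", 5.0, 18.0),
]
intervals = to_intervals(raw, key_index=0)
print(intervals)
```

In this sketch the size of the summary depends on the number of groups and variables, not on the number of raw records, which is the property the abstract relies on when it speaks of handling potentially very large datasets.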

Editor information

Pierre P Lévy, Bénédicte Le Grand, François Poulet, Michel Soto, Laszlo Darago, Laurent Toubiana, Jean-François Vibert

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Poulet, F. (2007). High Dimensional Visual Data Classification. In: Lévy, P.P., et al. Pixelization Paradigm. VIEW 2006. Lecture Notes in Computer Science, vol 4370. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71027-1_3

  • DOI: https://doi.org/10.1007/978-3-540-71027-1_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-71026-4

  • Online ISBN: 978-3-540-71027-1

  • eBook Packages: Computer Science, Computer Science (R0)
