A new visualization tool for data mining techniques

  • Regular Paper
Progress in Artificial Intelligence

Abstract

Clustering techniques and classification trees are two of the main techniques used in data mining, yet visualization methods for them remain scarce. Many graphs associated with clustering, including hierarchical clustering, give no information about the values of the centroids’ attributes or the relationships among them. For classification trees, graphical procedures can simplify interpretation and improve understanding, but more visualization methods to support this tool are needed. This paper presents a novel visualization technique called sectors on sectors (SonS), and an extended version called multidimensional sectors on sectors (MDSonS), for improving the interpretation of several data mining algorithms. The methods are applied to visualize the results of: (a) hierarchical clustering, making it possible to extract all existing relationships among centroids’ attributes at any hierarchy level; (b) growing hierarchical self-organizing maps (GHSOM), a variant of the well-known self-organizing maps (SOM), making it possible to visualize the data information at every hierarchy level simultaneously and compactly, and to extract relationships among variables; (c) classification trees, where SonS represents the input data information for each class present in each terminal node, providing extra information for a better understanding of the problem. The methods are tested on several real and synthetic data sets. The results show the suitability and usefulness of the proposed approaches.

Notes

  1. http://mloss.org/software/view/469/.

  2. Each variable is scaled to [0, 1] before carrying out the clustering to avoid a biased model. Moreover, the scaling makes the relevance of each variable (represented by the size of the radius) independent of its range; that is, a variable does not appear more relevant merely because it has a wider range. The use of scaled variables guarantees that the radius reflects the relevance of the variable within the cluster.

  3. This extends directly to other measures such as the median, which is a more suitable prototype measure in the presence of outliers, for instance. Therefore, SonS is not restricted to any particular prototype measure.

  4. As in SonS, each variable is scaled between [0, 1] before carrying out the clustering to avoid a biased model.

  5. http://cran.R-project.org.

  6. http://archive.ics.uci.edu/ml.
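
Notes 2 to 4 describe two preprocessing choices: scaling every variable to [0, 1] before clustering, and the option of using the median instead of another prototype measure. The following is a minimal sketch of both steps in plain Python, not the authors' implementation. The function names `min_max_scale` and `prototype`, the sample data, and the choice of the mean as the default prototype are assumptions made for illustration.

```python
from statistics import mean, median


def min_max_scale(rows):
    """Scale each column of `rows` to [0, 1]; constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    scaled = []
    for row in rows:
        scaled.append([
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, lows, highs)
        ])
    return scaled


def prototype(rows, measure=mean):
    """Per-variable prototype of one cluster (mean assumed as default)."""
    return [measure(col) for col in zip(*rows)]


# Hypothetical cluster: two variables with very different ranges,
# and an outlier in the second variable of the last row.
data = [[1.0, 100.0], [2.0, 200.0], [3.0, 300.0], [4.0, 10000.0]]

scaled = min_max_scale(data)
mean_proto = prototype(scaled, mean)
median_proto = prototype(scaled, median)

print("mean prototype:  ", mean_proto)
print("median prototype:", median_proto)
```

After scaling, both variables span the same interval, so neither dominates because of its raw range. In the second variable the outlier pulls the mean prototype upward, while the median prototype stays close to the bulk of the data, which is the behaviour note 3 alludes to.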

Author information

Corresponding author

Correspondence to José M. Martínez-Martínez.

About this article

Cite this article

Martínez-Martínez, J.M., Escandell-Montero, P., Soria-Olivas, E. et al. A new visualization tool for data mining techniques. Prog Artif Intell 5, 137–154 (2016). https://doi.org/10.1007/s13748-015-0079-4

  • DOI: https://doi.org/10.1007/s13748-015-0079-4
