Hierarchical visual data mining for large-scale data

Computational Statistics

Summary

An increasingly important problem in exploratory data analysis and visualization is that of scale; more and more data sets are much too large to analyze using traditional techniques, either in terms of the number of variables or the number of records. One approach to addressing this problem is the development and use of multiresolution strategies, where we represent the data at different levels of abstraction or detail through aggregation and summarization. In this paper we present an overview of our recent and current activities in the development of a multiresolution exploratory visualization environment for large-scale multivariate data. We have developed visualization, interaction, and data management techniques for effectively dealing with data sets that contain millions of records and/or hundreds of dimensions, and propose methods for applying similar approaches to extend the system to handle nominal as well as ordinal data.
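As a rough illustration of what representing data "at different levels of abstraction or detail through aggregation and summarization" can look like, the following minimal sketch builds a binary hierarchy over numeric records and stores a summary (count, mean, and per-dimension extents) at every node. It is not the authors' implementation: the median split on the widest dimension stands in for whatever clustering the system actually uses, and the names `Node`, `summarize`, `build`, and `level_of_detail` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Sequence

Record = Sequence[float]


@dataclass
class Node:
    """Summary of one group of records (hypothetical structure)."""
    count: int
    mean: List[float]
    lo: List[float]   # per-dimension minimum
    hi: List[float]   # per-dimension maximum
    children: List["Node"] = field(default_factory=list)


def summarize(records: Sequence[Record]) -> Node:
    """Aggregate a non-empty group of records into one summary node."""
    dims = len(records[0])
    cols = [[r[d] for r in records] for d in range(dims)]
    return Node(
        count=len(records),
        mean=[sum(c) / len(c) for c in cols],
        lo=[min(c) for c in cols],
        hi=[max(c) for c in cols],
    )


def build(records: Sequence[Record], leaf_size: int = 2) -> Node:
    """Recursively split records into a binary hierarchy of summaries.

    Assumption: a median split on the widest dimension is used here
    only as a simple stand-in for a real clustering algorithm.
    """
    leaf_size = max(1, leaf_size)
    node = summarize(records)
    if len(records) <= leaf_size:
        return node
    widest = max(range(len(node.lo)), key=lambda i: node.hi[i] - node.lo[i])
    ordered = sorted(records, key=lambda r: r[widest])
    mid = len(ordered) // 2
    node.children = [build(ordered[:mid], leaf_size),
                     build(ordered[mid:], leaf_size)]
    return node


def level_of_detail(root: Node, depth: int) -> List[Node]:
    """Return the summaries visible when the tree is cut at `depth`."""
    if depth == 0 or not root.children:
        return [root]
    visible: List[Node] = []
    for child in root.children:
        visible.extend(level_of_detail(child, depth - 1))
    return visible
```

Calling `level_of_detail(build(data), 0)` yields a single summary of the whole data set, while larger depths return progressively more, finer-grained summaries. A display would draw these summaries in place of the individual records.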

References

  1. D. Brodbeck, M. Chalmers, A. Lunzer, and P. Cotture, Domesticating Bead: Adapting an information visualization system to a financial institution, Proc. Information Visualization 97, pp. 73–80, 1997.

  2. Y. Fua, M. O. Ward, and E. A. Rundensteiner, Hierarchical parallel coordinates for visualizing large multivariate data sets, Proc. Visualization 99, pp. 43–50, October, 1999.

  3. Y. Fua, M. O. Ward, and E. A. Rundensteiner, Navigating hierarchies with structure-based brushes, Proc. Information Visualization 99, pp. 58–64, October, 1999.

  4. Y. Fua, M. O. Ward, and E. A. Rundensteiner, Structure-based brushes: a mechanism for navigating hierarchically organized data and information spaces, IEEE Trans. Visualization and Computer Graphics, V. 6, pp. 150–159, 2000.

  5. I. T. Jolliffe, Principal Component Analysis, Springer Verlag, 1986.

  6. S. Kaski, Dimensionality reduction by random mapping: Fast similarity computation for clustering, Proc. IJCNN, pp. 413–418, 1998.

  7. D. Keim, H. Kriegel, and M. Ankerst, Recursive pattern: a technique for visualizing very large amounts of data, Proc. Visualization 95, pp. 279–286, 1995.

  8. T. Kohonen. Self Organizing Maps. Springer Verlag, 1995.

  9. A. Mead, Review of the development of multidimensional scaling methods, The Statistician, Vol. 33, pp. 27–35, 1992.

  10. B. Shneiderman, Tree visualization with tree-maps: A 2d space-filling approach, ACM Transactions on Graphics, Vol. 11(1), pp. 92–99, Jan. 1992.

  11. D. Stroe, E. A. Rundensteiner, and M. O. Ward, Scalable visual hierarchy exploration, Proc. DEXA 2000, September, 2000.

  12. M. O. Ward, XmdvTool: integrating multiple methods for visualizing multivariate data, Proc. Visualization 94, pp. 326–333, October, 1994.

  13. M. O. Ward and A. R. Martin, High dimensional brushing for interactive exploration of multivariate data, Proc. Visualization 95, pp. 271–278, 1995.

  14. M. O. Ward, Creating and manipulating N-dimensional brushes, Proc. Joint Statistical Meeting, pp. 6–14, August, 1997.

  15. M. O. Ward, Y. Jing, and E. A. Rundensteiner, Hierarchical exploration of large multivariate data spaces, Proc. Dagstuhl Seminar on Scientific Visualization, May, 2000.

  16. G. Wills, An interactive view for hierarchical clustering, Proc. Information Visualization 98, pp. 26–31, 1998.

  17. J. A. Wise, J. J. Thomas, K. Pennock, D. Lantrip, M. Pottier, A. Schur, and V. Crow, Visualizing the non-visual: Spatial analysis and interaction with information from text documents, Proc. Information Visualization 95, pp. 51–58, 1995.

  18. J. A. Wise, The ecological approach to text visualization, JASIS, Vol. 50, No. 13, pp. 1224–1233, 1999.

  19. E. Wegman and Q. Luo, High dimensional clustering using parallel coordinates and the grand tour, Computing Science and Statistics, Vol. 28, pp. 361–368, 1997.

  20. P. Wong and R. Bergeron, Multiresolution multidimensional wavelet brushing, Proc. Visualization 96, pp. 141–148, 1996.

  21. J. Yang, M. Ward, and E. Rundensteiner, Interactive hierarchical displays: a general framework for visualization and exploration of large multivariate data sets, Computers and Graphics, in press.

  22. J. Yang, M. Ward, and E. Rundensteiner, InterRing: a radial, spacefilling hierarchy visualization system with a set of navigation, modification, and selection tools, IEEE Symposium on Information Visualization (InfoVis 02), pp. 77–84, 2002.

  23. J. Yang, M. O. Ward, and E. A. Rundensteiner, Visual hierarchical dimension reduction for exploration of high dimensional datasets, Technical Report #WPI-CS-TR-02-22, 2002.

  24. J. York, S. Bohn, K. Pennock, and D. Lantrip, Clustering and dimensionality reduction in Spire. Proc. Symposium on Advanced Intelligence Processing and Analysis, p. 73, 1995.

Acknowledgements

This work has been supported by the U.S. National Science Foundation under grants IIS-9732897, IIS-0119276, and EIA-9729878.

Ward, M., Peng, W. & Wang, X. Hierarchical visual data mining for large-scale data. Computational Statistics 19, 147–158 (2004). https://doi.org/10.1007/BF02915281
