Abstract
In this paper we describe an interactive, visual knowledge discovery tool for analyzing numerical data sets. The tool combines a visual clustering method, used to hypothesize meaningful structures in the data, with a classification machine learning algorithm, used to validate the hypothesized structures. A two-dimensional representation of the available data allows the user to partition the search space by shape or density, according to whatever criteria the user deems optimal. A partition can be composed of regions whose points are distributed in an arbitrary shape, not necessarily a spherical one. The accuracy of the clustering results can be validated with a decision tree classifier included in the mining tool.
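The workflow outlined in the abstract can be illustrated with a minimal sketch. It is not the tool's implementation: it assumes the two-dimensional representation has already been computed, models each user-drawn region as a polygon, and uses a one-level decision tree (a decision stump) as a stand-in for the decision tree classifier. All names (`in_polygon`, `label_points`, `stump_accuracy`) and all data values are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def in_polygon(p: Point, poly: Sequence[Point]) -> bool:
    """Ray-casting test: True if p lies inside the polygon."""
    x, y = p
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Does this edge straddle the horizontal line through p?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def label_points(points: Sequence[Point],
                 regions: Sequence[Sequence[Point]]) -> List[int]:
    """Label each point with the index of the first region containing it, or -1."""
    labels = []
    for p in points:
        label = -1
        for k, poly in enumerate(regions):
            if in_polygon(p, poly):
                label = k
                break
        labels.append(label)
    return labels


def stump_accuracy(rows: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Best training accuracy of a single-threshold split (labels must be 0 or 1)."""
    n = len(rows)
    best = 0.0
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            # Rule "predict 1 when attribute j > t", and its mirror image
            hits = sum((r[j] > t) == (y == 1) for r, y in zip(rows, labels))
            best = max(best, hits / n, (n - hits) / n)
    return best


# Hypothetical 2-D representation of six records (assumed precomputed).
points = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (4.0, 4.0), (4.5, 4.2), (4.2, 4.6)]

# Two user-drawn regions: an L-shaped (non-spherical) one and a rectangle.
l_region = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
box_region = [(3.5, 3.5), (5, 3.5), (5, 5), (3.5, 5)]
regions = [l_region, box_region]

# Hypothetical original attributes of the same six records.
rows = [(1.0, 0.2, 7.0), (1.2, 0.1, 3.0), (0.9, 0.3, 5.0),
        (3.1, 0.2, 6.0), (3.3, 0.1, 4.0), (2.9, 0.3, 8.0)]

labels = label_points(points, regions)
print(labels)                        # [0, 0, 0, 1, 1, 1]
print(stump_accuracy(rows, labels))  # 1.0
```

A high accuracy suggests that the hypothesized regions correspond to structure that a simple rule on the original attributes can recover. A full decision tree evaluated on held-out data would give a more reliable check than the training accuracy of a stump used here.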
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Manco, G., Pizzuti, C., Talia, D. (2002). Eureka!: A Tool for Interactive Knowledge Discovery. In: Hameurlain, A., Cicchetti, R., Traunmüller, R. (eds) Database and Expert Systems Applications. DEXA 2002. Lecture Notes in Computer Science, vol 2453. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46146-9_38
DOI: https://doi.org/10.1007/3-540-46146-9_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44126-7
Online ISBN: 978-3-540-46146-3
eBook Packages: Springer Book Archive