Abstract
Data mining discovers useful knowledge from a data matrix of records and attributes (variables). Not all attributes are useful for knowledge discovery: some may be redundant, irrelevant, noisy, or even opposing. Furthermore, using every attribute increases the complexity of the problem. The Minimum Attribute Subset Selection Problem (MASSP) has been studied for well over three decades, and researchers have proposed several solutions. This paper proposes a new technique for the MASSP that performs biclustering based on the crossing minimization paradigm from the domain of graph drawing. Biclustering is used to quickly identify the attributes that are significant in the data matrix. The identified attributes are then used to perform one-way clustering and to generate a pixelized visualization of the clustered results. Applied to two real datasets, the proposed technique shows promising results.
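To make the crossing minimization idea concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a binary data matrix read as a bipartite graph (records on one layer, attributes on the other, an edge wherever an entry is 1) and applies the barycenter heuristic, a standard method for two-layer crossing minimization. The function names `reorder` and `count_crossings` are hypothetical, and the step that picks significant attributes from the resulting blocks is not shown, since the abstract does not describe it.

```python
def count_crossings(matrix, row_order, col_order):
    """Count edge crossings of the two-layer drawing for the given layout."""
    # Each 1-entry is an edge, stored as (row position, column position).
    edges = [(i, j)
             for i, r in enumerate(row_order)
             for j, c in enumerate(col_order)
             if matrix[r][c]]
    crossings = 0
    for a in range(len(edges)):
        for b in range(a + 1, len(edges)):
            (i1, j1), (i2, j2) = edges[a], edges[b]
            # Two edges cross when their endpoints are ordered oppositely.
            if (i1 - i2) * (j1 - j2) < 0:
                crossings += 1
    return crossings


def _barycenter_sort(order, neighbours, other_pos):
    """Sort one layer by the mean position of each vertex's neighbours."""
    current = {v: i for i, v in enumerate(order)}

    def key(v):
        positions = [other_pos[u] for u in neighbours(v)]
        # Assumption: a vertex with no edges keeps its current position.
        return sum(positions) / len(positions) if positions else current[v]

    return sorted(order, key=key)  # sorted() is stable, so ties keep order


def reorder(matrix, max_sweeps=10):
    """Alternately reorder rows and columns until the layout stops changing."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    row_order, col_order = list(range(n_rows)), list(range(n_cols))
    for _ in range(max_sweeps):
        col_pos = {c: i for i, c in enumerate(col_order)}
        new_rows = _barycenter_sort(
            row_order,
            lambda r: [c for c in range(n_cols) if matrix[r][c]],
            col_pos)
        row_pos = {r: i for i, r in enumerate(new_rows)}
        new_cols = _barycenter_sort(
            col_order,
            lambda c: [r for r in range(n_rows) if matrix[r][c]],
            row_pos)
        if new_rows == row_order and new_cols == col_order:
            break
        row_order, col_order = new_rows, new_cols
    return row_order, col_order


if __name__ == "__main__":
    # Two interleaved blocks: records 0, 2 share attributes 0, 2;
    # records 1, 3 share attributes 1, 3.
    data = [[1, 0, 1, 0],
            [0, 1, 0, 1],
            [1, 0, 1, 0],
            [0, 1, 0, 1]]
    rows, cols = reorder(data)
    print(rows, cols, count_crossings(data, rows, cols))
```

In the reordered matrix, records and attributes that belong together end up adjacent, so dense blocks (candidate biclusters) become visible along the diagonal. The heuristic is not guaranteed to reach the minimum number of crossings.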
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Abdullah, A., Hussain, A. (2007). Using Biclustering for Automatic Attribute Selection to Enhance Global Visualization. In: Lévy, P.P., et al. Pixelization Paradigm. VIEW 2006. Lecture Notes in Computer Science, vol 4370. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71027-1_4
DOI: https://doi.org/10.1007/978-3-540-71027-1_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71026-4
Online ISBN: 978-3-540-71027-1
eBook Packages: Computer Science (R0)