Abstract
Advances in geographic database technology have driven the growth of spatial databases, and researchers are investing considerable effort in methods that can process these expanding databases efficiently. Spatial data mining techniques have been applied in combination to extract implicit knowledge from both spatial and non-spatial attributes. These techniques are used in fields such as healthcare, environmental studies, marketing, and remote sensing to improve planning and decision making. In this paper, we design and implement the SpaGRID framework for detecting spatial clusters. The framework extracts implicit knowledge from spatial data efficiently, owing to its ability to handle spatial databases and discover hidden patterns in them. We also illustrate its use by examining spatial variation in the prevalence of prostate cancer among men in the United States. The data cover men aged 15 to 65+ years; prostate cancer cases in this group were examined, and several stages of disease diagnosis were taken into account. The population was characterized as white or black; other groups were too small to be taken into account. The complexity of spatial datasets posed numerous challenges, which were addressed using statistical measures. The overall approach is to discover knowledge from spatial databases and to design the different aspects of the knowledge discovery process for such databases.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kaur, H., Chauhan, R., Alam, M.A., Aljunid, S., Salleh, M. (2012). SpaGRID: A Spatial Grid Framework for High Dimensional Medical Databases. In: Corchado, E., Snášel, V., Abraham, A., Woźniak, M., Graña, M., Cho, SB. (eds) Hybrid Artificial Intelligent Systems. HAIS 2012. Lecture Notes in Computer Science, vol 7208. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28942-2_62
DOI: https://doi.org/10.1007/978-3-642-28942-2_62
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28941-5
Online ISBN: 978-3-642-28942-2
eBook Packages: Computer Science (R0)