Abstract
We propose a strategy, based on association analysis, for exploring multi-attribute spatial datasets that possess a naturally arising classification. The proposed strategy, ESTATE (Exploring Spatial daTa Association patTErns), inverts such a classification by interpreting the different classes found in the dataset in terms of sets of discriminative patterns of its attributes. It consists of several core steps, including discriminative data mining, calculation of similarity between transactional patterns, and visualization. An algorithm for calculating a similarity measure between patterns is the major original contribution; it facilitates summarization of the discovered information and makes the entire framework practical for real-life applications. A detailed description of the ESTATE framework is followed by its application to the domain of ecology, using a dataset that fuses information on the geographical distribution of bird species biodiversity across the contiguous United States with the distributions of 32 environmental variables across the same area.
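To make the notion of a discriminative pattern concrete, here is a minimal sketch in Python. It is not the ESTATE algorithm: the abstract does not specify the mining procedure or the similarity measure, so this uses a brute-force search with a support-ratio ("growth rate") criterion as a stand-in. The function names, thresholds, attribute labels, and toy transactions are all hypothetical, chosen only for illustration.

```python
from itertools import combinations


def support(pattern, transactions):
    """Fraction of transactions that contain every item of `pattern`."""
    if not transactions:
        return 0.0
    return sum(pattern <= t for t in transactions) / len(transactions)


def discriminative_patterns(target, rest, min_support=0.5,
                            min_growth=2.0, max_len=2):
    """Brute-force search for itemsets frequent in `target` but rare in `rest`.

    Assumption: a pattern is "discriminative" when its support in the target
    class is at least `min_support` and exceeds its support in the remaining
    transactions by a factor of at least `min_growth`.
    """
    items = sorted(set().union(*target))
    found = []
    for k in range(1, max_len + 1):
        for combo in combinations(items, k):
            pattern = frozenset(combo)
            s_target = support(pattern, target)
            if s_target < min_support:
                continue
            s_rest = support(pattern, rest)
            growth = float("inf") if s_rest == 0 else s_target / s_rest
            if growth >= min_growth:
                found.append((pattern, s_target, growth))
    return found


# Hypothetical toy data: each transaction is one spatial cell described by
# categorized attribute values; `target` holds cells of the class of interest.
target = [
    {"temp=high", "precip=high"},
    {"temp=high", "precip=high"},
    {"temp=high", "precip=low"},
    {"temp=low", "precip=high"},
]
rest = [
    {"temp=low", "precip=low"},
    {"temp=high", "precip=low"},
    {"temp=low", "precip=high"},
    {"temp=low", "precip=low"},
]

result = discriminative_patterns(target, rest)
for pattern, s_target, growth in result:
    print(sorted(pattern), s_target, growth)
```

On this toy input, the combined pattern of high temperature and high precipitation never occurs outside the target class, which is the kind of attribute combination a class would be interpreted by. Exhaustive enumeration is only feasible for very small item sets; a real application with 32 variables would need a dedicated pattern-mining algorithm.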
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Stepinski, T.F., Salazar, J., Ding, W., White, D. (2010). ESTATE: Strategy for Exploring Labeled Spatial Datasets Using Association Analysis. In: Pfahringer, B., Holmes, G., Hoffmann, A. (eds) Discovery Science. DS 2010. Lecture Notes in Computer Science, vol 6332. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16184-1_23
DOI: https://doi.org/10.1007/978-3-642-16184-1_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16183-4
Online ISBN: 978-3-642-16184-1
eBook Packages: Computer Science, Computer Science (R0)