Abstract
Spatial clustering is an important data mining technique with many applications. Conventional spatial clustering methods group data points mainly according to their geographic attributes. In real applications, however, data points carry both geographic and non-geographic attributes: the geographic attributes indicate where a point is located, and the non-geographic attributes describe its characteristics. Conventional spatial clustering methods therefore cannot partition the geographic space so that data points with similar characteristics are grouped together. In this paper, we propose an effective and efficient algorithm, named incremental clustering toward the Bound INformation of Geography and Optimization spaces, abbreviated as BINGO, to solve this problem. BINGO combines the information in the geographic and non-geographic attributes by constructing a summary structure, and it supports incremental clustering by adjusting this structure as needed. Furthermore, most parameters of BINGO are determined automatically, so the algorithm is easy to apply without extra domain knowledge. Experiments on synthetic data validate the effectiveness and efficiency of BINGO.
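The abstract does not specify the summary structure that BINGO builds, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a uniform grid over the geographic space, a single numeric non-geographic attribute, and a merge rule based on the difference of cell means; the names `GridSummary`, `CellSummary`, `cell_size`, and `max_diff` are all hypothetical. Each cell keeps a running count and attribute sum, so a newly arrived point updates one cell without reprocessing earlier points.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple

Cell = Tuple[int, int]


@dataclass
class CellSummary:
    """Running summary of the points that fell into one grid cell."""
    count: int = 0
    attr_sum: float = 0.0

    @property
    def attr_mean(self) -> float:
        return self.attr_sum / self.count


class GridSummary:
    """Hypothetical summary structure: NOT the structure used by BINGO."""

    def __init__(self, cell_size: float) -> None:
        self.cell_size = cell_size
        self.cells: Dict[Cell, CellSummary] = {}

    def cell_of(self, x: float, y: float) -> Cell:
        # Map a geographic location to the grid cell that contains it.
        return (math.floor(x / self.cell_size), math.floor(y / self.cell_size))

    def insert(self, x: float, y: float, attr: float) -> None:
        # Incremental update: only the summary of one cell changes.
        summary = self.cells.setdefault(self.cell_of(x, y), CellSummary())
        summary.count += 1
        summary.attr_sum += attr

    def similar_neighbors(self, key: Cell, max_diff: float) -> List[Cell]:
        # Occupied 4-neighbors whose mean attribute is within max_diff.
        # Geographic adjacency and attribute similarity are both required.
        here = self.cells[key].attr_mean
        cx, cy = key
        result = []
        for neighbor in [(cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)]:
            other = self.cells.get(neighbor)
            if other is not None and abs(other.attr_mean - here) <= max_diff:
                result.append(neighbor)
        return result
```

In this sketch, two adjacent cells would be candidates for the same cluster only when they are neighbors in the geographic space and their attribute means are close, which is one simple way to use both kinds of attributes at once.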
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Tai, CH., Dai, BR., Chen, MS. (2007). Incremental Clustering in Geography and Optimization Spaces. In: Zhou, ZH., Li, H., Yang, Q. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2007. Lecture Notes in Computer Science, vol 4426. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71701-0_28
DOI: https://doi.org/10.1007/978-3-540-71701-0_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71700-3
Online ISBN: 978-3-540-71701-0
eBook Packages: Computer Science, Computer Science (R0)