Abstract
Sensor networks monitor physical phenomena over large geographic regions. Scientists can gain valuable insight into these phenomena if they understand the underlying data distribution. Such data characteristics can be extracted efficiently through spatial clustering, which partitions the network into a set of spatial regions with similar observations. The goal of this paper is to perform such a spatial clustering, specifically δ-clustering, in which the data dissimilarity between any two nodes inside a cluster is at most δ. We present an in-network clustering algorithm, ELink, that generates good δ-clusterings for both synchronous and asynchronous networks in \(O(\sqrt{N} \log N)\) time and with \(O(N)\) message complexity, where N denotes the network size. Experimental results on both real-world and synthetic data sets show that ELink's clustering quality is comparable to that of a centralized algorithm and superior to that of alternative distributed techniques. Furthermore, in communication cost, ELink performs 10 times better than the centralized algorithm and 3-4 times better than the distributed alternatives. We also develop a distributed index structure over the generated clusters that can be used to answer range queries and path queries. The query algorithms direct the spatial search to relevant clusters, leading to performance gains of up to a factor of 5 over competing techniques.
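The δ-clustering condition stated in the abstract can be illustrated with a small validity check. The sketch below is not ELink and is not taken from the paper; it only tests whether a given partition satisfies the pairwise bound. The function name `is_delta_clustering`, the scalar readings, and the absolute-difference dissimilarity are all assumptions made for illustration, and the sketch ignores the requirement that each cluster form a connected spatial region.

```python
from itertools import combinations


def is_delta_clustering(readings, clusters, delta,
                        dissimilarity=lambda a, b: abs(a - b)):
    """Check the pairwise delta bound for a proposed partition.

    readings:      dict mapping node id -> observation (scalar here, by assumption)
    clusters:      iterable of iterables of node ids
    delta:         maximum allowed dissimilarity inside a cluster
    dissimilarity: hypothetical distance on observations (assumed |a - b|)
    """
    seen = set()
    for cluster in clusters:
        members = list(cluster)
        # Each node must belong to exactly one cluster.
        for node in members:
            if node in seen:
                return False
            seen.add(node)
        # Every pair inside a cluster must be within delta of each other.
        for u, v in combinations(members, 2):
            if dissimilarity(readings[u], readings[v]) > delta:
                return False
    # The clusters must cover every node in the network.
    return seen == set(readings)


# Hypothetical temperature readings for four nodes.
readings = {"a": 20.0, "b": 20.5, "c": 21.0, "d": 25.0}
print(is_delta_clustering(readings, [["a", "b", "c"], ["d"]], delta=1.0))  # True
print(is_delta_clustering(readings, [["a", "b", "c", "d"]], delta=1.0))    # False
```

The check is quadratic in the size of each cluster, which is acceptable for verifying a result offline but is not how an in-network algorithm would operate.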
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Meka, A., Singh, A.K. (2006). Distributed Spatial Clustering in Sensor Networks. In: Ioannidis, Y., et al. Advances in Database Technology - EDBT 2006. EDBT 2006. Lecture Notes in Computer Science, vol 3896. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11687238_57
DOI: https://doi.org/10.1007/11687238_57
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32960-2
Online ISBN: 978-3-540-32961-9
eBook Packages: Computer Science (R0)