Abstract
In many sensor network applications, it is essential to obtain the distribution of an attribute's values over the network. This distribution can be obtained through clustering, which partitions the network into contiguous regions, each containing sensor nodes whose readings fall within a similar range. This paper proposes a method named Distributed, Hierarchical Clustering (DHC) for online data analysis and mining in sensor networks. Unlike the acquisition and aggregation of raw sensory data, DHC clusters sensor nodes based on their current data values as well as their geographical proximity, and computes a summary for each cluster. Furthermore, these clusters, together with their summaries, are produced in a distributed, bottom-up manner. The resulting hierarchy of clusters and their summaries facilitates interactive data exploration at multiple resolutions. It can also be used to improve the efficiency of data-centric routing and query processing in sensor networks. We also design and evaluate maintenance mechanisms that enable DHC to work on evolving data. Our simulation results on real-world and synthetic datasets show the effectiveness and efficiency of our approach.
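To make the idea of bottom-up clustering with per-cluster summaries concrete, the following is a minimal, centralized sketch in Python. It is not the DHC algorithm itself, which the abstract does not specify in detail. It assumes that a cluster summary holds a count, sum, minimum, and maximum, that two clusters may merge only if they are linked by a neighbor edge, and that a merge is allowed when the merged value range stays within a per-level threshold. The names `Summary`, `build_hierarchy`, and `thresholds` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Summary:
    """Hypothetical cluster summary: enough to report count, mean, and range."""
    count: int
    total: float
    lo: float
    hi: float

    @property
    def mean(self) -> float:
        return self.total / self.count

    def merge(self, other: "Summary") -> "Summary":
        # Summaries combine without revisiting the raw readings.
        return Summary(self.count + other.count, self.total + other.total,
                       min(self.lo, other.lo), max(self.hi, other.hi))


def build_hierarchy(readings, edges, thresholds):
    """Return one list of (members, summary) pairs per threshold level.

    readings:   dict node_id -> current sensor value
    edges:      iterable of (a, b) pairs of neighboring nodes
    thresholds: increasing value-range limits, one per hierarchy level
                (an assumed merge criterion, not taken from the paper)
    """
    parent = {n: n for n in readings}
    summary = {n: Summary(1, v, v, v) for n, v in readings.items()}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    levels = []
    for limit in thresholds:
        for a, b in edges:
            ra, rb = find(a), find(b)
            if ra == rb:
                continue
            merged = summary[ra].merge(summary[rb])
            # Merge only neighboring clusters with similar readings.
            if merged.hi - merged.lo <= limit:
                parent[rb] = ra
                summary[ra] = merged
                del summary[rb]
        groups = {}
        for n in readings:
            groups.setdefault(find(n), set()).add(n)
        levels.append([(frozenset(m), summary[r]) for r, m in groups.items()])
    return levels


# Five nodes on a line; readings are made up for illustration.
readings = {1: 20.0, 2: 20.5, 3: 21.0, 4: 25.0, 5: 25.4}
edges = [(1, 2), (2, 3), (3, 4), (4, 5)]
levels = build_hierarchy(readings, edges, thresholds=[1.0, 6.0])
for depth, clusters in enumerate(levels):
    for members, s in clusters:
        print(depth, sorted(members), s.count, s.lo, s.hi)
```

Each level reuses the summaries of the level below it, so coarser clusters are formed without touching individual readings again. In a real deployment the merge decisions would be made by message exchange between neighboring nodes rather than by a single loop over all edges.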
Cite this article
Ma, XL., Hu, HF., Li, SF. et al. DHC: Distributed, Hierarchical Clustering in Sensor Networks. J. Comput. Sci. Technol. 26, 643–662 (2011). https://doi.org/10.1007/s11390-011-1165-0
DOI: https://doi.org/10.1007/s11390-011-1165-0