DOI: 10.1145/1601966.1601975
research-article

EDISKCO: energy efficient distributed in-sensor-network k-center clustering with outliers

Published: 28 June 2009

Abstract

Clustering is an established data mining technique for grouping objects based on similarity. In sensor networks, the goal is to group similar sensor measurements. Because sensor networks have limited memory and energy, a major task in sensor clustering is efficient computation on the sensor nodes. Communication is the dominant energy consumer, so it has to be reduced to improve energy efficiency. To save memory, the amount of information stored on each sensor node has to be reduced as well.
For in-network clustering, k-center based approaches provide k representatives out of the collected sensor measurements. We propose EDISKCO, an outlier-aware incremental method for efficient detection of k-center clusters. Our novel approach is energy-aware and reduces the number of required transmissions while producing high-quality clustering results. In thorough experiments on synthetic and real-world data sets, we show that our approach outperforms a competing technique in both clustering quality and energy efficiency. Thus, we achieve significantly longer overall lifetimes for our sensor networks.
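
To illustrate what "k representatives" means in a k-center setting, here is a minimal sketch of the classic farthest-first greedy heuristic (Gonzalez, 1985). It is not EDISKCO: it is centralized, non-incremental, and ignores outliers and energy cost. The function names `farthest_first_centers` and `clustering_radius`, and the representation of measurements as coordinate tuples, are assumptions made for this example.

```python
import math


def farthest_first_centers(points, k):
    """Greedily pick up to k centers from a list of coordinate tuples.

    Each new center is the point that is farthest from all centers
    chosen so far (farthest-first traversal).
    """
    if not points or k <= 0:
        return []
    centers = [points[0]]  # the first center is an arbitrary choice
    # nearest[i] = distance from points[i] to its closest chosen center
    nearest = [math.dist(p, centers[0]) for p in points]
    while len(centers) < min(k, len(points)):
        idx = max(range(len(points)), key=nearest.__getitem__)
        centers.append(points[idx])
        # Update nearest-center distances with the newly added center.
        nearest = [min(d, math.dist(p, points[idx]))
                   for p, d in zip(points, nearest)]
    return centers


def clustering_radius(points, centers):
    """Largest distance from any point to its closest center."""
    return max(min(math.dist(p, c) for c in centers) for p in points)


if __name__ == "__main__":
    readings = [(0, 0), (0, 1), (10, 0), (10, 1), (5, 20)]
    reps = farthest_first_centers(readings, 3)
    print(reps, clustering_radius(readings, reps))
```

In this sketch a single distant reading such as `(5, 20)` is always promoted to a center, which is exactly the sensitivity that an outlier-aware method is meant to avoid.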

Published In

SensorKDD '09: Proceedings of the Third International Workshop on Knowledge Discovery from Sensor Data
June 2009
150 pages
ISBN:9781605586687
DOI:10.1145/1601966
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

KDD09
Cited By

  • (2023) Fair k-Center Problem with Outliers on Massive Data. Tsinghua Science and Technology 28:6 (1072-1084). DOI: 10.26599/TST.2023.9010013. Online publication date: Dec-2023
  • (2022) Distributed Fair k-Center Clustering Problems with Outliers. Parallel and Distributed Computing, Applications and Technologies (430-440). DOI: 10.1007/978-3-030-96772-7_39. Online publication date: 16-Mar-2022
  • (2018) Overview of Efficient Clustering Methods for High-Dimensional Big Data Streams. Clustering Methods for Big Data Analytics (25-42). DOI: 10.1007/978-3-319-97864-2_2. Online publication date: 28-Oct-2018
  • (2017) Using internal evaluation measures to validate the quality of diverse stream clustering algorithms. Vietnam Journal of Computer Science 4:3 (171-183). DOI: 10.1007/s40595-016-0086-9. Online publication date: 1-Aug-2017
  • (2015) Fast distributed k-center clustering with outliers on massive data. Proceedings of the 29th International Conference on Neural Information Processing Systems - Volume 1 (1063-1071). DOI: 10.5555/2969239.2969358. Online publication date: 7-Dec-2015
  • (2015) Subspace clustering of data streams. Journal of Intelligent Information Systems 45:3 (319-335). DOI: 10.1007/s10844-014-0319-2. Online publication date: 1-Dec-2015
  • (2014) Efficient Streaming Detection of Hidden Clusters in Big Data Using Subspace Stream Clustering. Database Systems for Advanced Applications (146-160). DOI: 10.1007/978-3-662-43984-5_11. Online publication date: 11-Jul-2014
  • (2013) Outlier Detection Based On Similar Flocking Model In Wireless Sensor Networks. International Journal on Smart Sensing and Intelligent Systems 6:1 (18-37). DOI: 10.21307/ijssis-2017-526. Online publication date: 20-Feb-2013
  • (2013) Effective Evaluation Measures for Subspace Clustering of Data Streams. Revised Selected Papers of PAKDD 2013 International Workshops on Trends and Applications in Knowledge Discovery and Data Mining - Volume 7867 (342-353). DOI: 10.1007/978-3-642-40319-4_30. Online publication date: 14-Apr-2013
  • (2012) Density-Based projected clustering of data streams. Proceedings of the 6th international conference on Scalable Uncertainty Management (311-324). DOI: 10.1007/978-3-642-33362-0_24. Online publication date: 17-Sep-2012
