Skip to main content
Log in

Resilient k-d trees: k-means in space revisited

  • Research Article
  • Published:
Frontiers of Computer Science Aims and scope Submit manuscript

Abstract

We propose a k-d tree variant that is resilient to a pre-described number of memory corruptions while still using only linear space. While the data structure is of independent interest, we demonstrate its use in the context of high-radiation environments. Our experimental evaluation demonstrates that the resulting approach leads to a significantly higher resiliency rate compared to previous results. This is especially the case for large-scale multi-spectral satellite data, which renders the proposed approach well-suited to operate aboard today’s satellites.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  1. Castano R, Mazzoni D, Tang N, Doggett T, Chien S, Greeley R, Cichy B, Davies A. Learning classifiers for science event detection in remote sensing imagery. In: Proceedings of the 8th International Symposium on Artificial Intelligence, Robotics and Automation in Space. 2005

  2. Wagstaff K L, Bornstein B. K-means in space: a radiation sensitivity evaluation. In: Proceedings of the 26th International Conference on Machine Learning. 2009, 1097–1104

  3. Wagstaff K L, Bornstein B. How much memory radiation protection do onboard machine learning algorithms require? In: Proceedings of the IJCAI-09/SMC-IT-09/IWPSS-09 Workshop on Artificial Intelligence in Space. 2009

  4. May T C, Woods M H. Alpha-particle-induced soft errors in dynamic memories. IEEE Transactions on Electron Devices, 1979, 26(1): 2–9

    Article  Google Scholar 

  5. Kopetz H. Mitigation of transient faults at the system level — the TTA approach. In: Proceedings of the 2nd Workshop on System Effects of Logic Soft Errors, 2006

  6. Finocchi I, Grandoni F, Italiano G F. Designing reliable algorithms in unreliable memories. Computer Science Review, 2007, 1(2): 77–87

    Article  Google Scholar 

  7. MacQueen J B. Some methods for classification and analysis of multivariate observations. In: Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability. 1967, 281–297

  8. Vattani A. k-means requires exponentially many iterations even in the plane. In: Proceedings of the 25th Annual Symposium on Computational Geometry. 2009, 324–332

  9. Arthur D, Manthey B, Röglin H. k-means has polynomial smoothed complexity. In: Proceedings of the 50th Annual IEEE Symposium on Foundations of Computer Science. 2009, 405–414

  10. Ding C, He X. k-means clustering via principal component analysis. In: Proceedings of the 21st International Conference on Machine Learning. 2004, 225–232

  11. Elkan C. Using the triangle inequality to accelerate k-means. In: Proceedings of the 20th International Conference on Machine Learning. 2003, 147–153

  12. Frahling G, Sohler C. A fast k-means implementation using coresets. International Journal of Computational Geometry and Applications, 2008, 18(6): 605–625

    Article  MathSciNet  MATH  Google Scholar 

  13. Jin R, Goswami A, Agrawal G. Fast and exact out-of-core and distributed k-means clustering. Knowledge and Information Systems, 2006, 10(1): 17–40

    Article  Google Scholar 

  14. Kanungo T, Mount D M, Netanyahu N S, Piatko C D, Silverman R, Wu A Y. An efficient k-means clustering algorithm: analysis and implementation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002, 24(7): 881–892

    Article  Google Scholar 

  15. Sakuma J, Kobayashi S. Large-scale k-means clustering with usercentric privacy-preservation. Knowledge and Information Systems, 2010, 25(2): 253–279

    Article  Google Scholar 

  16. Bentley J L. Multidimensional binary search trees used for associative searching. Communications of the ACM, 1975, 18(9): 509–517

    Article  MathSciNet  MATH  Google Scholar 

  17. de Berg M, Cheong O, van Kreveld M, Overmars M. Computational Geometry: Algorithms and Applications. 3rd ed. Santa Clara: Springer-Verlag, 2008

    MATH  Google Scholar 

  18. Dickerson M, Duncan C A, Goodrich M T. k-d trees are better when cut on the longest side. In: Proceedings of the 8th Annual European Symposium. 2000, 179–190

  19. Brodal G S, Fagerberg R, Finocchi I, Grandoni F, Italiano G, Jørgensen A G, Moruz G, Mølhave T. Optimal resilient dynamic dictionaries. In: Proceedings of the 15th Annual European Symposium on Algorithms. 2007, 347–358

  20. Finocchi I, Grandoni F, Italiano G F. Optimal resilient sorting and searching in the presence of dynamic memory faults. Theoretical Computer Science, 2009, 410(44): 4457–4470

    Article  MathSciNet  MATH  Google Scholar 

  21. Jørgensen A G, Moruz G, Mølhave T. Priority queues resilient to memory faults. In: Proceedings of the 10th International Workshop on Algorithms and Data Structure. 2007, 127–138

  22. Petrillo U F, Finocchi I, Italiano G F. The price of resiliency: a case study on sorting with memory faults. In: Proceedings of the 14th Annual European Symposium on Algorithms. 2006, 768–779

  23. Caminiti S, Finocchi I, Fusco E G. Local dependency dynamic programming in the presence of memory faults. In: Proceedings of the 28th International Symposium on Theoretical Aspects of Computer Science. 2011, 45–56

  24. Brodal G S, Jørgensen A G, Moruz G, Mølhave T. Counting in the presence of memory faults. In: Proceedings of the 20th Annual International Symposium on Algorithms and Computation. 2009, 842–851

  25. Brodal G S, Jørgensen A G, Mølhave T. Fault tolerant external memory algorithms. In: Proceedings of the 11th International Symposium on Algorithms and Data Structures. 2009, 411–422

  26. Boyer R S, Moore J S. MJRTY: a fast majority vote algorithm. In: Automated Reasoning: Essays in Honor of Woody Bledsoe. 1991, 105–118

  27. Bose P, Maheshwari A, Morin P, Morrison J, Smid M, Vahrenhold J. Space-efficient geometric divide-and-conquer algorithms. Computational Geometry: Theory and Applications, 2007, 37(3): 209–227

    Article  MathSciNet  MATH  Google Scholar 

  28. Frank A, Asuncion A. UCI machine learning repository. 2010, http://archive.ics.uci.edu/ml

  29. Hubert L, Arabie P. Comparing partitions. Journal of Classification, 1985, 2(1): 193–218

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Fabian Gieseke.

Additional information

Fabian Gieseke received his Diplomas in Mathematics and Computer Science in 2006 fromthe University of Münster, Germany. He is currently working towards his PhD in the field of machine learning. His research interests include various related topics including support vector machines and its extensions to semi- and unsupervised learning settings as well as applications of machine learning techniques to problems in the field of astronomy.

Gabriel Moruz received his PhD in Computer Science from the University of Aarhus in 2007. Since 2007 he has been a post doctoral researcher in the group of Prof. Ulrich Meyer (Chair for Algorithm Engineering) at the Goethe University, Frankfurt am Main. His primary research interest concerns the design of efficient algorithms and data structures in practice. This involves designing and implementing algorithms that are aware of various hardware issues that have a great impact in practice. These include cacheaware and cache-oblivious algorithms, streaming algorithms, algorithms performing few branch mispredictions, as well as resilient algorithms, i.e., algorithms aware of memory corruption.

Jan Vahrenhold received his Diploma in Mathematics and his PhD and Habilitation in Computer Science from the University of Münster, Germany, in 1996, 1999, and 2004, respectively. Since 2006, he has been an Associate Professor with the Faculty of Computer Science, Technische Universität Dortmund, Germany. His current research interests include I/O-and cache-efficient algorithms and data structures, computational geometry, and computer science education.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Gieseke, F., Moruz, G. & Vahrenhold, J. Resilient k-d trees: k-means in space revisited. Front. Comput. Sci. 6, 166–178 (2012). https://doi.org/10.1007/s11704-012-2870-8

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11704-012-2870-8

Keywords

Navigation