Abstract
We propose a k-d tree variant that is resilient to a prespecified number of memory corruptions while still using only linear space. While the data structure is of independent interest, we demonstrate its use in the context of high-radiation environments. Our experimental evaluation demonstrates that the resulting approach leads to a significantly higher resiliency rate than previous results. This is especially the case for large-scale multi-spectral satellite data, which makes the proposed approach well suited to operation aboard today’s satellites.
Author information
Fabian Gieseke received his Diplomas in Mathematics and Computer Science in 2006 from the University of Münster, Germany. He is currently working towards his PhD in the field of machine learning. His research interests include support vector machines and their extensions to semi-supervised and unsupervised learning settings, as well as applications of machine learning techniques to problems in astronomy.
Gabriel Moruz received his PhD in Computer Science from the University of Aarhus in 2007. Since 2007 he has been a postdoctoral researcher in the group of Prof. Ulrich Meyer (Chair for Algorithm Engineering) at the Goethe University, Frankfurt am Main. His primary research interest concerns the design of efficient algorithms and data structures in practice. This involves designing and implementing algorithms that are aware of various hardware issues that have a great impact in practice. These include cache-aware and cache-oblivious algorithms, streaming algorithms, algorithms performing few branch mispredictions, as well as resilient algorithms, i.e., algorithms aware of memory corruption.
Jan Vahrenhold received his Diploma in Mathematics and his PhD and Habilitation in Computer Science from the University of Münster, Germany, in 1996, 1999, and 2004, respectively. Since 2006, he has been an Associate Professor with the Faculty of Computer Science, Technische Universität Dortmund, Germany. His current research interests include I/O- and cache-efficient algorithms and data structures, computational geometry, and computer science education.
About this article
Cite this article
Gieseke, F., Moruz, G. & Vahrenhold, J. Resilient k-d trees: k-means in space revisited. Front. Comput. Sci. 6, 166–178 (2012). https://doi.org/10.1007/s11704-012-2870-8
DOI: https://doi.org/10.1007/s11704-012-2870-8