A fault-tolerant computing method for Xdraw parallel algorithm

Dou, Wanfeng; Li, Yanan

doi:10.1007/s11227-018-2321-x

Published: 17 March 2018

Volume 74, pages 2776–2800, (2018)

The Journal of Supercomputing

Wanfeng Dou¹ &
Yanan Li¹

Abstract

Viewshed analysis is widely used in spatial analysis applications. For large-scale terrain data, however, viewshed computation remains expensive in both time and space, so parallel computing techniques have been introduced to improve its performance. A failure in a parallel computing system with many computing nodes or processors may increase the execution time and cost of a viewshed computation. Highly fault-tolerant parallel computing can greatly enhance the reliability of the algorithm without sacrificing its performance. In this article, we present a fault-tolerant computing framework for parallel viewshed computation that uses a redundancy computing strategy. Two scheduling strategies, layer scheduling and axis-direction scheduling, are adopted for the primary process and the slave process, respectively, to check whether errors occur during the computation. When an error is found by comparing the results of the primary process with those of its slave process, a rollback and re-computation procedure corrects it. The fault-tolerant algorithm is implemented using process-level and thread-level parallelization. Our method makes full use of the multiple processors provided by the parallel computing environment without reducing the computational efficiency of the algorithm. To illustrate the usefulness of our approach, several experiments are carried out using the Xdraw viewshed algorithm. The results show that our approach achieves a speedup ratio of 14.91 with 16 processes and an average precision rate of 99.4% in comparison with a simple checkpoint Xdraw algorithm.
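The compare-and-recompute idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and it is not the Xdraw algorithm: a toy one-dimensional line-of-sight test stands in for the viewshed kernel, the two passes merely mimic a primary and a redundant computation that reach the same answer by different routes, and every name here (`visible_forward`, `visible_pairwise`, `compute_with_check`, `flip_once`, `max_retries`) is hypothetical. Process-level and thread-level parallelism are left out entirely.

```python
from typing import Callable, List, Optional, Tuple


def visible_forward(elev: List[float], eye: float) -> List[bool]:
    """Primary pass: one sweep tracking the steepest slope seen so far.

    Assumes a non-empty profile with the observer standing at cell 0.
    """
    out, steepest = [True], float("-inf")
    for i in range(1, len(elev)):
        slope = (elev[i] - eye) / i
        out.append(slope >= steepest)
        steepest = max(steepest, slope)
    return out


def visible_pairwise(elev: List[float], eye: float) -> List[bool]:
    """Redundant pass: same answer, obtained by checking every nearer cell."""
    out = [True]
    for i in range(1, len(elev)):
        slope = (elev[i] - eye) / i
        out.append(all(slope >= (elev[j] - eye) / j for j in range(1, i)))
    return out


def flip_once(result: List[bool], attempt: int) -> List[bool]:
    """Hypothetical fault model: corrupt the primary result on attempt 0 only."""
    if attempt == 0:
        return [not result[0]] + result[1:]
    return result


def compute_with_check(
    elev: List[float],
    eye: float,
    inject_fault: Optional[Callable[[List[bool], int], List[bool]]] = None,
    max_retries: int = 3,
) -> Tuple[List[bool], int]:
    """Run both passes, compare them, and recompute from the checkpoint on mismatch.

    Returns the agreed result and the number of recomputations that were needed.
    """
    checkpoint = list(elev)  # saved input state that every retry rolls back to
    for attempt in range(max_retries + 1):
        primary = visible_forward(list(checkpoint), eye)
        if inject_fault is not None:
            primary = inject_fault(primary, attempt)
        replica = visible_pairwise(list(checkpoint), eye)
        if primary == replica:
            return primary, attempt
    raise RuntimeError("primary and redundant results still disagree")
```

Calling `compute_with_check([0, 1, 1, 3, 2], 0.0, inject_fault=flip_once)` detects the corrupted first attempt, rolls back to the saved profile, and returns the agreed visibility list after one recomputation. Comparing two independently scheduled passes only detects a fault that affects one of them, which is why the sketch bounds the retries and raises if the results never agree.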

Acknowledgements

This work was partly supported by the National Natural Science Foundation of China (No. 41771411). We also thank the reviewers for their pertinent comments, which helped improve the quality of the paper.

Author information

Authors and Affiliations

School of Computer Science and Technology, Nanjing Normal University, Jiangsu Intelligent Information Technology and Software Engineering Laboratory, Nanjing, 210023, Jiangsu, China
Wanfeng Dou & Yanan Li

Corresponding author

Correspondence to Wanfeng Dou.

About this article

Cite this article

Dou, W., Li, Y. A fault-tolerant computing method for Xdraw parallel algorithm. J Supercomput 74, 2776–2800 (2018). https://doi.org/10.1007/s11227-018-2321-x

Published: 17 March 2018
Issue Date: June 2018
DOI: https://doi.org/10.1007/s11227-018-2321-x
