AsyIter: tolerating computational skew of synchronous iterative applications via computing decomposition

Zhang, Yu; Liao, Xiaofei; Jin, Hai; Zhou, Bing Bing

doi:10.1007/s10115-014-0748-9

Regular Paper
Published: 06 May 2014

Knowledge and Information Systems, Volume 41, pages 379–400 (2014)

Yu Zhang¹,
Xiaofei Liao¹,
Hai Jin¹ &
…
Bing Bing Zhou²

Abstract

Iterative computing is pervasive in web applications, data mining, and scientific computing. Many parallel algorithms for such applications are synchronous: they require strict synchronization between iterations to remain correct, which makes their performance sensitive to computational skew within each iteration. Current load-balancing approaches can alleviate the effect of computational skew but cannot eliminate it. As a result, for many applications the skew in each iteration persists and accumulates across iterations, seriously lengthening completion time. In this paper, we propose an effective approach that lets synchronous iterative applications themselves tolerate the negative effects of unresolved computational skew. The approach divides the large computational task on a computing node or worker into a number of sub-tasks, each of which depends only on the states of a few objects from the previous iteration. Sub-tasks of subsequent iterations can therefore proceed in advance, as soon as the states of the data objects they depend on are available. Consequently, the idle time caused by strict synchronization is reduced and overall performance improves. Experimental results show that this approach improves overall performance by up to \(2.45\times\) in comparison with state-of-the-art approaches.
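
The idea of letting sub-tasks start as soon as their own inputs are ready can be illustrated with a small timing model. The sketch below is not the AsyIter implementation; it is a toy simulation under stated assumptions: one worker per data object, a ring-shaped dependency pattern, an averaging update rule, and made-up per-iteration costs in which a different object is slow each time. The names `run_barrier`, `run_decomposed`, `deps`, and `cost` are hypothetical and chosen only for this illustration.

```python
from typing import Callable, List, Sequence, Tuple

Update = Callable[[Sequence[float]], float]


def run_barrier(state: List[float], deps: List[List[int]], update: Update,
                cost: List[List[float]]) -> Tuple[List[float], float]:
    """Strictly synchronous schedule: iteration t starts only after
    every sub-task of iteration t-1 has finished."""
    clock = 0.0
    for step_cost in cost:
        state = [update([state[j] for j in deps[i]]) for i in range(len(state))]
        clock += max(step_cost)  # everyone waits for the slowest sub-task
    return state, clock


def run_decomposed(state: List[float], deps: List[List[int]], update: Update,
                   cost: List[List[float]]) -> Tuple[List[float], float]:
    """Dependency-driven schedule: sub-task i of iteration t starts once
    its own previous sub-task and the sub-tasks it reads from are done."""
    n = len(state)
    finish = [0.0] * n  # completion time of each object's latest sub-task
    for step_cost in cost:
        # Values still come from the previous iteration, so results match
        # the synchronous schedule exactly.
        state = [update([state[j] for j in deps[i]]) for i in range(n)]
        finish = [
            max([finish[i]] + [finish[j] for j in deps[i]]) + step_cost[i]
            for i in range(n)
        ]
    return state, max(finish)


def mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


if __name__ == "__main__":
    n = 6
    # Assumed ring topology: each object reads itself and its two neighbours.
    deps = [[(i - 1) % n, i, (i + 1) % n] for i in range(n)]
    # Assumed skew: in iteration t, object (2*t) % n costs 3 units, others 1.
    cost = [[3.0 if i == (2 * t) % n else 1.0 for i in range(n)] for t in range(3)]
    start = [float(i) for i in range(n)]

    sync_state, sync_time = run_barrier(start, deps, mean, cost)
    dec_state, dec_time = run_decomposed(start, deps, mean, cost)
    print("same result:", sync_state == dec_state)
    print("barrier time:", sync_time, "decomposed time:", dec_time)
```

With these illustrative costs the barrier schedule takes 9 time units and the dependency-driven schedule takes 7, while both produce identical states. The saving comes from sub-tasks that do not read the slow object starting their next iteration early, so skew in one iteration is partly hidden rather than added in full to the total.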

Acknowledgments

This work was supported by the National High-tech Research and Development Program of China (863 Program) under Grant No. 2012AA010905, the China National Natural Science Foundation under Grant Nos. 61322210 and 61272408, the Doctoral Fund of Ministry of Education of China under Grant No. 20130142110048, and the Natural Science Foundation of Hubei under Grant No. 2012FFA007.

Author information

Authors and Affiliations

Service Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China
Yu Zhang, Xiaofei Liao & Hai Jin
School of Information Technologies, The University of Sydney, Sydney, NSW, 2006, Australia
Bing Bing Zhou

Corresponding author

Correspondence to Hai Jin.

About this article

Cite this article

Zhang, Y., Liao, X., Jin, H. et al. AsyIter: tolerating computational skew of synchronous iterative applications via computing decomposition. Knowl Inf Syst 41, 379–400 (2014). https://doi.org/10.1007/s10115-014-0748-9

Received: 05 May 2013
Revised: 06 January 2014
Accepted: 19 April 2014
Published: 06 May 2014
Issue Date: November 2014
DOI: https://doi.org/10.1007/s10115-014-0748-9
