DOI: 10.1145/2822332.2822337
research-article

Co-sites: the autonomous distributed dataflows in collaborative scientific discovery

Published: 15 November 2015

Abstract

Online "big data" processing applications have become increasingly important in the high performance computing domain, including the online analysis of large volumes of data output by scientific applications.
This work addresses the question of how to enable efficient collaborative science in the face of unpredictable analytics workloads and dynamically varying resource availability. It proposes Co-Sites, a solution that performs online resource management at each site participating in an online collaboration, including geographically distributed sites separated by large distances. Co-Sites operates by having each site observe its local progress and make its own decisions, both to better utilize local resources and to maintain acceptable rates of global progress. Co-Sites further enriches these distributed dataflows to permit just-in-time data sharing, better leveraging collaborators' diverse domain expertise.
Experiments with a combustion workflow demonstrate that Co-Sites achieves (i) improved end-to-end completion times, (ii) good scalability, and (iii) low data-sharing latencies.
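The per-site operating principle described above (each site observing only its local progress and autonomously adapting its resource allocation, without global coordination) can be illustrated with a minimal sketch. All names here (`Site`, `target_rate`, the scaling thresholds) are hypothetical illustrations, not identifiers from the paper:

```python
# Hypothetical sketch of a per-site control loop in the spirit of the
# abstract: a site tracks its local processing rate and scales its own
# workers to keep pace with a rate needed for acceptable global progress.
# The class, fields, and thresholds are illustrative assumptions.

class Site:
    def __init__(self, name, workers, target_rate):
        self.name = name
        self.workers = workers          # locally managed analytics workers
        self.target_rate = target_rate  # items/sec needed for global progress
        self.processed = 0
        self.elapsed = 0.0

    def observe(self, items_done, seconds):
        """Record local progress for one monitoring interval."""
        self.processed += items_done
        self.elapsed += seconds

    def local_rate(self):
        """Observed local throughput in items/sec."""
        return self.processed / self.elapsed if self.elapsed else 0.0

    def adapt(self):
        """Autonomous local decision: scale workers so the observed rate
        tracks the target, with no coordination across sites."""
        rate = self.local_rate()
        if rate < 0.9 * self.target_rate:
            self.workers += 1           # falling behind: add a worker
        elif rate > 1.2 * self.target_rate and self.workers > 1:
            self.workers -= 1           # over-provisioned: release a worker
        return self.workers

site = Site("site-a", workers=4, target_rate=100.0)
site.observe(items_done=400, seconds=5.0)   # observed 80 items/sec
print(site.adapt())                         # below target, so scales up
```

The design choice this mirrors is purely local feedback: each site needs only its own measurements and a shared rate target, which is what makes the sites autonomous yet collectively able to sustain global progress.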


Cited By

  • (2016) Picky: Efficient and Reproducible Sharing of Large Datasets Using Merkle-Trees. 2016 IEEE 24th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS), pp. 30-38. DOI: 10.1109/MASCOTS.2016.25. Online publication date: Sep 2016.

Published In

WORKS '15: Proceedings of the 10th Workshop on Workflows in Support of Large-Scale Science
November 2015
98 pages
ISBN:9781450339896
DOI:10.1145/2822332
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].


Publisher

Association for Computing Machinery

New York, NY, United States



Conference

SC15

Acceptance Rates

WORKS '15 Paper Acceptance Rate 9 of 13 submissions, 69%;
Overall Acceptance Rate 30 of 54 submissions, 56%
