DOI: 10.1145/2792745.2792779
research-article

Integrating Apache Spark into PBS-Based HPC Environments

Published: 26 July 2015

Abstract

This paper describes an effort at the University of Tennessee's National Institute for Computational Sciences (NICS) to integrate Apache Spark into the widely used TORQUE HPC batch environment. The similarities and differences between the execution of a Spark program and that of an MPI program on a cluster are used to motivate how to implement Spark/TORQUE integration. An implementation of this integration, pbs-spark-submit, is described, including demonstrations of functionality on two HPC clusters and a large shared-memory system.

Published In

XSEDE '15: Proceedings of the 2015 XSEDE Conference: Scientific Advancements Enabled by Enhanced Cyberinfrastructure
July 2015
296 pages
ISBN:9781450337205
DOI:10.1145/2792745

Sponsors

  • San Diego Super Computing Ctr
  • HPCWire
  • Omnibond: Omnibond Systems, LLC
  • SGI
  • Internet2
  • Indiana University
  • CASC: The Coalition for Academic Scientific Computation
  • NICS: National Institute for Computational Sciences
  • Intel
  • DDN: DataDirect Networks, Inc
  • DELL
  • CORSA: CORSA Technology
  • ALLINEA: Allinea Software
  • Cray
  • RENCI: Renaissance Computing Institute

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. NICS
  2. PBS
  3. TORQUE
  4. Apache Spark
  5. batch processing
  6. data analytics

Qualifiers

  • Research-article

Funding Sources

  • National Science Foundation

Acceptance Rates

XSEDE '15 paper acceptance rate: 49 of 70 submissions (70%)
Overall acceptance rate: 129 of 190 submissions (68%)

Cited By

  • (2019) Integration of Apache Spark with Invasive Resource Manager. 2019 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computing, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI), pages 1553-1560. DOI: 10.1109/SmartWorld-UIC-ATC-SCALCOM-IOP-SCI.2019.00279. Online publication date: Aug-2019
  • (2019) Strategies to Deploy and Scale Deep Learning on the Summit Supercomputer. 2019 IEEE/ACM Third Workshop on Deep Learning on Supercomputers (DLS), pages 84-94. DOI: 10.1109/DLS49591.2019.00016. Online publication date: Nov-2019
  • (2018) How to Make Profit. Proceedings of the 8th International Workshop on Runtime and Operating Systems for Supercomputers, pages 1-9. DOI: 10.1145/3217189.3217193. Online publication date: 12-Jun-2018
  • (2017) Spark on the ARC. Practice and Experience in Advanced Research Computing 2017: Sustainability, Success and Impact, pages 1-6. DOI: 10.1145/3093338.3093375. Online publication date: 9-Jul-2017
  • (2016) Scholar. Proceedings of the Workshop on Education for High Performance Computing, pages 25-31. DOI: 10.5555/3018088.3018093. Online publication date: 13-Nov-2016
  • (2016) Scholar: A Campus HPC Resource to Enable Computational Literacy. 2016 Workshop on Education for High-Performance Computing (EduHPC), pages 25-31. DOI: 10.1109/EduHPC.2016.009. Online publication date: Nov-2016
  • (2016) HPC-reuse. Proceedings of the 16th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, pages 342-345. DOI: 10.1109/CCGrid.2016.72. Online publication date: 16-May-2016
  • (2016) Real-Time Discovery Services over Large, Heterogeneous and Complex Healthcare Datasets Using Schema-Less, Column-Oriented Methods. 2016 IEEE Second International Conference on Big Data Computing Service and Applications (BigDataService), pages 257-264. DOI: 10.1109/BigDataService.2016.29. Online publication date: Mar-2016
  • (2016) On-demand data analytics in HPC environments at leadership computing facilities: Challenges and experiences. 2016 IEEE International Conference on Big Data (Big Data), pages 2087-2096. DOI: 10.1109/BigData.2016.7840835. Online publication date: Dec-2016
