Abstract
Processing a workflow (or a job) created by a user, who may be a researcher from a scientific laboratory or an analyst from a commercial organization, is the main functionality that a data center or a high-performance computing center is generally expected to provide. A sufficiently small problem can be handled with a single-core processor and a rather small amount of memory, whereas a complicated problem may require thousands of nodes to solve and petabytes of storage for its output. Users also generally require specific applications on various platforms to solve their problems appropriately. Consequently, a data center has to operate heterogeneous resource management systems, so-called batch systems, which leads to inefficient resource utilization because user activity is stochastic. Virtualization for resource management, e.g. cloud computing, is one of the promising solutions that have emerged recently; however, it increases the complexity of both the system itself and its administration, because it places a virtualization stack, e.g. a hypervisor, between the operating system and the applications. In this paper, we propose a new conceptual design to be implemented as a pre-scheduler capable of inserting user-submitted jobs dedicated to a specific batch system into available resources managed by other kinds of batch systems. The proposed design features transparency between clients and batch systems, accuracy in monitoring and predicting the available resources, and scalability to additional batch systems. We present an implementation example of the conceptual design based on a scenario drawn from our experience of operating a data center.
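The pre-scheduling idea summarized above can be illustrated with a minimal Python sketch. It is not the implementation proposed in the paper: the `BatchSystem` and `PreScheduler` classes, the `route` method, the slot-counting model, and the rule of picking the system with the most free slots are all assumptions made purely for illustration. The sketch only shows the general flow, in which a job addressed to one batch system is placed on another when the intended one has no free resources.

```python
from dataclasses import dataclass, field


@dataclass
class BatchSystem:
    """Hypothetical stand-in for one batch system and its monitored free slots."""
    name: str
    free_slots: int
    running: list = field(default_factory=list)
    pending: list = field(default_factory=list)


class PreScheduler:
    """Illustrative pre-scheduler that sits between clients and batch systems."""

    def __init__(self):
        self.systems = {}

    def register(self, system):
        # Supporting an additional batch system only requires registering it.
        self.systems[system.name] = system

    def route(self, job_id, target, slots):
        """Place a job and return the name of the batch system that received it."""
        preferred = self.systems[target]
        if preferred.free_slots >= slots:
            chosen = preferred
        else:
            # Assumed policy: fall back to the other system with the most free slots.
            candidates = [
                s for s in self.systems.values()
                if s.name != target and s.free_slots >= slots
            ]
            if not candidates:
                # Nothing is free anywhere: the job waits in its intended queue.
                preferred.pending.append(job_id)
                return preferred.name
            chosen = max(candidates, key=lambda s: s.free_slots)
        chosen.free_slots -= slots
        chosen.running.append(job_id)
        return chosen.name
```

In this sketch the client always names its intended batch system and never has to know where the job actually ran, which loosely mirrors the transparency property described above. A real pre-scheduler would obtain the free-slot figures from monitoring and prediction rather than from a static counter.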
Acknowledgments
This research was supported by the National Research Foundation of Korea (NRF) through the contract N-15-NM-CR01-S01.
Cite this article
Ahn, S.U., Kim, J. A Conceptual Design of Job Pre-processing Flow for Heterogeneous Batch Systems in Data Center. Wireless Pers Commun 89, 847–861 (2016). https://doi.org/10.1007/s11277-016-3224-x
DOI: https://doi.org/10.1007/s11277-016-3224-x