Abstract
LIMITLESS is a lightweight and scalable framework that provides a holistic view of the system by combining platform and application monitoring. This paper presents a novel feature that improves the scheduling process through performance prediction and the detection of interference between real applications. The feature uses malleable synthetic benchmark clones (proxies) of the applications executed in the system with two objectives: (1) to build large and representative datasets for training the machine learning algorithms used for prediction, and (2) to evaluate whether two applications can share the same compute node in order to leverage unused node resources.
Other related works use detailed micro-architecture-independent metrics obtained from functional simulators, which are hard to generate for many new applications. The resulting proxies preserve many of the original features of the applications (control flow, memory access pattern, etc.), and their code must be obfuscated to prevent reverse engineering. LIMITLESS instead generates application proxies from general-purpose performance information collected through monitoring. As a consequence, other methods may reproduce execution behaviour more accurately. However, LIMITLESS' proxies exhibit performance similar to that of the applications without extracting data from the binaries and without handling code or data from the applications, and they can be shared securely because they are not generated from any piece of the original code.
LIMITLESS executes the generated proxies offline. Each execution enlarges the datasets used by the machine learning algorithms to improve application scheduling. In addition, proxies are executed in combination to detect performance degradation (interference) without waiting for the real applications to run, which depends on the users. In this work, we evaluate the proposed proxy generation approach on a set of benchmarks and applications. We compare the performance obtained during the execution of the proxies and of the applications to show their similarity. Finally, we include an evaluation of interference detection using this approach. As far as we know, this is the first work that uses malleable proxies.
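The interference check described above can be illustrated with a minimal sketch. It is not the LIMITLESS implementation: the function names, the 10% slowdown threshold, and the timing values are all hypothetical, and the sketch assumes that each proxy has been timed once alone and once while sharing a node with another proxy.

```python
from itertools import combinations


def slowdown(solo_time, shared_time):
    """Relative slowdown of a node-sharing run compared with a solo run."""
    if solo_time <= 0:
        raise ValueError("solo_time must be positive")
    return shared_time / solo_time - 1.0


def detect_interference(solo_times, shared_times, threshold=0.10):
    """Return {(a, b): worst_slowdown} for proxy pairs that interfere.

    solo_times:   {proxy_name: runtime when running alone}
    shared_times: {(a, b): (runtime_of_a, runtime_of_b)} with a < b,
                  measured while both proxies share one node
    threshold:    hypothetical cut-off; a pair is flagged when either
                  proxy slows down by more than this fraction
    """
    interfering = {}
    for a, b in combinations(sorted(solo_times), 2):
        if (a, b) not in shared_times:
            continue  # this pair was never co-executed
        time_a, time_b = shared_times[(a, b)]
        worst = max(slowdown(solo_times[a], time_a),
                    slowdown(solo_times[b], time_b))
        if worst > threshold:
            interfering[(a, b)] = worst
    return interfering


# Hypothetical runtimes in seconds, for illustration only.
solo = {"proxy_a": 100.0, "proxy_b": 50.0, "proxy_c": 80.0}
shared = {
    ("proxy_a", "proxy_b"): (130.0, 52.0),
    ("proxy_a", "proxy_c"): (102.0, 81.0),
}
flagged = detect_interference(solo, shared)
```

With these made-up numbers only the pair (proxy_a, proxy_b) is flagged, because proxy_a runs about 30% slower when it shares the node. A scheduler could use such a result to avoid placing the corresponding real applications on the same node.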
This work has been partially funded by the European High-Performance Computing Joint Undertaking (JU) under the ADMIRE project (grant agreement No 956748) and by the Spanish Ministry of Science and Innovation project DECIDE (Ref. PID2019-107858GB-I00).
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Cascajo, A., Singh, D.E., Carretero, J. (2022). Detecting Interference Between Applications and Improving the Scheduling Using Malleable Application Proxies. In: Anzt, H., Bienz, A., Luszczek, P., Baboulin, M. (eds) High Performance Computing. ISC High Performance 2022 International Workshops. ISC High Performance 2022. Lecture Notes in Computer Science, vol 13387. Springer, Cham. https://doi.org/10.1007/978-3-031-23220-6_9
DOI: https://doi.org/10.1007/978-3-031-23220-6_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-23219-0
Online ISBN: 978-3-031-23220-6
eBook Packages: Computer Science, Computer Science (R0)