Part of the book series: High-Performance Computing Series (HPC, volume 1)

Abstract

This chapter describes the design and implementation of the mOS multi-kernel project at Intel Corp. The multi-Operating System (mOS) for High-Performance Computing (HPC) combines a Linux kernel and a lightweight kernel (LWK) to provide both the required Linux functionality and the scalability and performance of an LWK. In this chapter, we explain the thought process that led to the current design of mOS. We highlight the difficulties of running two kernels on the compute nodes of a supercomputer while maintaining Linux compatibility and tracking recent Linux kernel developments. We also show how analyzing these sometimes conflicting goals helped us make design and implementation decisions.

Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries. Linux is the registered trademark of Linus Torvalds in the U.S. and other countries.

*Other names and brands may be claimed as the property of others.

Acknowledgements

This project is a direct result of the work by the current mOS team: John Attinella, Sharath Bhat, Jai Dayal, David van Dresser, Tom Musta, Rolf Riesen, Lance Shuler, Andrew Tauferner, and Robert W. Wisniewski. It has also been influenced and shaped by many other people. Conversations, feedback, advice, and review of our work helped make mOS what it is today. People who provided guidance include Michael Blocksome, Todd Inglett, Pardo Keppel, Jim Dinan, Keith Underwood, Joe Robichaux, Ulf Hannebutte, Thomas Spelce, and Philippe Thierry.

We had many interactions with the IHK/McKernel team and greatly benefited from being able to use early prototypes of IHK/McKernel. We thank Yutaka Ishikawa, Balazs Gerofi, and Masamichi Takagi.

Evan Powers, Steven T. Hampson, and Kurt Alstrup worked on the first prototype of mOS. Kurt created the first scheduler and greatly reduced noise. Ravi Murty was very much involved in early mOS architecture discussions and helped to create an initial list of requirements.

We thank Andi Kleen and Ramakrishna (Rama) Karedla for their help and suggestions with BIOS settings and Linux boot command options, and Andi for help understanding how Linux works.

James Cownie had the idea to collect progress threads on a single logical CPU by making that CPU the default for all newly created threads that do not specifically request a CPU. Eric Barton and Jeff Hammond participated in thread scheduling and placement discussions and provided insight into the needs of MPI, SHMEM, and high-performance I/O.

Ralph Castain helped refine the Linux-side requirements.

A large number of supercomputing OS experts helped refine the characteristics and definition of an LWK. We thank Ron Brightwell, Kurt Ferreira, Kamil Iskra, Larry Kaplan, Mike Lang, Jack Lange, David Lombard, Arthur B. (Barney) Maccabe, Yoonho Park, and Kevin Pedretti.

Michael H. O’Hara managed the implementation team for the first year and helped organize getting the first prototype off the ground. Mike Julier took over and continued to drive the implementation team toward v0.1 of mOS.

We have been working closely with Balazs Gerofi and thank him for his valuable input and for helping us understand IHK/McKernel better.

Optimization Notice: Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations, and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit http://www.intel.com/performance.

Author information

Corresponding author

Correspondence to Rolf Riesen.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Riesen, R., Wisniewski, R.W. (2019). mOS for HPC. In: Gerofi, B., Ishikawa, Y., Riesen, R., Wisniewski, R.W. (eds) Operating Systems for Supercomputers and High Performance Computing. High-Performance Computing Series, vol 1. Springer, Singapore. https://doi.org/10.1007/978-981-13-6624-6_18

  • DOI: https://doi.org/10.1007/978-981-13-6624-6_18

  • Published:

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-6623-9

  • Online ISBN: 978-981-13-6624-6

  • eBook Packages: Computer Science, Computer Science (R0)
