research-article

K2: A Mobile Operating System for Heterogeneous Coherence Domains

Published: 08 June 2015

Abstract

Mobile systems-on-chip (SoCs) that incorporate heterogeneous coherence domains promise high energy efficiency for a wide range of mobile applications, yet are difficult to program. To exploit the architecture, a desirable yet missing capability is to replicate operating system (OS) services over multiple coherence domains with minimal inter-domain communication. In designing such an OS, we set three goals: to ease application development, to simplify OS engineering, and to preserve the current OS performance. To this end, we identify a shared-most OS model for multiple coherence domains: creating per-domain instances of core OS services with no shared state, while enabling other extended OS services to share state across domains. To test the model, we build K2, a prototype OS on the TI OMAP4 SoC, by reusing most of the Linux 3.4 source. K2 presents a single system image to applications with its two kernels running on top of the two coherence domains of OMAP4. The two kernels have independent instances of core OS services, such as page allocation and interrupt management, as coordinated by K2; the two kernels share most extended OS services, such as device drivers, whose state is kept coherent transparently by K2. Despite platform constraints and unoptimized code, K2 improves energy efficiency for light OS workloads by 8x-10x, while incurring less than 9% performance overhead for two device drivers shared between kernels. Our experiences with K2 show that the shared-most model is promising.
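The split the abstract describes, private core services per domain and shared extended services, can be illustrated with a toy model. The sketch below is not K2's code: the class names, the domain labels `full_power` and `low_power`, and the single-owner hand-off used to keep the shared state coherent are all assumptions chosen for illustration. The abstract does not say which coherence mechanism K2 uses.

```python
from dataclasses import dataclass, field


@dataclass
class PageAllocator:
    """Core service: each kernel gets its own instance, with no shared state."""
    free_pages: list

    def alloc(self):
        # Hand out one page from this domain's private pool.
        return self.free_pages.pop()


@dataclass
class SharedService:
    """Extended service (e.g. a device driver) with one logical state."""
    state: dict = field(default_factory=dict)
    owner: str = "full_power"  # domain whose view of the state is current
    transfers: int = 0         # count of inter-domain ownership hand-offs


@dataclass
class Kernel:
    domain: str
    allocator: PageAllocator
    shared: SharedService

    def _acquire(self):
        # Hypothetical coherence step: take ownership before touching state.
        if self.shared.owner != self.domain:
            self.shared.owner = self.domain
            self.shared.transfers += 1

    def driver_write(self, key, value):
        self._acquire()
        self.shared.state[key] = value

    def driver_read(self, key):
        self._acquire()
        return self.shared.state[key]


def boot():
    """Create two kernels: disjoint page pools, one shared driver state."""
    shared = SharedService()
    full_power = Kernel("full_power", PageAllocator(list(range(0, 4))), shared)
    low_power = Kernel("low_power", PageAllocator(list(range(4, 8))), shared)
    return full_power, low_power
```

In this toy model, page allocation never crosses domains, while each switch of the domain using the driver costs one hand-off. That is the kind of inter-domain communication the abstract says should be kept to a minimum.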

      Reviews

      Bayard Kohlhepp

      Power management is the unsung hero of mobile computing and the limiting frontier of future mobile technology. Improvements in energy usage will expand our horizons past mobile through wearable and on to implantable technology, but first we have to lose the battery. The K2 operating system (OS) is an incremental advance in reduced energy consumption.

      New mobile system-on-chips (SoCs) are divided into full-power and reduced-power sections ("coherence domains") in order to consume less power, but assigning software to the proper domain is difficult. So are communication and movement between those domains as applications change state. K2 simplifies the programming of multiple coherence domains while preserving performance. According to the authors, overhead on their TI OMAP4 SoC was kept to a six percent penalty while light workloads were energy-optimized by a factor of ten (read the paper for specifics).

      The authors created this new OS with three goals in mind: ease application development, simplify OS engineering, and preserve performance. They chose a shared-most model where kernel services are replicated in each coherence domain. This means kernel services look the same to programmers regardless of assigned domain, effectively making domains invisible. Programmers don't have to code energy awareness. K2 works like any old Linux system with only a minor performance hit, satisfying all of the authors' goals.

      K2 was supported by a National Science Foundation (NSF) CAREER Award and is available for download from www.k2os.org. Work continues on K2 as the underlying SoC architectures keep advancing.

      Online Computing Reviews Service

      • Published in

        ACM Transactions on Computer Systems  Volume 33, Issue 2
        June 2015
        86 pages
        ISSN:0734-2071
        EISSN:1557-7333
        DOI:10.1145/2785582
        Copyright © 2015 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 8 June 2015
        • Revised: 1 November 2014
        • Received: 1 November 2014
        • Accepted: 1 November 2014

        Qualifiers

        • research-article
        • Research
        • Refereed
