DOI: 10.1145/3240302.3240318
research-article

AWGR-based optical processor-to-memory communication for low-latency, low-energy vault accesses

Published: 01 October 2018

ABSTRACT

Memory cubes (MCs), following the general concept of Micron's Hybrid Memory Cube (HMC), represent a promising memory architecture for high scalability and energy efficiency due to their partitioned 3D-stacked DRAM, high-speed serial links, abstract packet-switched interface, and on-die switching fabric connecting the host processor with the MC's different partitions ('vaults'). While previous studies have shown that implementing processor-to-MC links with silicon-photonic (SiP) integrated optical links offers higher energy efficiency and bandwidth density, they keep the electrical switching fabric inside the MC die and perform signal conversion prior to routing packets across the switch.

We believe that the technological limitations of electrical interconnects in terms of energy consumption, the large size of MC dies, the high radix of the on-die switch, and the bandwidth demands will all ultimately turn the on-MC switching fabric into a critical issue in terms of energy and latency. Using an integrated optical switching fabric alleviates all of these issues and allows the host processor to directly communicate with the MC vaults by exploiting wavelength routing, thereby eliminating the need for electrical switch traversal. In particular, we propose to use Arrayed Waveguide Grating Routers (AWGRs) which offer a compact SiP switching fabric with a connectivity pattern that is ideal as the on-MC switch. Our simulation results show that exploiting AWGRs and direct processor-to-vault communication reduces both MC access energy and latency by up to 40% (on average) on PARSEC/SPLASH-2 workloads.
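The wavelength routing mentioned above can be illustrated with a minimal sketch. It assumes the commonly used cyclic AWGR convention, in which light entering input port `i` on wavelength index `w` leaves at output port `(i + w) mod N`; the port count, the function names, and the mapping of ports to processor links or vaults are hypothetical and are not taken from the paper.

```python
def awgr_output_port(input_port: int, wavelength: int, n_ports: int) -> int:
    """Output port reached from input_port on the given wavelength index.

    Assumes the cyclic convention output = (input + wavelength) mod N.
    Real devices may use a different but equivalent port numbering.
    """
    if not (0 <= input_port < n_ports and 0 <= wavelength < n_ports):
        raise ValueError("port and wavelength indices must lie in [0, n_ports)")
    return (input_port + wavelength) % n_ports


def wavelength_for(input_port: int, output_port: int, n_ports: int) -> int:
    """Wavelength index a sender must use to reach output_port from input_port."""
    if not (0 <= input_port < n_ports and 0 <= output_port < n_ports):
        raise ValueError("port indices must lie in [0, n_ports)")
    return (output_port - input_port) % n_ports


def routing_table(n_ports: int) -> list[list[int]]:
    """table[i][o] is the wavelength index that connects input i to output o."""
    return [
        [wavelength_for(i, o, n_ports) for o in range(n_ports)]
        for i in range(n_ports)
    ]


if __name__ == "__main__":
    # Hypothetical 4x4 router: each row is a sender, each column a destination.
    for row in routing_table(4):
        print(row)
```

Under this convention every row and every column of the table contains each wavelength exactly once, so each sender can reach every destination simultaneously without contention inside the router, and the destination is selected purely by the choice of wavelength rather than by an electrical switch.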

Published in

MEMSYS '18: Proceedings of the International Symposium on Memory Systems
October 2018
361 pages
ISBN: 9781450364751
DOI: 10.1145/3240302

Copyright © 2018 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States
