DOI: 10.1145/3400302.3415636
research-article

A many-core accelerator design for on-chip deep reinforcement learning

Published: 17 December 2020

ABSTRACT

Deep Reinforcement Learning (DRL) is highly resource-intensive and requires large-scale distributed computing nodes to learn complicated tasks such as playing video games and Go. This work attempts to scale a distributed DRL system down into a specialized many-core chip and achieve energy-efficient on-chip DRL. Using a customized Network-on-Chip that handles the communication of on-chip data and control signals, we propose a Synchronous Asynchronous RL Architecture (SARLA) and a corresponding many-core chip that completely avoid the unnecessary data duplication and synchronization activities of multi-node RL systems. In our evaluation, the SARLA system achieves a considerable energy-efficiency boost over GPU-based implementations for typical DRL workloads built with OpenAI Gym.


        Published in

        ICCAD '20: Proceedings of the 39th International Conference on Computer-Aided Design
        November 2020
        1396 pages
        ISBN: 9781450380263
        DOI: 10.1145/3400302
        General Chair: Yuan Xie

        Copyright © 2020 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        Overall Acceptance Rate: 457 of 1,762 submissions, 26%
