skip to main content
10.1145/3173162.3173205acmconferencesArticle/Chapter ViewAbstractPublication PagesasplosConference Proceedingsconference-collections
research-article

Bridge the Gap between Neural Networks and Neuromorphic Hardware with a Neural Network Compiler

Published:19 March 2018Publication History

ABSTRACT

Different from developing neural networks (NNs) for general-purpose processors, the development for NN chips usually faces with some hardware-specific restrictions, such as limited precision of network signals and parameters, constrained computation scale, and limited types of non-linear functions. This paper proposes a general methodology to address the challenges. We decouple the NN applications from the target hardware by introducing a compiler that can transform an existing trained, unrestricted NN into an equivalent network that meets the given hardware's constraints. We propose multiple techniques to make the transformation adaptable to different kinds of NN chips, and reliable for restrict hardware constraints. We have built such a software tool that supports both spiking neural networks (SNNs) and traditional artificial neural networks (ANNs). We have demonstrated its effectiveness with a fabricated neuromorphic chip and a processing-in-memory (PIM) design. Tests show that the inference error caused by this solution is insignificant and the transformation time is much shorter than the retraining time. Also, we have studied the parameter-sensitivity evaluations to explore the tradeoffs between network error and resource utilization for different transformation strategies, which could provide insights for co-design optimization of neuromorphic hardware and software.

References

  1. Mart'ın Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Gregory S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian J. Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Józefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mané, Rajat Monga, Sherry Moore, Derek Gordon Murray, Chris Olah, Mike Schuster, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul A. Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda B. Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2016. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems. CoRR Vol. abs/1603.04467 (2016). http://arxiv.org/abs/1603.04467Google ScholarGoogle Scholar
  2. A. Agarwal, E. Akchurin, and C. Basoglu. 2014. An introduction to computational networks and the computational network toolkit. (2014).Google ScholarGoogle Scholar
  3. Filipp Akopyan, Jun Sawada, Andrew Cassidy, Rodrigo Alvarez-Icaza, John Arthur, Paul Merolla, Nabil Imam, Yutaka Nakamura, Pallab Datta, Gi-Joon Nam, Brian Taba, Michael Beakes, Bernard Brezzo, Jente B Kuang, Rajit Manohar, William P Risk, Bryan Jackson, and Dharmendra S Modha. 2015. Truenorth: Design and tool flow of a 65 mw 1 million neuron programmable neurosynaptic chip. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems Vol. 34, 10 (2015), 1537--1557.Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Rami Al-Rfou, Guillaume Alain, Amjad Almahairi, Christof Angermüller, Dzmitry Bahdanau, Nicolas Ballas, Frédéric Bastien, Justin Bayer, Anatoly Belikov, Alexander Belopolsky, Yoshua Bengio, Arnaud Bergeron, James Bergstra, Valentin Bisson, Josh Bleecher Snyder, Nicolas Bouchard, Nicolas Boulanger-Lewandowski, Xavier Bouthillier, Alexandre de Brébisson, Olivier Breuleux, Pierre Luc Carrier, Kyunghyun Cho, Jan Chorowski, Paul Christiano, Tim Cooijmans, Marc-Alexandre Côté, Myriam Côté, Aaron C. Courville, Yann N. Dauphin, Olivier Delalleau, Julien Demouth, Guillaume Desjardins, Sander Dieleman, Laurent Dinh, Melanie Ducoffe, Vincent Dumoulin, Samira Ebrahimi Kahou, Dumitru Erhan, Ziye Fan, Orhan Firat, Mathieu Germain, Xavier Glorot, Ian J. Goodfellow, Matthew Graham, cCaglar Gülccehre, Philippe Hamel, Iban Harlouchet, Jean-Philippe Heng, Balázs Hidasi, Sina Honari, Arjun Jain, Sébastien Jean, Kai Jia, Mikhail Korobov, Vivek Kulkarni, Alex Lamb, Pascal Lamblin, Eric Larsen, César Laurent, Sean Lee, Simon Lefranccois, Simon Lemieux, Nicholas Léonard, Zhouhan Lin, Jesse A. Livezey, Cory Lorenz, Jeremiah Lowin, Qianli Ma, Pierre-Antoine Manzagol, Olivier Mastropietro, Robert McGibbon, Roland Memisevic, Bart van Merriënboer, Vincent Michalski, Mehdi Mirza, Alberto Orlandi, Christopher Joseph Pal, Razvan Pascanu, Mohammad Pezeshki, Colin Raffel, Daniel Renshaw, Matthew Rocklin, Adriana Romero, Markus Roth, Peter Sadowski, John Salvatier, Franccois Savard, Jan Schlüter, John Schulman, Gabriel Schwartz, Iulian Vlad Serban, Dmitriy Serdyuk, Samira Shabanian, Étienne Simon, Sigurd Spieckermann, S. Ramana Subramanyam, Jakub Sygnowski, Jérémie Tanguay, Gijs van Tulder, Joseph P. Turian, Sebastian Urban, Pascal Vincent, Francesco Visin, Harm de Vries, David Warde-Farley, Dustin J. Webb, Matthew Willson, Kelvin Xu, Lijun Xue, Li Yao, Saizheng Zhang, and Ying Zhang. 2016. Theano: A Python framework for fast computation of mathematical expressions. CoRR Vol. abs/1605.02688 (2016). http://arxiv.org/abs/1605.02688Google ScholarGoogle Scholar
  5. Arnon Amir, Pallab Datta, William P Risk, Andrew S Cassidy, Jeffrey A Kusnitz, Steve K Esser, Alexander Andreopoulos, Theodore M Wong, Myron Flickner, Rodrigo Alvarez-Icaza, Emmett McQuinn, Ben Shaw, Norm Pass, and Dharmendra S Modha. 2013. Cognitive computing programming paradigm: a corelet language for composing networks of neurosynaptic cores. In Neural Networks (IJCNN), The 2013 International Joint Conference on. IEEE, 1--10.Google ScholarGoogle Scholar
  6. Sajid Anwar, Kyuyeon Hwang, and Wonyong Sung. 2015. Fixed point optimization of deep convolutional neural networks for object recognition. In Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on. IEEE, 1131--1135.Google ScholarGoogle ScholarCross RefCross Ref
  7. Ben Varkey Benjamin, Peiran Gao, Emmett McQuinn, Swadesh Choudhary, Anand R Chandrasekaran, Jean-Marie Bussat, Rodrigo Alvarez-Icaza, John V Arthur, Paul A Merolla, and Kwabena Boahen. 2014. Neurogrid: A mixed-analog-digital multichip system for large-scale neural simulations. Proc. IEEE Vol. 102, 5 (2014), 699--716.Google ScholarGoogle ScholarCross RefCross Ref
  8. Mahdi Nazm Bojnordi and Engin Ipek. 2016. Memristive Boltzmann machine: A hardware accelerator for combinatorial optimization and deep learning. In High Performance Computer Architecture (HPCA), 2016 IEEE International Symposium on. 1--13.Google ScholarGoogle ScholarCross RefCross Ref
  9. Snaider Carrillo, Jim Harkin, Liam J McDaid, Fearghal Morgan, Sandeep Pande, Seamus Cawley, and Brian McGinley. 2013. Scalable hierarchical network-on-chip architecture for spiking neural network hardware implementations. IEEE Transactions on Parallel and Distributed Systems Vol. 24, 12 (2013), 2451--2461. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Andrew S Cassidy, Paul Merolla, John V Arthur, Steve K Esser, Bryan Jackson, Rodrigo Alvarez-Icaza, Pallab Datta, Jun Sawada, Theodore M Wong, Vitaly Feldman, Arnon Amir, Daniel Ben-Dayan Rubin, Filipp Akopyan, Emmett McQuinn, William P Risk, and Dharmendra S Modha. 2013. Cognitive computing building block: A versatile and efficient digital neuron model for neurosynaptic cores. In Neural Networks (IJCNN), The 2013 International Joint Conference on. IEEE, 1--10.Google ScholarGoogle Scholar
  11. Lukas Cavigelli and Luca Benini. 2016. A 803 gop/s/w convolutional network accelerator. IEEE Transactions on Circuits and Systems for Video Technology (2016).Google ScholarGoogle Scholar
  12. Tianshi Chen, Zidong Du, Ninghui Sun, Jia Wang, Chengyong Wu, Yunji Chen, and Olivier Temam. 2014 a. Diannao: A small-footprint high-throughput accelerator for ubiquitous machine-learning. In ACM Sigplan Notices, Vol. Vol. 49. ACM, 269--284. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, and Zheng Zhang. 2015 b. Mxnet: A flexible and efficient machine learning library for heterogeneous distributed systems. arXiv preprint arXiv:1512.01274.Google ScholarGoogle Scholar
  14. Wenlin Chen, James Wilson, Stephen Tyree, Kilian Weinberger, and Yixin Chen. 2015 c. Compressing neural networks with the hashing trick International Conference on Machine Learning. 2285--2294. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Yunji Chen, Tao Luo, Shaoli Liu, Shijin Zhang, Liqiang He, Jia Wang, Ling Li, Tianshi Chen, Zhiwei Xu, Ninghui Sun, and Olivier Teman. 2014 b. Dadiannao: A machine-learning supercomputer. In Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture. IEEE Computer Society, 609--622. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Y. H. Chen, T. Krishna, J. Emer, and V. Sze. 2016. 14.5 Eyeriss: An energy-efficient reconfigurable accelerator for deep convolutional neural networks. In 2016 IEEE International Solid-State Circuits Conference (ISSCC). 262--263.Google ScholarGoogle Scholar
  17. Zhen Chen, Bin Gao, Zheng Zhou, Peng Huang, Haitong Li, and Wenjia Ma. 2015 a. Optimized learning scheme for grayscale image recognition in a RRAM based analog neuromorphic system. In Electron Devices Meeting (IEDM), 2015 IEEE International. IEEE.Google ScholarGoogle ScholarCross RefCross Ref
  18. Ping Chi, Shuangchen Li, Cong Xu, Tao Zhang, Jishen Zhao, Yongpan Liu, Yu Wang, and Yuan Xie. 2016. Prime: A novel processing-in-memory architecture for neural network computation in reram-based main memory. In Proceedings of the 43rd International Symposium on Computer Architecture. IEEE Press, 27--39. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Ronan Collobert, Koray Kavukcuoglu, and Clement Farabet. 2011. Torch7: A Matlab-like Environment for Machine Learning neural information processing systems.Google ScholarGoogle Scholar
  20. Misha Denil, Babak Shakibi, Laurent Dinh, Marctextquotesingle Aurelio Ranzato, and Nando de Freitas. 2013. Predicting Parameters in Deep Learning. In Advances in Neural Information Processing Systems 26, bibfieldeditorC. J. C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K. Q. Weinberger (Eds.). Curran Associates, Inc., 2148--2156. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Zidong Du, Robert Fasthuber, Tianshi Chen, Paolo Ienne, Ling Li, Tao Luo, Xiaobing Feng, Yunji Chen, and Olivier Temam. 2015. ShiDianNao: Shifting vision processing closer to the sensor ACM SIGARCH Computer Architecture News, Vol. Vol. 43. ACM, 92--104. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Steven K Esser, Paul A Merolla, John V Arthur, Andrew S Cassidy, Rathinakumar Appuswamy, Alexander Andreopoulos, David J Berg, Jeffrey L McKinstry, Timothy Melano, Davis R Barch, Carmelo di Nolfo, Pallab Datta, Arnon Amir, Brian Taba, Myron D Flickner, and Dharmendra S Modha. 2016. Convolutional networks for fast, energy-efficient neuromorphic computing. Proceedings of the National Academy of Sciences (2016), 201604850.Google ScholarGoogle ScholarCross RefCross Ref
  23. Clément Farabet, Berin Martini, Benoit Corda, Polina Akselrod, Eugenio Culurciello, and Yann LeCun. 2011. Neuflow: A runtime reconfigurable dataflow processor for vision Computer Vision and Pattern Recognition Workshops (CVPRW), 2011 IEEE Computer Society Conference on. IEEE, 109--116.Google ScholarGoogle Scholar
  24. Clément Farabet, Cyril Poulet, Jefferson Y Han, and Yann LeCun. 2009. Cnp: An fpga-based processor for convolutional networks Field Programmable Logic and Applications, 2009. FPL 2009. International Conference on. IEEE, 32--37.Google ScholarGoogle Scholar
  25. Steve B Furber, David R Lester, Luis A Plana, Jim D Garside, Eustace Painkras, Steve Temple, and Andrew D Brown. 2013. Overview of the spinnaker system architecture. IEEE Trans. Comput. Vol. 62, 12 (2013), 2454--2467. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. Song Han, Junlong Kang, Huizi Mao, Yiming Hu, Xin Li, Yubin Li, Dongliang Xie, Hong Luo, Song Yao, Yu Wang, Huazhong Yang, and William J. Dally. 2016 a. ESE: Efficient Speech Recognition Engine with Compressed LS™ on FPGA. CoRR Vol. abs/1612.00694 (2016). http://arxiv.org/abs/1612.00694 Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Song Han, Xingyu Liu, Huizi Mao, Jing Pu, Ardavan Pedram, Mark A Horowitz, and William J Dally. 2016 b. EIE: efficient inference engine on compressed deep neural network Proceedings of the 43rd International Symposium on Computer Architecture. IEEE Press, 243--254. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Song Han, Huizi Mao, and William J Dally. 2015. Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding. arXiv preprint arXiv:1510.00149 (2015).Google ScholarGoogle Scholar
  29. Bart LM Happel and Jacob MJ Murre. 1994. Design and evolution of modular neural network architectures. Neural networks Vol. 7, 6 (1994), 985--1004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Kurt Hornik, Maxwell Stinchcombe, and Halbert White. 1989. Multilayer feedforward networks are universal approximators. Neural networks Vol. 2, 5 (1989), 359--366. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. Miao Hu, Hai Li, Yiran Chen, Qing Wu, and Garrett S Rose. 2013. BSB training scheme implementation on memristor-based circuit Computational Intelligence for Security and Defense Applications (CISDA), 2013 IEEE Symposium on. IEEE, 80--87.Google ScholarGoogle Scholar
  32. Miao Hu, John Paul Strachan, Zhiyong Li, Emmanuelle M Grafals, Noraica Davila, Catherine Graves, Sity Lam, Ning Ge, Jianhua Joshua Yang, and R Stanley Williams. 2016. Dot-product engine for neuromorphic computing: programming 1T1M crossbar to accelerate matrix-vector multiplication. In Design Automation Conference (DAC), 2016 53nd ACM/EDAC/IEEE. IEEE, 1--6. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Eric Hunsberger and Chris Eliasmith. 2016. Training Spiking Deep Networks for Neuromorphic Hardware. CoRR Vol. abs/1611.05141 (2016). http://arxiv.org/abs/1611.05141Google ScholarGoogle Scholar
  34. Kyuyeon Hwang and Wonyong Sung. 2014. Fixed-point feedforward deep neural network design using weightsGoogle ScholarGoogle Scholar
  35. 1, 0, and- 1 Signal Processing Systems (SiPS), 2014 IEEE Workshop on. IEEE, 1--6.Google ScholarGoogle Scholar
  36. YU Ji, Youhui Zhang, ShuangChen Li, Ping Chi, CiHang Jiang, Peng Qu, Yuan Xie, and WenGuang Chen. 2016. NEUTRAMS: Neural Network Transformation and Co-design under Neuromorphic Hardware Constraints. In Microarchitecture (MICRO), 2016 49th Annual IEEE/ACM International Symposium on. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev, Jonathan Long, Ross Girshick, Sergio Guadarrama, and Trevor Darrell. 2014. Caffe: Convolutional architecture for fast feature embedding Proceedings of the 22nd ACM international conference on Multimedia. ACM, 675--678. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. Norman P. Jouppi, Cliff Young, Nishant Patil, David Patterson, Gaurav Agrawal, Raminder Bajwa, Sarah Bates, Suresh Bhatia, Nan Boden, Al Borchers, Rick Boyle, Pierre-luc Cantin, Clifford Chao, Chris Clark, Jeremy Coriell, Mike Daley, Matt Dau, Jeffrey Dean, Ben Gelb, Tara Vazir Ghaemmaghami, Rajendra Gottipati, William Gulland, Robert Hagmann, Richard C. Ho, Doug Hogberg, John Hu, Robert Hundt, Dan Hurt, Julian Ibarz, Aaron Jaffey, Alek Jaworski, Alexander Kaplan, Harshit Khaitan, Andy Koch, Naveen Kumar, Steve Lacy, James Laudon, James Law, Diemthu Le, Chris Leary, Zhuyuan Liu, Kyle Lucke, Alan Lundin, Gordon MacKean, Adriana Maggiore, Maire Mahony, Kieran Miller, Rahul Nagarajan, Ravi Narayanaswami, Ray Ni, Kathy Nix, Thomas Norrie, Mark Omernick, Narayana Penukonda, Andy Phelps, Jonathan Ross, Amir Salek, Emad Samadiani, Chris Severn, Gregory Sizikov, Matthew Snelham, Jed Souter, Dan Steinberg, Andy Swing, Mercedes Tan, Gregory Thorson, Bo Tian, Horia Toma, Erick Tuttle, Vijay Vasudevan, Richard Walter, Walter Wang, Eric Wilcox, and Doe Hyun Yoon. 2017. In-Datacenter Performance Analysis of a Tensor Processing Unit. CoRR Vol. abs/1704.04760. http://arxiv.org/abs/1704.04760 Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. Patrick Judd, Jorge Albericio, Tayler Hetherington, Tor M Aamodt, and Andreas Moshovos. 2016. Stripes: Bit-serial deep neural network computing. In Microarchitecture (MICRO), 2016 49th Annual IEEE/ACM International Symposium on. IEEE, 1--12. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Duckhwan Kim, Jaeha Kung, Sek Chai, Sudhakar Yalamanchili, and Saibal Mukhopadhyay. 2016. Neurocube: A programmable digital neuromorphic architecture with high-density 3D memory. In Computer Architecture (ISCA), 2016 ACM/IEEE 43rd Annual International Symposium on. IEEE, 380--392. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. Yongtae Kim, Yong Zhang, and Peng Li. 2015. A Reconfigurable Digital Neuromorphic Processor with Memristive Synaptic Crossbar for Cognitive Computing. J. Emerg. Technol. Comput. Syst. Vol. 11, 4, Article bibinfoarticleno38 (April. 2015), 25 pages. dl.acm.org/citation.cfm?id=1373936.1373969 Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. Chen Zhang, Peng Li, Guangyu Sun, Yijin Guan, Bingjun Xiao, and Jason Cong. 2015. Optimizing fpga-based accelerator design for deep convolutional neural networks Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. ACM, 161--170. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. Shijin Zhang, Zidong Du, Lei Zhang, Huiying Lan, Shaoli Liu, Ling Li, Qi Guo, Tianshi Chen, and Yunji Chen. 2016. Cambricon-X: An accelerator for sparse neural networks Microarchitecture (MICRO), 2016 49th Annual IEEE/ACM International Symposium on. IEEE, 1--12. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Bridge the Gap between Neural Networks and Neuromorphic Hardware with a Neural Network Compiler

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      ASPLOS '18: Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems
      March 2018
      827 pages
      ISBN:9781450349116
      DOI:10.1145/3173162
      • cover image ACM SIGPLAN Notices
        ACM SIGPLAN Notices  Volume 53, Issue 2
        ASPLOS '18
        February 2018
        809 pages
        ISSN:0362-1340
        EISSN:1558-1160
        DOI:10.1145/3296957
        Issue’s Table of Contents

      Copyright © 2018 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 19 March 2018

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • research-article

      Acceptance Rates

      ASPLOS '18 Paper Acceptance Rate56of319submissions,18%Overall Acceptance Rate535of2,713submissions,20%

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader