In-Memory Computing Architectures for Big Data and Machine Learning Applications

  • Conference paper
  • In: Future Data and Security Engineering. Big Data, Security and Privacy, Smart City and Industry 4.0 Applications (FDSE 2022)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1688)


Abstract

Traditional computing hardware struggles to meet the extensive computational load imposed by rapidly growing Machine Learning (ML) and Artificial Intelligence algorithms, such as Deep Neural Networks, and by Big Data workloads. To obtain hardware solutions that satisfy the low-latency and high-throughput requirements of these algorithms, non-Von Neumann computing architectures such as In-Memory Computing (IMC) have been extensively researched and experimented with over the last five years. This study analyses and reviews works designed to accelerate Machine Learning tasks. We investigate different architectural aspects and directions and provide our comparative evaluations. We further discuss the challenges and limitations of IMC research and present possible future directions.
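
The central mechanism behind most IMC accelerators for ML is performing matrix-vector multiplication directly inside a memory array (for example a resistive crossbar), so the weights never travel across the Von Neumann memory bus. As a minimal illustrative sketch, not taken from the paper, the following Python model assumes an idealized crossbar: trained weights are mapped onto device conductances, inputs are applied as row voltages, and the resulting column currents realize the multiply-accumulate in place; the conductance and voltage ranges are assumptions chosen only for illustration.

```python
import numpy as np

# Illustrative sketch only (not from the paper): an idealized memristive
# crossbar performing an analog matrix-vector multiplication in memory.
# Weights live in the array as device conductances, inputs arrive as row
# voltages, and Ohm's/Kirchhoff's laws make each column current a
# multiply-accumulate result: I = V @ G.

rng = np.random.default_rng(seed=0)

# Signed weights of one neural-network layer (4 inputs, 3 outputs),
# mapped onto a differential pair of conductance columns so that
# G_plus - G_minus encodes the sign. The device range is an assumption.
weights = rng.uniform(-1.0, 1.0, size=(4, 3))
g_min, g_max = 1e-6, 100e-6                      # siemens (assumed range)
g_plus = g_min + np.clip(weights, 0.0, None) * (g_max - g_min)
g_minus = g_min + np.clip(-weights, 0.0, None) * (g_max - g_min)

# Input activations applied as read voltages on the rows (assumed 0..0.2 V).
v_in = rng.uniform(0.0, 0.2, size=4)

# Column currents: every multiply-accumulate happens inside the array,
# so no weight data is moved to a separate processing unit.
i_out = v_in @ g_plus - v_in @ g_minus           # amperes

# Digital reference computation, for comparison.
assert np.allclose(i_out, v_in @ (g_plus - g_minus))
print("Column currents (A):", i_out)
```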


Acknowledgements

The authors gratefully acknowledge financial support from DST/INT/Czech/P-12/2019, reg. no. LTAIN19176.

Author information

Corresponding authors

Correspondence to Václav Snášel or Lingping Kong.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Snášel, V., Dang, T.K., Pham, P.N.H., Küng, J., Kong, L. (2022). In-Memory Computing Architectures for Big Data and Machine Learning Applications. In: Dang, T.K., Küng, J., Chung, T.M. (eds) Future Data and Security Engineering. Big Data, Security and Privacy, Smart City and Industry 4.0 Applications. FDSE 2022. Communications in Computer and Information Science, vol 1688. Springer, Singapore. https://doi.org/10.1007/978-981-19-8069-5_2

Download citation

  • DOI: https://doi.org/10.1007/978-981-19-8069-5_2

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-19-8068-8

  • Online ISBN: 978-981-19-8069-5

  • eBook Packages: Computer Science (R0)
