DOI: 10.1145/3649476.3658760
Research article · Open access

Efficient Exploration in Edge-Friendly Hyperdimensional Reinforcement Learning

Published: 12 June 2024

Abstract

Integrating deep learning with Reinforcement Learning (RL) yields algorithms that achieve human-like learning in complex, unknown environments through trial and error. Despite these advances, the computational cost of deep learning remains a major drawback. This paper proposes a revamped Q-learning algorithm powered by Hyperdimensional Computing (HDC), targeting more efficient and adaptive exploration. We introduce a solution that leverages model uncertainty to guide agent exploration. Our evaluation shows that the proposed algorithm significantly enhances learning quality and efficiency over previous HDC-based algorithms, earning more than 330 additional reward with only a small computational overhead. It also maintains an edge over DNN-based alternatives, delivering lower runtime costs and better policy learning with up to 6.9× faster learning.
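For readers unfamiliar with how Q-values can be represented hyperdimensionally, the following is a minimal illustrative sketch, not the paper's actual algorithm: a state is encoded into a high-dimensional bipolar hypervector via a random projection, each action keeps one model hypervector, Q(s, a) is a normalized dot product, and TD-style updates bundle the state hypervector into the chosen action's model. All names here (`HDCQAgent`, `encode`, the dimensionality `D = 2048`, the learning rate) are hypothetical choices for the sketch.

```python
import numpy as np

D = 2048  # hypervector dimensionality (illustrative choice)
rng = np.random.default_rng(0)

def encode(state, proj):
    """Random-projection encoding: map a low-dimensional state to a
    bipolar (+1/-1) hypervector."""
    return np.sign(proj @ state + 1e-12)

class HDCQAgent:
    """Sketch of HDC-based Q-learning (hypothetical, not the paper's code).

    Q(s, a) is the normalized dot product between the encoded state and a
    per-action model hypervector; updates bundle the state hypervector,
    scaled by the TD error, into that action's model.
    """
    def __init__(self, state_dim, n_actions, lr=0.05):
        self.proj = rng.standard_normal((D, state_dim))
        self.models = np.zeros((n_actions, D))  # one model hypervector per action
        self.lr = lr

    def q_values(self, state):
        h = encode(state, self.proj)
        return self.models @ h / D

    def update(self, state, action, target):
        h = encode(state, self.proj)
        td_error = target - self.q_values(state)[action]
        self.models[action] += self.lr * td_error * h

# Usage: repeated updates pull Q(s, a) toward the supplied target.
agent = HDCQAgent(state_dim=2, n_actions=2)
s = np.array([0.5, -0.2])
for _ in range(200):
    agent.update(s, action=0, target=1.0)
print(round(float(agent.q_values(s)[0]), 2))  # converges toward the target of 1.0
```

Because the hypervectors are bipolar, the value read-out is a plain dot product and the update is an element-wise add, which is the kind of lightweight arithmetic that makes HDC attractive on edge hardware.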


Cited By

  • Dynamic MAC Protocol for Wireless Spectrum Sharing via Hyperdimensional Self-Learning. IEEE Access 12 (2024), 138519–138534. DOI: 10.1109/ACCESS.2024.3464868

Published In

GLSVLSI '24: Proceedings of the Great Lakes Symposium on VLSI 2024
June 2024
797 pages
ISBN:9798400706059
DOI:10.1145/3649476
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Agent Exploration
  2. Brain-inspired Computing
  3. Hyperdimensional Computing
  4. Reinforcement Learning

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

GLSVLSI '24: Great Lakes Symposium on VLSI 2024
June 12–14, 2024
Clearwater, FL, USA

Acceptance Rates

Overall Acceptance Rate 312 of 1,156 submissions, 27%

Article Metrics

  • Downloads (last 12 months): 345
  • Downloads (last 6 weeks): 58

Reflects downloads up to 03 Mar 2025

