DOI: 10.1145/3649476.3658760
Research article · Open access

Efficient Exploration in Edge-Friendly Hyperdimensional Reinforcement Learning

Published: 12 June 2024

Abstract

Integrating deep learning with Reinforcement Learning (RL) yields algorithms that achieve human-like learning in complex, unknown environments through trial and error. Despite these advances, the computational cost of deep learning remains a major drawback. This paper proposes a revamped Q-learning algorithm powered by Hyperdimensional Computing (HDC), targeting more efficient and adaptive exploration. We introduce a solution that leverages model uncertainty to guide agent exploration. Our evaluation shows that the proposed algorithm significantly enhances learning quality and efficiency over previous HDC-based algorithms, earning more than 330 additional reward with only a small computational overhead. It also maintains an edge over DNN-based alternatives, delivering lower runtime costs and better policy learning with up to 6.9× faster learning.
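For readers unfamiliar with how Q-values can be represented hyperdimensionally, the following is a minimal illustrative sketch, not the paper's actual algorithm: a state is encoded into a high-dimensional bipolar hypervector via a random projection, each action keeps one model hypervector, Q(s, a) is a normalized dot product, and TD-style updates bundle the state hypervector into the chosen action's model. All names here (`HDCQAgent`, `encode`, the dimensionality `D = 2048`, the learning rate) are hypothetical choices for the sketch.

```python
import numpy as np

D = 2048  # hypervector dimensionality (illustrative choice)
rng = np.random.default_rng(0)

def encode(state, proj):
    """Random-projection encoding: map a low-dimensional state to a
    bipolar (+1/-1) hypervector."""
    return np.sign(proj @ state + 1e-12)

class HDCQAgent:
    """Sketch of HDC-based Q-learning (hypothetical, not the paper's code).

    Q(s, a) is the normalized dot product between the encoded state and a
    per-action model hypervector; updates bundle the state hypervector,
    scaled by the TD error, into that action's model.
    """
    def __init__(self, state_dim, n_actions, lr=0.05):
        self.proj = rng.standard_normal((D, state_dim))
        self.models = np.zeros((n_actions, D))  # one model hypervector per action
        self.lr = lr

    def q_values(self, state):
        h = encode(state, self.proj)
        return self.models @ h / D

    def update(self, state, action, target):
        h = encode(state, self.proj)
        td_error = target - self.q_values(state)[action]
        self.models[action] += self.lr * td_error * h

# Usage: repeated updates pull Q(s, a) toward the supplied target.
agent = HDCQAgent(state_dim=2, n_actions=2)
s = np.array([0.5, -0.2])
for _ in range(200):
    agent.update(s, action=0, target=1.0)
print(round(float(agent.q_values(s)[0]), 2))  # converges toward the target of 1.0
```

Because the hypervectors are bipolar, the value read-out is a plain dot product and the update is an element-wise add, which is the kind of lightweight arithmetic that makes HDC attractive on edge hardware.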


Cited By

  • Dynamic MAC Protocol for Wireless Spectrum Sharing via Hyperdimensional Self-Learning. IEEE Access 12 (2024), 138519–138534. DOI: 10.1109/ACCESS.2024.3464868

Published In

GLSVLSI '24: Proceedings of the Great Lakes Symposium on VLSI 2024
June 2024
797 pages
ISBN:9798400706059
DOI:10.1145/3649476
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Agent Exploration
  2. Brain-inspired Computing
  3. Hyperdimensional Computing
  4. Reinforcement Learning

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

GLSVLSI '24: Great Lakes Symposium on VLSI 2024
June 12–14, 2024
Clearwater, FL, USA

Acceptance Rates

Overall Acceptance Rate 312 of 1,156 submissions, 27%

Article Metrics

  • Downloads (last 12 months): 345
  • Downloads (last 6 weeks): 58

Reflects downloads up to 03 Mar 2025

