DOI: 10.1145/3649476.3658799
GLSVLSI Conference Proceedings · Short paper

Jailbreaking Pre-trained Large Language Models Towards Hardware Vulnerability Insertion Ability

Published: 12 June 2024

Abstract

We introduce RTLAttack, the first prompt-based jailbreak framework designed to activate the hardware-attack capabilities of large language models (LLMs). Unlike conventional jailbreak approaches, RTLAttack combines LLM and hardware-security traits, enabling models to perform sensitive tasks such as hardware Trojan insertion persistently. Extensive experiments across 10 prominent LLMs, including Claude and ChatGPT, demonstrate RTLAttack's efficacy, achieving an 88.90% hardware-vulnerability insertion success rate. We further analyze the integrity and usability of LLM-injected vulnerabilities, revealing inherent attack capabilities and potential for harmful application in current LLMs. This study aims to foster a comprehensive understanding of LLM capabilities in LLM-aided hardware design and to drive finer-grained alignment.




        Published In

        GLSVLSI '24: Proceedings of the Great Lakes Symposium on VLSI 2024
        June 2024
        797 pages
        ISBN:9798400706059
        DOI:10.1145/3649476


Publisher

Association for Computing Machinery, New York, NY, United States



        Author Tags

        1. Hardware Security
        2. LLM
        3. LLM Jailbreak
        4. Vulnerability Insertion

        Qualifiers

        • Short-paper
        • Research
        • Refereed limited

        Conference

GLSVLSI '24: Great Lakes Symposium on VLSI 2024
June 12 - 14, 2024
Clearwater, FL, USA

        Acceptance Rates

        Overall Acceptance Rate 312 of 1,156 submissions, 27%


