Abstract
Large language models (LLMs) and instruction learning have proven effective across a range of pre-trained language models (PLMs), such as ChatGPT. However, current prompt learning methods typically apply a single unified template to all instances of a task, and such a template struggles to capture the salient information of individual instances. To integrate semantic attention dynamically at the instance level, we propose ISPrompt, an instance-semantic-aware prompt learning model. Specifically, ISPrompt introduces instance-driven prompts generated from the semantic dependency tree of each input. The model then selects a suitable semantic prompt from a prompt selection pool to drive the prompt-based fine-tuning process. Experimental results show that ISPrompt achieves state-of-the-art performance on few-shot learning tasks, demonstrating that dynamically integrating instance semantics makes it a better knowledge-mining tool for PLMs.
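To make the pipeline described above concrete, the following is a minimal sketch, not the authors' implementation: it assumes spaCy (en_core_web_sm) for the dependency parse, roberta-base as the masked-LM backbone, and an illustrative two-template prompt pool with a sentiment verbalizer; the names dependency_fragment, select_and_score, PROMPT_POOL, and VERBALIZER are hypothetical.

```python
# Sketch of the described pipeline: (1) parse an input into a dependency tree,
# (2) derive an instance-specific prompt fragment from the parse,
# (3) pick a template from a small prompt pool,
# (4) score verbalizer words with a masked language model.
import spacy
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

nlp = spacy.load("en_core_web_sm")
tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")
model.eval()

# Hypothetical prompt pool: each template slots in an instance-specific fragment.
PROMPT_POOL = [
    "{sentence} The {fragment} was {mask}.",
    "{sentence} Overall, {fragment} sounds {mask}.",
]
VERBALIZER = {"positive": "great", "negative": "terrible"}


def dependency_fragment(sentence: str) -> str:
    """Extract a salient fragment (root predicate plus its subject/object) from the parse."""
    doc = nlp(sentence)
    root = next(tok for tok in doc if tok.dep_ == "ROOT")
    args = [t for t in root.children if t.dep_ in ("nsubj", "dobj", "attr")]
    return " ".join(tok.text for tok in sorted(args + [root], key=lambda t: t.i))


def select_and_score(sentence: str) -> dict:
    """Fill each pooled template and keep the one the MLM scores most confidently."""
    fragment = dependency_fragment(sentence)
    best = None
    for template in PROMPT_POOL:
        prompt = template.format(sentence=sentence, fragment=fragment,
                                 mask=tokenizer.mask_token)
        inputs = tokenizer(prompt, return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits
        # Locate the [MASK] position and read logits of the verbalizer words there.
        mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
        scores = {}
        for label, word in VERBALIZER.items():
            word_id = tokenizer.convert_tokens_to_ids(tokenizer.tokenize(" " + word))[0]
            scores[label] = logits[0, mask_pos, word_id].item()
        if best is None or max(scores.values()) > max(best["scores"].values()):
            best = {"prompt": prompt, "scores": scores}
    return best


print(select_and_score("The movie was a complete waste of time."))
```

The selection heuristic here (highest masked-LM logit over the verbalizer) is only a stand-in for the paper's prompt-selection mechanism; in the actual model the chosen semantic prompt drives prompt-based fine-tuning rather than zero-shot scoring.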
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Weng, J., Li, D., Deng, Y., Zhang, J., Hu, Y., Huang, H. (2024). Instance-Aware and Semantic-Guided Prompt for Few-Shot Learning in Large Language Models. In: Luo, B., Cheng, L., Wu, ZG., Li, H., Li, C. (eds) Neural Information Processing. ICONIP 2023. Communications in Computer and Information Science, vol 1966. Springer, Singapore. https://doi.org/10.1007/978-981-99-8148-9_5
DOI: https://doi.org/10.1007/978-981-99-8148-9_5
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8147-2
Online ISBN: 978-981-99-8148-9