Abstract
Prompt tuning takes advantage of large-scale pretrained language models and achieves strong performance while remaining parameter-efficient. However, existing prompt-tuning methods adapt the pretrained language model separately for each specific task and fail to utilize information across tasks, which limits their applicability in complex settings. To address these issues, we propose PromptFusion, a prompt-based multi-task transfer learning approach that learns knowledge from multiple source tasks and incorporates it into the target task at low cost. The proposed approach first learns task-specific prompt parameters to extract information from each task individually; a fusion module then aggregates this information for the target task. Our method is interpretable because it can explain which source tasks are the crucial factors influencing the model's decisions on the target task. We also examine a more effective way to encapsulate information by incorporating parallel adapter modules into the transformer layers, which links our approach to other parameter-efficient transfer learning methods. We empirically evaluate our methods on the GLUE benchmark and a variety of hard NLU tasks. The results show that our approach outperforms full fine-tuning and other parameter-efficient multi-task methods.
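For illustration, the following is a minimal PyTorch sketch of the idea summarized in the abstract: each source task owns a trainable soft prompt, and an attention-style fusion module weights the per-task representations for the target task, with the weights doubling as the interpretability signal. The module name, dimensions, and the exact fusion form are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn


class PromptFusionSketch(nn.Module):
    """Sketch of the PromptFusion idea (assumed, not the official code):
    (1) each source task has its own soft prompt (task-specific parameters),
    (2) a fusion module attends over per-task representations to decide
        which source tasks matter for the target task."""

    def __init__(self, num_tasks: int, prompt_len: int, hidden_dim: int):
        super().__init__()
        # One trainable soft prompt per source task; the backbone is assumed frozen.
        self.task_prompts = nn.Parameter(
            torch.randn(num_tasks, prompt_len, hidden_dim) * 0.02
        )
        # Attention-style fusion: the target query scores each source task.
        self.query = nn.Linear(hidden_dim, hidden_dim)
        self.key = nn.Linear(hidden_dim, hidden_dim)
        self.value = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, target_repr: torch.Tensor, source_reprs: torch.Tensor):
        """target_repr:  (batch, hidden_dim) pooled target-task representation.
        source_reprs: (batch, num_tasks, hidden_dim) outputs obtained by running
        the frozen backbone once per task prompt. Returns the fused representation
        and the per-task fusion weights (the interpretability signal)."""
        q = self.query(target_repr).unsqueeze(1)               # (B, 1, H)
        k = self.key(source_reprs)                             # (B, T, H)
        v = self.value(source_reprs)                           # (B, T, H)
        scores = (q @ k.transpose(1, 2)) / k.size(-1) ** 0.5   # (B, 1, T)
        weights = scores.softmax(dim=-1)                       # which source tasks matter
        fused = (weights @ v).squeeze(1)                       # (B, H)
        return fused, weights.squeeze(1)
```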
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Song, H., He, H., Zhu, Q., Xue, X. (2023). PromptFusion: A Low-Cost Prompt-Based Task Composition for Multi-task Learning. In: Tanveer, M., Agarwal, S., Ozawa, S., Ekbal, A., Jatowt, A. (eds) Neural Information Processing. ICONIP 2022. Lecture Notes in Computer Science, vol 13623. Springer, Cham. https://doi.org/10.1007/978-3-031-30105-6_41
DOI: https://doi.org/10.1007/978-3-031-30105-6_41
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-30104-9
Online ISBN: 978-3-031-30105-6
eBook Packages: Computer Science, Computer Science (R0)