ABSTRACT
With the rapid development of deep learning and natural language processing, more and more systems have adopted deep learning models. However, the large amount of data required for training remains a major bottleneck of deep learning. In the postgraduate thesis oral defense system, teachers and students who share the same research field are still matched by keyword retrieval because only a small amount of data and information is available. In this paper, we propose a two-stage training framework to improve the matching relevance of the system: it first fine-tunes a pre-trained model on the specific downstream data and then applies contrastive learning together with a matching network for self-supervised training. In addition, the framework uses adversarial training to improve the robustness of the model. We evaluate our approach on the dataset of our system, and experimental results demonstrate its effectiveness.
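To make the described second stage concrete, the sketch below shows how an in-batch contrastive objective (in the style of SimCSE) can be combined with FGM-style adversarial perturbation of the embedding layer. It is a minimal illustration only: the `TextEncoder`, the batch construction, and all hyper-parameters are assumptions for demonstration and do not reflect the paper's actual implementation.

```python
# Minimal sketch (PyTorch). All names and hyper-parameters are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextEncoder(nn.Module):
    """Stand-in for a fine-tuned pre-trained encoder (e.g. a BERT-style model)."""
    def __init__(self, vocab_size=30000, dim=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)
        self.proj = nn.Linear(dim, dim)

    def forward(self, token_ids):
        # Mean-pool token embeddings into one sentence vector, then project.
        pooled = self.emb(token_ids).mean(dim=1)
        return self.proj(pooled)

def info_nce(z1, z2, temperature=0.05):
    """SimCSE-style contrastive loss: aligned pairs in the batch are positives,
    all other in-batch pairs serve as negatives."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)

def train_step(encoder, optimizer, ids_a, ids_b, epsilon=1e-2):
    optimizer.zero_grad()
    # 1) Clean contrastive loss and its gradients.
    loss = info_nce(encoder(ids_a), encoder(ids_b))
    loss.backward()
    # 2) FGM-style perturbation: push the embedding table along its gradient.
    grad = encoder.emb.weight.grad
    delta = epsilon * grad / (grad.norm() + 1e-12)
    encoder.emb.weight.data.add_(delta)
    # 3) Adversarial loss on the perturbed embeddings (gradients accumulate).
    adv_loss = info_nce(encoder(ids_a), encoder(ids_b))
    adv_loss.backward()
    encoder.emb.weight.data.sub_(delta)  # restore the original embeddings
    optimizer.step()
    return (loss + adv_loss).item()

# Toy usage: ids_a could encode teacher research-field texts, ids_b student ones.
encoder = TextEncoder()
optimizer = torch.optim.AdamW(encoder.parameters(), lr=3e-5)
ids_a = torch.randint(0, 30000, (8, 16))
ids_b = torch.randint(0, 30000, (8, 16))
print(train_step(encoder, optimizer, ids_a, ids_b))
```

In this sketch the adversarial gradients are simply accumulated onto the clean gradients before the optimizer step, which is the common FGM recipe; the paper may weight or schedule the two losses differently.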