Using Pre-trained Language Model to Enhance Active Learning for Sentence Matching
Abstract
1 Introduction
[Figure 1]
2 Related Work
2.1 Active Learning
2.2 Sentence Matching
2.3 Pre-trained Language Model
3 Preliminaries
3.1 Sentence Matching
3.2 Active Learning
[Algorithm 1]
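For readers without access to Algorithm 1 above, a generic pool-based active-learning loop with entropy-based uncertainty sampling can be sketched as follows. This is a minimal illustration under assumed interfaces (the `oracle` callable, the `fit`/`predict_proba` model API, and the entropy query strategy), not the paper's implementation.

```python
import numpy as np

def active_learning_loop(model, seed_X, seed_y, pool, oracle, rounds=10, batch_size=100):
    """Generic pool-based active learning loop (illustrative sketch only).

    model  -- classifier exposing fit(X, y) and predict_proba(X)
    pool   -- array of unlabeled instances
    oracle -- callable returning gold labels for the selected instances
    """
    X_lab, y_lab = seed_X, seed_y
    for _ in range(rounds):
        model.fit(X_lab, y_lab)                      # retrain on the current labeled set
        probs = model.predict_proba(pool)            # query strategy: predictive entropy
        entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
        picked = np.argsort(-entropy)[:batch_size]   # most uncertain instances first
        X_new, y_new = pool[picked], oracle(pool[picked])   # human annotation step
        X_lab = np.concatenate([X_lab, X_new])
        y_lab = np.concatenate([y_lab, y_new])
        pool = np.delete(pool, picked, axis=0)       # drop newly labeled items from the pool
    return model
```

Each round retrains the model, queries the most informative unlabeled instances, and adds their gold labels to the training set, which is the setting all methods in Section 5 share.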
4 Methodology
4.1 Pre-trained Language Model
[Figure 2]
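As context for how a pre-trained language model is applied to sentence matching, the sketch below scores a single sentence pair with a BERT classifier via Hugging Face Transformers. The checkpoint name, label count, and example pair are assumptions for illustration only, not the configuration used in the paper.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Illustrative sketch: encode a sentence pair with a pre-trained BERT model and
# classify the pair (e.g., entailment / neutral / contradiction for NLI-style data).
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=3)

premise = "A man is playing a guitar."
hypothesis = "A person is making music."
inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True, padding=True)

with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.softmax(logits, dim=-1)   # class probabilities for the sentence pair
print(probs)
```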
4.2 Criteria for Instance Selection
[Figure 3]
4.3 Instance Selection
[Figure 4]
5 Experiments
5.1 Configuration
5.2 Datasets
| Dataset | Training | Validation | Test |
|---|---|---|---|
| SNLI | 549,367 | 9,842 | 9,824 |
| MultiNLI | 392,702 | 9,815 | 9,832 |
| Quora | 384,348 | 10,000 | 10,000 |
| LCQMC | 238,766 | 8,802 | 12,500 |
| BQ | 100,000 | 1,000 | 1,000 |
5.3 Comparisons
5.4 Overall Results
[Figure 5]
| Dataset | Random | Entropy | EGL | DAL | CORE | BADGE | COLD | LM |
|---|---|---|---|---|---|---|---|---|
| SNLI | 77.90 | 79.80 | 77.86 | 79.76 | 79.30 | 79.55 | 78.75 | 80.99 |
| MultiNLI | 67.83 | 70.27 | 66.80 | 70.56 | 69.40 | 69.29 | 68.65 | 71.79 |
| Quora | 79.01 | 80.21 | 77.91 | 79.68 | 80.55 | 80.69 | 79.48 | 81.79 |
| LCQMC | 82.04 | 83.25 | 80.35 | 82.72 | 82.82 | 82.66 | 81.92 | 84.29 |
| BQ | 71.44 | 73.60 | 71.59 | 73.29 | 73.67 | 73.88 | 72.81 | 74.73 |
5.5 Ablation Study
[Figure 6]
| Ent | E+Cov | E+Noi | E+Div | E+All |
|---|---|---|---|---|
| 79.80 | 80.99 | 81.11 | 81.45 | 80.99 |
5.6 Discussion
| Dataset | Ent | E+Noi (original) | E+Noi (improved) | E+All (original) | E+All (improved) |
|---|---|---|---|---|---|
| LCQMC | 83.25 | 83.72 | 83.99 | 84.29 | 84.60 |
| BQ | 73.60 | 73.85 | 74.43 | 74.73 | 75.43 |
[Figure 7]
| Entropy | Uncontext | AE | Topic | Skip | LM |
|---|---|---|---|---|---|
| 79.80 | 80.63 | 80.42 | 80.54 | 80.71 | 80.99 |
[Figure 8]
| Entropy | Sum | Sub | Nowei | Noabs | LM |
|---|---|---|---|---|---|
| 79.80 | 80.35 | 80.67 | 80.29 | 80.44 | 80.99 |
| Dataset | threshold-3.0 | threshold-5.0 | threshold-10.0 |
|---|---|---|---|
| SNLI | 80.46 | 80.77 | 80.99 |
| MultiNLI | 70.92 | 71.56 | 71.79 |
| Quora | 81.36 | 81.55 | 81.79 |
| LCQMC | 83.94 | 84.07 | 84.29 |
| BQ | 74.18 | 74.64 | 74.73 |
| Dataset | E+Div (k-means) | E+Div (prototype) | E+All (k-means) | E+All (prototype) |
|---|---|---|---|---|
| SNLI | 81.45 | 80.32 | 80.99 | 80.16 |
| MultiNLI | 71.66 | 70.45 | 71.79 | 70.38 |
| Quora | 81.59 | 80.24 | 81.79 | 80.50 |
| LCQMC | 83.70 | 83.51 | 84.29 | 83.66 |
| BQ | 74.65 | 73.86 | 74.73 | 74.13 |
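The k-means variant compared above corresponds to a standard way of enforcing diversity in instance selection: cluster the candidate embeddings and keep the instance nearest each cluster centre. The sketch below shows that generic procedure only; it is an assumed illustration, not the paper's exact code.

```python
import numpy as np
from sklearn.cluster import KMeans

def diverse_selection(embeddings, k):
    """Pick k diverse instances: cluster embeddings with k-means and return the
    index of the point nearest each cluster centre (generic diversity criterion)."""
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(embeddings)
    picked = []
    for c in range(k):
        members = np.where(km.labels_ == c)[0]
        dists = np.linalg.norm(embeddings[members] - km.cluster_centers_[c], axis=1)
        picked.append(members[np.argmin(dists)])
    return picked
```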
5.7 Comparison among Different Pre-trained Language Models
| Dataset | BERT | RoBERTa | ALBERT | XLNet |
|---|---|---|---|---|
| SNLI | 80.99 | 81.45 | 80.39 | 82.58 |
| MultiNLI | 71.79 | 72.65 | 71.20 | 72.87 |
| Quora | 81.79 | 82.44 | 80.47 | 83.12 |
| LCQMC | 84.29 | 84.36 | 83.34 | 82.31 |
| BQ | 74.73 | 75.32 | 74.17 | 74.08 |
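In the Hugging Face sketch from Section 4.1, switching among these backbones only requires changing the checkpoint name; the identifiers below are the standard public checkpoints and are assumed rather than confirmed as the exact ones used here.

```python
# Standard public checkpoint names for the compared backbones (assumed, not
# confirmed as the exact checkpoints used in the paper).
backbones = {
    "BERT": "bert-base-uncased",
    "RoBERTa": "roberta-base",
    "ALBERT": "albert-base-v2",
    "XLNet": "xlnet-base-cased",
}
```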
5.8 Visualization of Instance Selection
[Figure 9]
[Figure 10]
6 Conclusion