Abstract
Slot filling, which extracts entities for specific types of information (slots), is a vital module of dialogue systems for automatic diagnosis. Doctor responses can be regarded as weak supervision for patient queries. In this way, a large amount of weakly labeled data can be obtained from unlabeled diagnosis dialogues, alleviating the costly and time-consuming problem of data annotation. However, weakly labeled data suffers from extremely noisy samples. To mitigate this problem, we propose a simple and effective Co-Weak-Teaching method. The method trains two slot filling models simultaneously. The two models learn from two different sets of weakly labeled data, ensuring that they learn from two different views. Each model then iteratively utilizes weakly labeled samples selected by the other. The model obtained by Co-Weak-Teaching on weakly labeled data can be tested directly on the test data or subsequently fine-tuned on a small amount of human-annotated data. Experimental results in these two settings demonstrate the effectiveness of the method, with increases of 8.03% and 14.74% in micro and macro F1 scores, respectively.
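To make the training loop in the abstract concrete, the following is a minimal PyTorch sketch of the co-teaching-style sample exchange it describes: two slot filling models, each judging its own weakly labeled batch and passing its likely-clean (small-loss) samples to the peer for the actual update. The model interface, batch format, and keep ratio are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch of one Co-Weak-Teaching-style update step (illustrative;
# model classes, batch layout, and the keep-ratio schedule are assumptions).
import torch
import torch.nn.functional as F

def co_weak_teaching_step(model_a, model_b, opt_a, opt_b,
                          batch_a, batch_b, keep_ratio=0.7):
    """Each model selects small-loss samples; its peer trains on them."""
    def small_loss_indices(model, tokens, labels):
        # Per-sample loss on a weakly labeled batch; no gradients needed
        # for the selection pass.
        with torch.no_grad():
            logits = model(tokens)  # (batch, seq_len, num_labels)
            loss = F.cross_entropy(
                logits.transpose(1, 2), labels, reduction="none"
            ).mean(dim=1)  # average over tokens -> (batch,)
        k = max(1, int(keep_ratio * loss.numel()))
        return loss.topk(k, largest=False).indices  # likely-clean samples

    # Selection: each model judges its own weakly labeled batch.
    idx_a = small_loss_indices(model_a, *batch_a)
    idx_b = small_loss_indices(model_b, *batch_b)

    # Update: each model learns from the samples its peer selected.
    for model, opt, (tokens, labels), idx in (
        (model_a, opt_a, batch_b, idx_b),
        (model_b, opt_b, batch_a, idx_a),
    ):
        logits = model(tokens[idx])
        loss = F.cross_entropy(logits.transpose(1, 2), labels[idx])
        opt.zero_grad()
        loss.backward()
        opt.step()
```

The cross-selection is the key design choice: because the two models are trained on different weakly labeled views, they make different mistakes, so a sample that looks clean to one model acts as a denoised training signal for the other.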
Acknowledgements
The authors would like to thank Sendong Zhao for helpful discussions.
Author information
Xiaoming Shi is a PhD student in the School of Computer Science and Technology, Harbin Institute of Technology, China. His main research interests are artificial intelligence, machine learning, and natural language processing. He is currently working on dialogue systems for automatic diagnosis.
Wanxiang Che is a Professor in the School of Computer Science and Technology, Harbin Institute of Technology, China. His main research interests are artificial intelligence, machine learning, and natural language processing. He is the vice director of the Research Center for Social Computing and Information Retrieval, a young scholar of the "Heilongjiang Scholar" program, and a visiting scholar at Stanford University, USA.
Cite this article
Shi, X., Che, W. Combating with extremely noisy samples in weakly supervised slot filling for automatic diagnosis. Front. Comput. Sci. 17, 175333 (2023). https://doi.org/10.1007/s11704-022-2134-1