research-article

Semi-Supervised Semantic Role Labeling with Bidirectional Language Models

Published: 17 June 2023

Abstract

The recent success of neural networks in NLP applications has provided a strong impetus to develop supervised models for semantic role labeling (SRL) that forgo the requirement for extensive feature engineering. Recent state-of-the-art approaches require high-quality annotated datasets that are costly to obtain and largely unavailable for low-resource languages. We present a semi-supervised approach that uses both labeled and unlabeled data to improve performance over a purely supervised SRL model. We show that the proposed semi-supervised SRL model yields a larger improvement over a supervised model when the labeled training set is small. Our SRL system leverages unlabeled data under the language modeling paradigm. We demonstrate that incorporating a self pre-trained bidirectional language model (S-PrLM) into an SRL system improves SRL performance by learning composition functions from the unlabeled data. Previous research has concluded that syntactic information is very useful for high-performing SRL systems, so we incorporate it through an unsupervised approach that leverages dependency path information to connect argument candidates in vector space, which helps distinguish arguments with similar contexts but different syntactic functions. The basic idea is to connect the predicate (w_p) with the argument candidate (w_a) through the dependency path (r) between them in the embedding space. Experiments on the CoNLL-2008 and CoNLL-2009 datasets confirm that our full SRL model outperforms the previous best models in terms of F1 score.
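
The abstract does not state the objective used to connect a predicate, an argument candidate, and the dependency path between them. As a rough illustration only, the sketch below assumes a translation-style score in which the predicate vector shifted by the path vector should land near the argument vector, trained with a margin loss against a negative argument. The function names, the toy vectors, and the choice of score are all hypothetical and are not taken from the paper.

```python
import math


def l2_distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def path_score(w_p, r, w_a):
    """Assumed score: distance between (w_p + r) and w_a; lower means a better fit."""
    translated = [p + d for p, d in zip(w_p, r)]
    return l2_distance(translated, w_a)


def margin_loss(w_p, r, w_a, w_neg, margin=1.0):
    """Hinge loss that wants the true argument closer than a negative one by `margin`."""
    return max(0.0, margin + path_score(w_p, r, w_a) - path_score(w_p, r, w_neg))


# Toy 2-dimensional embeddings (made up for illustration).
w_p = [0.5, 0.0]     # predicate
r = [0.5, 1.0]       # dependency path between predicate and argument
w_a = [1.0, 1.0]     # true argument candidate
w_neg = [-1.0, 0.0]  # negative (wrong) argument candidate

print(path_score(w_p, r, w_a))
print(margin_loss(w_p, r, w_a, w_neg))
```

Under this assumed score, two argument candidates with similar word contexts can still be separated when the dependency paths that link them to the predicate differ, which is the intuition the abstract describes.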

Published In

ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 22, Issue 6
June 2023
635 pages
ISSN: 2375-4699
EISSN: 2375-4702
DOI: 10.1145/3604597

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

Published: 17 June 2023
Online AM: 01 April 2023
Accepted: 03 March 2023
Revised: 13 August 2022
Received: 05 July 2021

Author Tags

1. Semantic role labeling
2. semantic parsing
3. syntax
4. language models
5. dependency
6. contextualized representations
7. path embedding
8. unsupervised
9. CoNLL-2008
10. CoNLL-2009
11. semi-supervised

Funding Sources

• Key Projects of National Natural Science Foundation of China
