
Character-Level Syntax Infusion in Pre-Trained Models for Chinese Semantic Role Labeling

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

Semantic role labeling (SRL) aims to identify the predicate-argument structure of a sentence. Recent work has significantly improved SRL performance by incorporating syntactic information and exploiting pre-trained models such as BERT. Most of these approaches use pre-trained models as isolated encoders to obtain word embeddings and enhance them with word-level syntax. Unlike models for many other languages, Chinese pre-trained models normally use Chinese characters instead of subwords as the basic input units, which makes the many-units-in-one-word phenomenon more frequent and the relationships between characters more important. However, previous research has often ignored this character-level information. In this paper, we propose the Character-Level Syntax-Infused network for Chinese SRL, which effectively incorporates the syntactic information between Chinese characters into pre-trained models. Experiments on the Chinese benchmarks of CoNLL-2009 and the Universal Proposition Bank (UPB) show that the proposed approach achieves state-of-the-art results.
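The character-versus-word granularity the abstract describes can be made concrete with a minimal sketch. This is not the paper's code: the sentence and its word segmentation below are hypothetical examples, chosen only to show how one word spans several character units and how characters align back to words, the kind of relationship a character-level syntax-infused model must handle.

```python
# Illustrative sketch: Chinese pre-trained models such as Chinese BERT treat
# each character as a basic input unit, so one word often covers several units.

sentence = "我喜欢自然语言处理"  # "I like natural language processing"

# Hypothetical word segmentation of the same sentence (example only).
words = ["我", "喜欢", "自然", "语言", "处理"]

# Character-level units, as a character-based tokenizer would produce them.
characters = list(sentence)
print(characters)       # ['我', '喜', '欢', '自', '然', '语', '言', '处', '理']
print(len(characters))  # 9 character units
print(len(words))       # 5 words -> the "many units in one word" phenomenon

# Map each character back to the index of the word containing it; this
# character-to-word alignment is where character-level relationships live.
char_to_word = []
for word_index, word in enumerate(words):
    char_to_word.extend([word_index] * len(word))
print(char_to_word)     # [0, 1, 1, 2, 2, 3, 3, 4, 4]
```

Note that with subword tokenization (as in English BERT) the units would more often coincide with or be longer fragments of words, which is why the alignment problem is more pronounced for character-based Chinese models.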


Data availability statement

The data (CoNLL-2009 dataset) that support Tables 1, 2, 3, 5, 6, 7, and 8 and Figs. 7 and 8 are available from the Linguistic Data Consortium (LDC), but restrictions apply to their availability: they were used under license for the current study and so are not publicly available. They can be obtained at https://catalog.ldc.upenn.edu/LDC2012T03 and https://catalog.ldc.upenn.edu/LDC2012T04. The data (UPB) supporting Table 4 are publicly available at https://github.com/System-T/UniversalPropositions.

Notes

  1. We compute statistics for languages whose monolingual BERT models are publicly available. The BERT models are obtained from https://huggingface.co/models.

  2. https://catalog.ldc.upenn.edu/LDC2012T03.

  3. https://github.com/System-T/UniversalPropositions.

  4. http://hdl.handle.net/11234/1-1827.

  5. http://propbank.github.io/.

  6. https://stanfordnlp.github.io/stanza/.

  7. The unlabeled attachment score (UAS) for the UPB Chinese test set is 70.60%.

  8. https://github.com/google-research/bert#pre-trained-models.

  9. https://github.com/ymcui/Chinese-BERT-wwm.

  10. https://ufal.mff.cuni.cz/conll2009-st/scorer.html.

  11. https://github.com/KiroSummer/A_Syntax-aware_MTL_Framework_for_Chinese_SRL.

  12. The pipeline model is used in syntax-aware systems.

  13. The pipeline model is used in syntax-aware systems.

  14. We use the officially provided predicted parses on the CoNLL-2009 Chinese dataset, whose UAS for the valid set is 82.6%.


Author information


Corresponding author

Correspondence to Wanxiang Che.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Wang, Y., Lei, Z. & Che, W. Character-Level Syntax Infusion in Pre-Trained Models for Chinese Semantic Role Labeling. Int. J. Mach. Learn. & Cyber. 12, 3503–3515 (2021). https://doi.org/10.1007/s13042-021-01397-3

