Abstract
Semantic role labeling (SRL) enriches many downstream applications, e.g., machine translation, question answering, summarization, and stance/belief detection. However, building multilingual SRL models is challenging due to the scarcity of semantically annotated corpora for multiple languages. Moreover, state-of-the-art SRL projection (XSRL) based on large language models (LLMs) yields output that is riddled with spurious role labels. Remediating such hallucinations is not straightforward because LLMs lack explainability. We show that hallucinated role labels are related to naturally occurring divergence types that interfere with initial alignments. We implement Divergence-Aware Hallucination-Remediated SRL projection (DAHRS), which applies linguistically informed alignment remediation followed by greedy First-Come First-Assign (FCFA) SRL projection. DAHRS improves the accuracy of SRL projection without additional transformer-based machinery, beating XSRL in both human and automatic comparisons, and advances beyond headwords to accommodate phrase-level SRL projection (e.g., EN-FR, EN-ES). Using CoNLL-2009 as our ground truth, we achieve a higher word-level F1 than XSRL: 87.6% vs. 77.3% (EN-FR) and 89.0% vs. 82.7% (EN-ES). Human phrase-level assessments yield 89.1% (EN-FR) and 91.0% (EN-ES). We also define a divergence metric to adapt our approach to other language pairs (e.g., English-Tagalog).
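The abstract names a greedy First-Come First-Assign (FCFA) projection step but does not spell out the procedure. The following is a minimal sketch of one plausible reading, not the authors' implementation: source role labels are visited in alignment order, and a target token keeps the first label that reaches it. The function name `fcfa_project`, the `(source_index, target_index)` alignment format, and the example labels are all assumptions made for illustration.

```python
from typing import List, Tuple


def fcfa_project(
    src_labels: List[str],
    alignments: List[Tuple[int, int]],
    tgt_len: int,
) -> List[str]:
    """Greedily copy role labels from source tokens to aligned target tokens.

    Assumed reading of "first-come first-assign": alignment pairs are
    processed in sorted order, and a target token that already holds a
    label is never overwritten by a later pair.
    """
    tgt_labels = ["O"] * tgt_len
    for src_idx, tgt_idx in sorted(alignments):
        label = src_labels[src_idx]
        if label == "O":
            continue  # nothing to project from an unlabeled source token
        if tgt_labels[tgt_idx] == "O":
            tgt_labels[tgt_idx] = label  # first label to arrive wins
    return tgt_labels


# Hypothetical three-token sentence pair; target token 1 is aligned to
# both source token 1 and source token 2.
src_labels = ["B-ARG1", "I-ARG1", "B-V"]
alignments = [(0, 0), (1, 1), (2, 1), (2, 2)]
projected = fcfa_project(src_labels, alignments, tgt_len=3)
print(projected)
```

In this toy case the competing alignment `(2, 1)` is ignored because target token 1 was already claimed, which is the conflict-resolution behavior the name suggests. Under this reading, the alignment remediation described above would run before the projection so that the greedy pass sees corrected alignments.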
Notes
1. Figure 1 inputs: (a) EN: The dow ’s dive was the 12th - worst ever and the sharpest since the market fell 156.83 FR: La chute du dow jones a été la 12e - la pire et la plus forte depuis que le marché a chuté de 156.83. (b) EN: Some “circuit breakers” installed after the october 1987 crash failed their first test. FR: Certains “disjoncteurs” installés après l’écrasement d’octobre 1987 ont échoué leur premier test.
2. SRL-BERT achieves an F1 score of 86.49 on the English OntoNotes dataset [37], and it can be used non-exclusively (https://allenai.org/terms).
3. A phrase consists of a token that begins with a “B” tag and continues with tokens that have an “I” tag. The following token will have a new “B”, an “O”, or end of the sentence, indicating the end of the phrase.
4. We have simplified the notion of predicate considerably in this discussion, focusing on verbs; however, other parts of speech may serve as predicates. For example, “destruction of the city” is a nominal phrase conveying a destroy event with a single argument: the city. Future work aims to explore other parts of speech as predicates.
5. We use EN-ES-TL parallel data from LORELEI [35].
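The phrase definition in note 3 can be sketched in a few lines of Python. This is an illustrative reading of that definition, not code from the paper: the function name `extract_phrases`, the `B-ROLE`/`I-ROLE` tag spelling, and the role labels in the example are assumptions.

```python
from typing import List, Optional, Tuple


def extract_phrases(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group tokens into (role, phrase) pairs from BIO-style tags.

    A phrase opens at a "B-" tag and is extended by following "I-" tags.
    It closes at the next "B-" tag, at an "O" tag, or at sentence end.
    """
    phrases: List[Tuple[str, str]] = []
    role: Optional[str] = None
    current: List[str] = []

    def flush() -> None:
        # Emit the open phrase, if any.
        if current:
            phrases.append((role, " ".join(current)))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            role, current = tag[2:], [token]
        elif tag.startswith("I-") and current:
            current.append(token)
        else:
            # "O" (or a stray "I-" with no open phrase) ends the phrase.
            flush()
            role, current = None, []
    flush()  # end of sentence also closes the phrase
    return phrases


# Example built from the sentence in note 1(b), with hypothetical role tags.
tokens = ["Some", "circuit", "breakers", "failed", "their", "first", "test", "."]
tags = ["B-ARG0", "I-ARG0", "I-ARG0", "B-V", "B-ARG1", "I-ARG1", "I-ARG1", "O"]
phrases = extract_phrases(tokens, tags)
print(phrases)
```

The sketch does not check that an "I" tag carries the same role as the preceding "B" tag; a stricter variant could treat a role mismatch as the start of a new phrase.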
References
1. Akbik, A., Chiticariu, L., Danilevsky, M., Li, Y., Vaithyanathan, S., Zhu, H.: Generating high quality proposition banks for multilingual semantic role labeling. In: Proceedings of ACL-IJCNLP (2015)
2. Baklanova, E., Bellamy, K.: Spanish suffixes in Tagalog: the case of common nouns. In: Traces of Contact in the Lexicon. Brill (2023)
3. Campagnano, C., Conia, S., Navigli, R.: SRL4E - semantic role labeling for emotions: a unified evaluation framework. In: Proceedings of ACL (2022)
4. Danilevsky, M., Qian, K., Aharonov, R., Katsis, Y., Kawas, B., Sen, P.: A survey of the state of explainable AI for natural language processing. In: Proceedings of AACL-IJCNLP (2020)
5. Daza, A., Frank, A.: Translate and label! An encoder-decoder approach for cross-lingual semantic role labeling. In: Proceedings of EMNLP-IJCNLP (2019)
6. Daza, A., Frank, A.: X-SRL: a parallel cross-lingual semantic role labeling dataset. In: Proceedings of EMNLP (2020)
7. DeepL SE: DeepL: neural machine translation software (2017)
8. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of NAACL-HLT (2018)
9. Dorr, B.J.: Machine translation divergences: a formal description and proposed solution. Computational Linguistics (1994)
10. Dorr, B.J., Pearl, L., Hwa, R., Habash, N.: DUSTer: a method for unraveling cross-language divergences for statistical word-level alignment. In: Proceedings of the 5th Conference of the Association for Machine Translation in the Americas: Technical Papers, vol. 2499 (2002)
11. Dou, Z.Y., Neubig, G.: Word alignment by fine-tuning embeddings on parallel corpora. In: Proceedings of EACL (2021)
12. Fei, H., Zhang, M., Ji, D.: Cross-lingual semantic role labeling with high-quality translated training corpus. In: Proceedings of ACL (2020)
13. Gehring, J., Auli, M., Grangier, D., Yarats, D., Dauphin, Y.N.: Convolutional sequence to sequence learning. In: Proceedings of ICML, vol. 3 (2017)
14. Genest, P.E., Lapalme, G.: Framework for abstractive summarization using text-to-text generation. In: Monolingual@ACL (2011)
15. Guarasci, R., Silvestri, S., Pietro, G.D., Fujita, H., Esposito, M.: BERT syntactic transfer: a computational experiment on Italian, French and English languages. Comput. Speech Lang. 71 (2022)
16. Hajič, J., et al.: 2009 CoNLL shared task part 2 (2012)
17. Hassan, H., et al.: Achieving human parity on automatic Chinese to English news translation. arXiv preprint arXiv:1803.05567 (2018)
18. Hoffman, R.R., Mueller, S.T., Klein, G., Litman, J.: Metrics for explainable AI: challenges and prospects (2018)
19. Hwa, R., Resnik, P., Weinberg, A., Cabezas, C., Kolak, O.: Bootstrapping parsers via syntactic projection across parallel texts. Natural Lang. Eng. 11, 311–325 (2005)
20. Kozhevnikov, M., Titov, I.: Cross-lingual transfer of semantic role labeling models. In: Proceedings of ACL (2013)
21. Liu, D., Gildea, D.: Semantic role features for machine translation (2010)
22. Liu, H., Yin, Q., Wang, W.Y.: Towards explainable NLP: a generative explanation framework for text classification. In: Proceedings of ACL (2020)
23. Maniyar, S.N., Kulkarni, S.B., Bhise, P.R.: Linguistic divergence in various language pair in machine translation perceptive. IOSR J. Comput. Eng. (IOSR-JCE) 23(1) (2021)
24. Marzouk, S.: Chapter 3 German light verb construction in the course of the development of machine translation. In: Translation, Interpreting, Cognition: The Way Out of the Box. Language Science Press (2021)
25. Mather, B., Dorr, B.J., Dalton, A., de Beaumont, W., Rambow, O., Schmer-Galunder, S.M.: From stance to concern: adaptation of propositional analysis to new tasks and domains. In: Findings of the ACL (2022)
26. Mather, B., Dorr, B.J., Rambow, O., Strzalkowski, T.: A general framework for domain-specialization of stance detection. In: Proceedings of FLAIRS (2021)
27. McDonald, R., et al.: Universal dependency annotation for multilingual parsing. In: Proceedings of ACL (2013)
28. Mehta, S.V., Lee, J.Y., Carbonell, J.: Towards semi-supervised learning for deep semantic role labeling. In: Proceedings of EMNLP (2018)
29. Mulcaire, P., Swayamdipta, S., Smith, N.A.: Polyglot semantic role labeling. In: Proceedings of ACL (2018)
30. OpenAI: ChatGPT: Large-scale language model (2021)
31. Pražák, O., Konopík, M.: Cross-lingual SRL based upon universal dependencies. In: Proceedings of RANLP (2017)
32. Rottmann, K., Vogel, S.: Word reordering in statistical machine translation with a POS-based distortion model. In: Proceedings of IEEE on TMI (2007)
33. Shen, Y., Chu, C., Cromieres, F., Kurohashi, S.: Cross-language projection of dependency trees with constrained partial parsing for tree-to-tree machine translation. In: Proceedings of the First Conference on Machine Translation (2016)
34. Shi, P., Lin, J.: Simple BERT models for relation extraction and semantic role labeling. arXiv preprint arXiv:1904.05255 (2019)
35. Tracey, J., et al.: LORELEI (low resource languages for emergent incidents) Tagalog representative language pack (2023)
36. Tsai, H.C., Kuo, C.W., Huang, Y.F.: LlamaLoop: enhancing information retrieval in LLaMA with semantic relevance feedback loop. Preprint at Research Square (2023)
37. Weischedel, R., et al.: OntoNotes release 5.0 (2013)
38. Yarowsky, D., Ngai, G.: Inducing multilingual POS taggers and NP bracketers via robust projection across aligned corpora. In: Second Meeting of the NAACL (2001)
39. Zhang, M., Jiang, H., Aw, A., Li, H., Tan, C.L., Li, S.: A tree sequence alignment-based tree-to-tree translation model. In: Proceedings of ACL-HLT (2008)
40. Zhang, T., Kishore, V., Wu, F., Weinberger, K.Q., Artzi, Y.: BERTScore: evaluating text generation with BERT. arXiv preprint arXiv:1904.09675 (2019)
Acknowledgements
This research is based upon work supported by the Defense Advanced Research Projects Agency (DARPA) under Contract No. HR001121C0186. Any opinions, findings, and conclusions or recommendations expressed in this research are those of the authors and do not necessarily reflect the views of the US Government.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Youm, S., Mather, B., Jayaweera, C., Prada, J., Dorr, B. (2024). DAHRS: Divergence-Aware Hallucination-Remediated SRL Projection. In: Rapp, A., Di Caro, L., Meziane, F., Sugumaran, V. (eds) Natural Language Processing and Information Systems. NLDB 2024. Lecture Notes in Computer Science, vol 14762. Springer, Cham. https://doi.org/10.1007/978-3-031-70239-6_29
DOI: https://doi.org/10.1007/978-3-031-70239-6_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70238-9
Online ISBN: 978-3-031-70239-6
eBook Packages: Computer Science, Computer Science (R0)