
Enhancing Low-Resource Languages Question Answering with Syntactic Graph

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13248)

Abstract

Multilingual pre-trained language models (PLMs) enable zero-shot cross-lingual transfer from rich-resource languages to low-resource languages in extractive question answering (QA). However, during fine-tuning on the QA task, the syntactic information encoded in multilingual PLMs is not always preserved and may even be forgotten, which can hurt answer span detection for low-resource languages. In this paper, we propose an auxiliary task that predicts syntactic graphs during QA fine-tuning, reinforcing syntactic information and thereby improving answer span detection for low-resource languages. The syntactic graph combines Part-of-Speech (POS) information with syntax tree structure, without dependency parse labels. To fit the sequential input of PLMs, we decompose syntactic graph prediction into two subtasks: POS tag prediction and syntax tree prediction, the latter comprising the depth of each word and the tree distance between each pair of words. Moreover, to improve the alignment between languages, we train the syntactic graph prediction task on the source and target languages in parallel. Extensive experiments on three multilingual QA datasets demonstrate the effectiveness of our approach.
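The tree-based targets mentioned in the abstract (the depth of each word and the tree distance between each pair of words) can be read off a dependency parse. A minimal sketch of how such targets might be derived, assuming sentences are encoded as a head-index array; the `heads` encoding and function names are illustrative, not from the paper:

```python
from collections import deque

def tree_depths(heads):
    """Depth of each word in the dependency tree (root has depth 0).
    heads[i] is the index of word i's head, or -1 for the root."""
    def depth(i):
        d = 0
        while heads[i] != -1:
            i = heads[i]
            d += 1
        return d
    return [depth(i) for i in range(len(heads))]

def tree_distances(heads):
    """Pairwise path lengths between words in the undirected tree,
    computed by breadth-first search from each word."""
    n = len(heads)
    adj = [[] for _ in range(n)]
    for i, h in enumerate(heads):
        if h != -1:
            adj[i].append(h)
            adj[h].append(i)
    dist = [[0] * n for _ in range(n)]
    for s in range(n):
        seen = {s}
        q = deque([(s, 0)])
        while q:
            u, d = q.popleft()
            dist[s][u] = d
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    q.append((v, d + 1))
    return dist

# "The cat sat": "The" -> "cat", "cat" -> "sat", "sat" is the root.
heads = [1, 2, -1]
print(tree_depths(heads))        # [2, 1, 0]
print(tree_distances(heads)[0])  # [0, 1, 2]
```

Depths and distances computed this way can serve as regression targets for an auxiliary prediction head over the PLM's token representations, in the spirit of structural probing.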


Notes

  1. https://stanfordnlp.github.io/CoreNLP/.



Author information


Corresponding author

Correspondence to Linjuan Wu.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Wu, L., Zhu, J., Zhang, X., Zhuang, Z., Feng, Z. (2022). Enhancing Low-Resource Languages Question Answering with Syntactic Graph. In: Rage, U.K., Goyal, V., Reddy, P.K. (eds) Database Systems for Advanced Applications. DASFAA 2022 International Workshops. DASFAA 2022. Lecture Notes in Computer Science, vol 13248. Springer, Cham. https://doi.org/10.1007/978-3-031-11217-1_13


  • DOI: https://doi.org/10.1007/978-3-031-11217-1_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-11216-4

  • Online ISBN: 978-3-031-11217-1

  • eBook Packages: Computer Science (R0)
