Abstract
Cross-lingual open-domain question answering (Open-QA) has become an increasingly important topic. Training a monolingual model typically requires a large amount of labeled data for supervised learning, which makes such models hard to deploy in real applications, especially for low-resource languages. Recently, thanks to the multilingual BERT model, a new task called zero-shot cross-lingual QA has emerged in this field: a model is trained on a resource-rich language and tested directly on other languages. Current research faces two main problems. The first lies in the document retrieval stage, where directly applying a multilingual pretrained model for similarity calculation yields insufficient retrieval accuracy. The second lies in the answer extraction stage, where answers involve different levels of abstraction relative to the retrieved documents and thus require deeper exploration. This paper proposes a cross-layer connection based approach for cross-lingual Open-QA, consisting of a Match-Retrieval module and a Connection-Extraction module. The matching network in the retrieval module heuristically adjusts and expands the learned features to improve retrieval quality. The answer extraction module reuses deep semantic features at the network structure level through cross-layer connections. Experimental results on a public cross-lingual Open-QA dataset show the superiority of our approach over state-of-the-art methods.
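To make the retrieval problem concrete, here is a minimal sketch of the naive baseline the abstract critiques: ranking candidate documents by cosine similarity between mean-pooled multilingual BERT vectors of the question and each document. This illustrates the baseline only, not the paper's Match-Retrieval module; the checkpoint name, mean pooling, and helper functions are our assumptions.

import torch
from transformers import BertModel, BertTokenizer

# Multilingual BERT as an off-the-shelf encoder (assumption: the
# standard Hugging Face checkpoint, not the authors' exact setup).
tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-cased")
encoder = BertModel.from_pretrained("bert-base-multilingual-cased")
encoder.eval()

def embed(text):
    # Mean-pool last-layer token states into a single sentence vector.
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        states = encoder(**inputs).last_hidden_state  # (1, seq_len, hidden)
    return states.mean(dim=1).squeeze(0)              # (hidden,)

def rank_documents(question, documents):
    # Score each candidate document against the question vector.
    q = embed(question)
    scores = [torch.cosine_similarity(q, embed(d), dim=0).item()
              for d in documents]
    return sorted(zip(documents, scores), key=lambda x: -x[1])

Because multilingual BERT is not trained for cross-lingual sentence similarity, such direct scoring is exactly where the insufficient retrieval accuracy noted above comes from.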
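The cross-layer connection idea in the extraction module can likewise be sketched: rather than predicting answer spans from the final encoder layer alone, hidden states from several layers are concatenated so the span head can reuse semantic features at different depths. The chosen layers and the linear span head below are illustrative assumptions, not the paper's exact Connection-Extraction design.

import torch
import torch.nn as nn
from transformers import BertModel

class CrossLayerSpanExtractor(nn.Module):
    def __init__(self, model_name="bert-base-multilingual-cased",
                 layers=(4, 8, 12)):
        super().__init__()
        # output_hidden_states=True exposes every encoder layer's output.
        self.encoder = BertModel.from_pretrained(
            model_name, output_hidden_states=True)
        self.layers = layers
        hidden = self.encoder.config.hidden_size
        # Concatenated multi-layer features feed a start/end span head.
        self.span_head = nn.Linear(hidden * len(layers), 2)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids,
                           attention_mask=attention_mask)
        # hidden_states[0] is the embedding layer; 1..12 are encoder layers.
        feats = torch.cat([out.hidden_states[i] for i in self.layers],
                          dim=-1)                     # (batch, seq, 3*hidden)
        logits = self.span_head(feats)                # (batch, seq, 2)
        start_logits, end_logits = logits.split(1, dim=-1)
        return start_logits.squeeze(-1), end_logits.squeeze(-1)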
Acknowledgements
The authors would like to thank the anonymous reviewers for their helpful comments. This work was supported by NSFC funding (No. 61876062).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Li, L., Kong, M., Li, D., Zhou, D. (2020). A Cross-Layer Connection Based Approach for Cross-Lingual Open Question Answering. In: Zhu, X., Zhang, M., Hong, Y., He, R. (eds.) Natural Language Processing and Chinese Computing. NLPCC 2020. Lecture Notes in Computer Science, vol. 12430. Springer, Cham. https://doi.org/10.1007/978-3-030-60450-9_37
Print ISBN: 978-3-030-60449-3
Online ISBN: 978-3-030-60450-9