DOI: 10.1145/3583780.3614896

GranCATs: Cross-Lingual Enhancement through Granularity-Specific Contrastive Adapters

Published: 21 October 2023

ABSTRACT

Multilingual language models (MLLMs) have demonstrated remarkable success on various cross-lingual downstream tasks, facilitating the transfer of knowledge across numerous languages, yet this transfer is not universally effective. Our study reveals that while existing MLLMs such as mBERT can capture phrase-level alignments across language families, they struggle to effectively capture sentence-level and paragraph-level alignments. To address this limitation, we propose GranCATs, Granularity-specific Contrastive AdapTers. We collect a new dataset that observes each sample at three distinct levels of granularity and employ contrastive learning as a pre-training task to train GranCATs on this dataset. Our objective is to enhance MLLMs' adaptation to a broader range of cross-lingual tasks by equipping them with improved capabilities to capture global information at different levels of granularity. Extensive experiments show that MLLMs equipped with GranCATs achieve significant performance gains on language tasks spanning different text granularities, including entity alignment, relation extraction, sentence classification and retrieval, and question answering. These results validate the effectiveness of GranCATs in enhancing cross-lingual alignment across text granularities and in transferring this knowledge to downstream tasks.
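
To make the general recipe concrete, the minimal sketch below illustrates the idea of training a contrastive adapter on top of a frozen multilingual encoder: translation pairs at one granularity level (e.g., sentence level) act as positives in an InfoNCE-style loss, with other in-batch examples as negatives, and only the adapter receives gradients. The bottleneck adapter shape, [CLS] pooling, temperature, mBERT checkpoint, and example sentences are illustrative assumptions, not details taken from the paper.

# Illustrative sketch (not the authors' released code): a bottleneck adapter
# trained with an InfoNCE-style contrastive objective on parallel text pairs
# at a single granularity level (here, sentences).
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

class BottleneckAdapter(nn.Module):
    """Small residual bottleneck stacked on a frozen encoder (assumed design)."""
    def __init__(self, hidden_size: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return h + self.up(F.gelu(self.down(h)))

def info_nce(z_src: torch.Tensor, z_tgt: torch.Tensor, temperature: float = 0.05):
    """Translation pairs are positives; other in-batch examples are negatives."""
    z_src = F.normalize(z_src, dim=-1)
    z_tgt = F.normalize(z_tgt, dim=-1)
    logits = z_src @ z_tgt.T / temperature              # (B, B) similarity matrix
    labels = torch.arange(z_src.size(0), device=z_src.device)
    return F.cross_entropy(logits, labels)

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
encoder = AutoModel.from_pretrained("bert-base-multilingual-cased")
for p in encoder.parameters():                          # freeze the MLLM backbone
    p.requires_grad = False
adapter = BottleneckAdapter()

# Hypothetical parallel pairs at the sentence granularity (English-French).
src = tokenizer(["The cat sat on the mat.",
                 "Economic growth slowed in the second quarter."],
                return_tensors="pt", padding=True)
tgt = tokenizer(["Le chat est assis sur le tapis.",
                 "La croissance économique a ralenti au deuxième trimestre."],
                return_tensors="pt", padding=True)

with torch.no_grad():
    h_src = encoder(**src).last_hidden_state[:, 0]      # [CLS] pooling (assumption)
    h_tgt = encoder(**tgt).last_hidden_state[:, 0]

loss = info_nce(adapter(h_src), adapter(h_tgt))
loss.backward()                                          # only adapter weights get gradients

In the paper's setting, one such adapter would be trained per granularity level (phrase, sentence, paragraph) on the corresponding view of the data; the sketch covers only the shared contrastive mechanism.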


Published in

          CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management
          October 2023
          5508 pages
ISBN: 9798400701245
DOI: 10.1145/3583780

          Copyright © 2023 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 21 October 2023


          Qualifiers

          • research-article

          Acceptance Rates

Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%


Article Metrics

• Downloads (last 12 months): 103
• Downloads (last 6 weeks): 15
