Bridging the Domain Gap for Stance Detection for the Zulu Language

Dlamini, Gcinizwe; Bekkouch, Imad Eddine Ibrahim; Khan, Adil; Derczynski, Leon

doi:10.1007/978-3-031-16072-1_23


Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 542)

Included in the following conference series:

Proceedings of SAI Intelligent Systems Conference

Abstract

Misinformation has become a major concern in recent years given its spread across our information sources. In the past years, many NLP tasks have been introduced in this area, with some systems reaching good results on English-language datasets. Existing AI-based approaches for fighting misinformation in the literature suggest automatic stance detection as an integral first step to success. Our paper aims to use the progress made for English and transfer that knowledge to other languages, which is a non-trivial task due to the domain gap between English and the target languages. We propose a black-box, non-intrusive method that uses techniques from Domain Adaptation to reduce the domain gap, without requiring any human expertise in the target language, by leveraging low-quality data in both a supervised and an unsupervised manner. This allows us to rapidly achieve results for stance detection in the Zulu language, the target language in this work, similar to those found for English. We also provide a stance detection dataset in the Zulu language. Our experimental results show that by leveraging English datasets and machine translation we can improve performance on English data as well as on other languages.



Acknowledgments

This research was supported by the Independent Danish Research Fund through the Verif-AI project grant.

Author information

Authors and Affiliations

Innopolis University, Tatarstan, Russian Federation
Gcinizwe Dlamini & Adil Khan
Sorbonne Center for Artificial Intelligence, Sorbonne University, Paris, France
Imad Eddine Ibrahim Bekkouch
IT University of Copenhagen, Copenhagen, Denmark
Leon Derczynski

Corresponding author

Correspondence to Gcinizwe Dlamini.

Editor information

Editors and Affiliations

Saga University, Saga, Japan
Kohei Arai

Cite this paper

Dlamini, G., Bekkouch, I.E.I., Khan, A., Derczynski, L. (2023). Bridging the Domain Gap for Stance Detection for the Zulu Language. In: Arai, K. (eds) Intelligent Systems and Applications. IntelliSys 2022. Lecture Notes in Networks and Systems, vol 542. Springer, Cham. https://doi.org/10.1007/978-3-031-16072-1_23

DOI: https://doi.org/10.1007/978-3-031-16072-1_23
Published: 31 August 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16071-4
Online ISBN: 978-3-031-16072-1
eBook Packages: Intelligent Technologies and Robotics (R0)
