Abstract
The advancement of language models has showcased their tremendous potential for both beneficial applications and harmful misuse. However, the majority of research has concentrated on high-resource languages, leaving much to be desired for low-resource languages. This paper explores the use of language models in Norwegian, a low-resource language, and addresses the threats these models pose in the context of influence operations on social media.
The methodology follows a mixed-methods approach, combining quantitative analysis with qualitative investigation. The quantitative analysis evaluates the performance of language models across various contexts, assesses their ability to generate content that is perceived as authentic, and analyzes user responses to such generated content. The qualitative investigation comprises interviews and surveys that gather insights from participants, aiming to understand their experiences, perceptions, and concerns regarding the use of language models.
By investigating the use of language models in a low-resource language, this paper aims to contribute to the advancement of natural language processing research in an underrepresented linguistic context, as well as to explore the use of these models for training purposes in isolated social networks.
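To make the generation step in the methodology concrete, the sketch below shows how Norwegian social-media-style text can be produced with an openly available Norwegian language model. It is a minimal illustration only: it assumes the Hugging Face transformers library and the NbAiLab/nb-gpt-j-6B model published by the National Library of Norway, and the prompt and sampling parameters are hypothetical rather than the configuration used in this study.

from transformers import pipeline

# Load an openly available Norwegian causal language model from the
# Hugging Face Hub (assumption: NbAiLab/nb-gpt-j-6B; about 6B parameters,
# so in practice a GPU with ample memory is needed).
generator = pipeline("text-generation", model="NbAiLab/nb-gpt-j-6B")

# Hypothetical Norwegian prompt in the style of a forum or social media post.
prompt = "Dagens diskusjonstema: bør kommunen bygge ny svømmehall?"

# Sample several candidate continuations; the sampling parameters are
# illustrative, not the settings used in the study.
outputs = generator(
    prompt,
    max_new_tokens=60,
    do_sample=True,
    top_p=0.95,
    temperature=0.8,
    num_return_sequences=3,
)

for i, out in enumerate(outputs, start=1):
    print(f"Candidate {i}: {out['generated_text']}")

Texts produced in this way could then be shown to participants alongside human-written posts, which is the comparison that the quantitative analysis described above relies on.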
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Aasen, O.J.A., Lugo, R.G., Knox, B.J. (2024). Small Languages and Big Models: Using ML to Generate Norwegian Language Social Media Content for Training Purposes. In: Schmorrow, D.D., Fidopiastis, C.M. (eds) Augmented Cognition. HCII 2024. Lecture Notes in Computer Science, vol. 14695. Springer, Cham. https://doi.org/10.1007/978-3-031-61572-6_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-61571-9
Online ISBN: 978-3-031-61572-6