Abstract
Previous research suggests that supervised machine learning can be used to detect information operations (IO) on social media. Most related research assumes that new data will always be available exactly when models are scheduled to be updated. In practice, however, detecting and attributing IO accounts is time-consuming. There is thus a mismatch between the performance assessment procedures in existing work and the real-world problem they seek to solve. We bridge this gap by demonstrating how active learning approaches can extend the application of classifiers by reducing their dependence on new data. We evaluate the performance of an existing classifier when it is updated according to five active learning strategies. Using Twitter data on state-sponsored information operations, we show that if querying Twitter is possible, the best active learning strategy requires 5–10 times fewer tweets than the original model while showing only a 1–3% reduction in average monthly F1 scores across countries and prediction tasks. If querying Twitter is not possible, the corresponding active learning strategy requires 5–10 times fewer tweets while showing a 1–9% reduction in average monthly F1 scores. Depending on the country, a handful to a few hundred new ground-truth examples would suffice to achieve reasonable performance.
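The abstract does not name the five active learning strategies that were evaluated. Purely as an illustration of the general idea, the sketch below shows one common pool-based strategy, least-confidence uncertainty sampling, in which the examples sent for labeling are those the current classifier is least sure about. The function name, the labeling budget, and the scores are hypothetical and are not taken from the paper.

```python
def least_confident(probabilities, budget):
    """Return indices of the `budget` pool items whose predicted
    positive-class probability lies closest to 0.5.

    `probabilities` is assumed to hold a binary classifier's scores
    for unlabeled items (hypothetical input, not the paper's data).
    """
    # Rank pool items by distance from the decision boundary at 0.5;
    # the smallest distance means the classifier is least certain.
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: abs(probabilities[i] - 0.5),
    )
    return ranked[:budget]


# Hypothetical scores for five unlabeled tweets.
pool_scores = [0.97, 0.52, 0.10, 0.45, 0.80]

# Indices of the two tweets to send for ground-truth labeling.
selected = least_confident(pool_scores, budget=2)
print(selected)
```

In a setup like this, only the selected items would need new ground-truth labels before the classifier is retrained, which is the sense in which active learning reduces dependence on newly labeled data.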
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Alizadeh, M., Shapiro, J.N. (2023). Few-Shot Information Operation Detection Using Active Learning Approach. In: Thomson, R., Al-khateeb, S., Burger, A., Park, P., A. Pyke, A. (eds) Social, Cultural, and Behavioral Modeling. SBP-BRiMS 2023. Lecture Notes in Computer Science, vol 14161. Springer, Cham. https://doi.org/10.1007/978-3-031-43129-6_25
DOI: https://doi.org/10.1007/978-3-031-43129-6_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43128-9
Online ISBN: 978-3-031-43129-6
eBook Packages: Computer Science, Computer Science (R0)