
Misleading information in Spanish: a survey

  • Review Paper
  • Journal: Social Network Analysis and Mining

Abstract

Misleading information on social networks is often promoted by activists and amplified by bots that increase its visibility. Useful and timely mechanisms for assessing credibility in social media are therefore increasingly necessary. Efforts to tackle this problem in Spanish are growing: recent years have seen many methods developed to detect fake news, rumors, stances, and bots on the Spanish social web. This work presents a systematic review of the literature on these efforts for the Spanish language. It identifies pending tasks for this community and challenges that require coordination among the leading researchers on the subject.


References

  • Abonizio HQ, de Morais JI, Tavares GM, Barbon Junior S (2020) Language-independent fake news detection: English, Portuguese, and spanish mutual features. Future Internet 12(5):87

    Article  Google Scholar 

  • Agirrezabal M (2020) KU-CST at the profiling fake news spreaders shared task. In: Working Notes of CLEF 2020—Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, 22–25 September 2020, volume 2696 of CEUR Workshop Proceedings. CEUR-WS.org

  • Al-Zoubi A, Faris H, Alqatawna J, Hassonah M (2018) Evolving Support Vector Machines using Whale Optimization Algorithm for spam profiles detection on online social networks in different lingual contexts. Knowl-Based Syst 153:91–104

    Article  Google Scholar 

  • Almendros Cuquerella C, Cervantes Rodríguez C (2018) CriCa Team: MultiModal Stance detection in tweets on Catalan 1Oct Referendum (MultiStanceCat). In: Proceedings of the third workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval) colocated with 34th Conference of the Spanish Society for Natural Language Processing (SEPLN), Sevilla, Spain, volume 2150 of CEUR Workshop Proceedings, pp 167–172

  • Ambrosini L, Nicolò G (2017) Neural models for StanceCat shared task at IberEval 2017. In: CEUR-WS, Conference of 2nd Workshop on Evaluation of Human Language Technologies for Iberian Languages, IberEval 2017, vol 1881, pp 210–216

  • Aragón ME, Jarquín-Vásquez HJ, Montes-y-Gómez M, Escalante HJ, Pineda LV, Gómez-Adorno H, Posadas-Durán JP, Bel-Enguix G (2020) Overview of MEX-A3T at iberlef 2020: fake news and aggressiveness analysis in Mexican Spanish. In: Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2020) co-located with 36th Conference of the Spanish Society for Natural Language Processing (SEPLN 2020), Málaga, Spain, 23 September 2020, volume 2664 of CEUR Workshop Proceedings, pp 222–235. CEUR-WS.org

  • Arce-Cardenas S, Fajardo-Delgado D, Carmona MÁÁ (2020) Tecnm at MEX-A3T 2020: fake news and aggressiveness analysis in Mexican Spanish. In: Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2020) co-located with 36th Conference of the Spanish Society for Natural Language Processing (SEPLN 2020), Málaga, Spain, 23 September 2020, volume 2664 of CEUR Workshop Proceedings, pp 265–272. CEUR-WS.org

  • Ashraf S, Javed O, Adeel M, Rao H, Nawab M (2019) Bots and gender prediction using language independent stylometry-based approach notebook for PAN at CLEF 2019. In: CEUR-WS, Conference of 20th Working Notes of Conference and Labs of the Evaluation Forum. CLEF, vol 2380

  • Bacciu A, Morgia M, Mei A, Nemmi E, Neri V, Stefa J (2019) Bot and gender detection of twitter accounts using distortion and LSA notebook for PAN at CLEF 2019. In: CEUR-WS, Conference of 20th Working Notes of Conference and Labs of the Evaluation Forum, CLEF, vol 2380

  • Bakhteev O, Ogaltsov A, Ostroukhov P (2020) Fake news spreader detection using neural tweet aggregation. In: Working notes of CLEF 2020—Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, 22–25 September 2020, volume 2696 of CEUR Workshop Proceedings. CEUR-WS.org

  • Barbieri F (2017) Shared task on stance and gender detection in tweets on catalan independence—LaSTUS system Description. In: CEUR-WS Conference of 2nd Workshop on Evaluation of Human Language Technologies for Iberian Languages, IberEval, vol 1881, pp 217–221

  • Barrón-Cedeño A, Elsayed T, Nakov P, Da San Martino G, Hasanain M, Suwaileh R, Haouari F (2020) CheckThat! at CLEF 2020: enabling the automatic identification and verification of claims in social media. In: Conference of 42nd European Conference on IR Research, ECIR, in Lecture Notes in Computer Science, 12036 LNCS. Springer, pp 499–507

  • Basile V, Bosco C, Fersini E, Nozza D, Patti V, Pardo FMR, Rosso P, Sanguinetti M (2019) Semeval-2019 task 5: multilingual detection of hate speech against immigrants and women in twitter. In: Proceedings of the 13th International Workshop on Semantic Evaluation, SemEval@NAACL-HLT 2019, Minneapolis, MN, USA, 6–7 June 2019. Association for Computational Linguistics, pp 54–63

  • Bello HRM, Heilmann L, Ronan E (2020) Detecting fake news spreaders with behavioural, lexical and psycholinguistic features. In: Working notes of CLEF 2020—Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, 22–25 September 2020, volume 2696 of CEUR Workshop Proceedings. CEUR-WS.org

  • Boididou C, Papadopoulos S, Zampoglou M, Apostolidis L, Papadopoulou O, Kompatsiaris Y (2018) Detection and visualization of misleading content on Twitter. Int J Multimed Inf Retrieval 7(1):71–86

    Article  Google Scholar 

  • Bojanowski P, Grave E, Joulin A, Mikolov T (2017) Enriching word vectors with subword information. Trans Assoc Comput Linguist (TACL) 5:135–146

    Article  Google Scholar 

  • Bolonyai F, Buda J, Katona E (2019) Bot or not: a two-level approach in author profiling notebook for PAN at CLEF 2019. In: CEUR-WS Conference of 20th Working Notes of Conference and Labs of the Evaluation Forum, CLEF, vol 2380

  • Bounaama R, Abderrahim M (2019) Tlemcen university: bots and gender profiling task notebook for PAN at CLEF 2019. In: CEUR-WS, Conference of 20th Working Notes of Conference and Labs of the Evaluation Forum, CLEF, vol 2380

  • Brereton P, Kitchenham BA, Budgen D, Turner M, Khalil M (2007) Lessons from applying the systematic literature review process within the software engineering domain. J Syst Softw 80(4):571–583

    Article  Google Scholar 

  • Buda J, Bolonyai F (2020) An ensemble model using n-grams and statistical features to identify fake news spreaders on twitter. In: Working notes of CLEF 2020—Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, 22–25 September 2020, volume 2696 of CEUR Workshop Proceedings. CEUR-WS.org

  • Bugueño M, Mendoza M (2020) Learning to combine classifiers outputs with the transformer for text classification. Intell Data Anal 24(S1):15–41

    Article  Google Scholar 

  • Caled D, Silva M (2019) FTR-18: collecting rumours on football transfer news. In: CEUR-WS, Conference on Information and Knowledge Management Workshops, CIKM, vol 2482

  • Cardaioli M, Cecconello S, Conti M, Pajola L, Turrin F (2020) Fake news spreaders profiling through behavioural analysis. In: Working notes of CLEF 2020—Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, 22–25 September 2020, volume 2696 of CEUR Workshop Proceedings. CEUR-WS.org

  • Castillo C, Mendoza M, Poblete B (2011) Information credibility on twitter. In: Proceedings of the 20th international conference on World Wide Web, WWW 2011, Hyderabad, India, 28 March–1 April 2011, pp 675–684

  • Castillo S, Allende-Cid H, Palma W, Alfaro R, Ramos H, Gonzalez C, Elortegui C, Santander P (2019) Detection of bots and cyborgs in Twitter: a study on the Chilean Presidential Election in 2017. In: Conference of 11th international conference on Social Computing and Social Media, SCSM 2019, held as part of the 21st International Conference on Human–Computer Interaction, HCI, in Lecture Notes in Computer Science, LNCS, vol 11578. Springer, pp 311–323

  • Cegarra-Navarro J-G, Martelo-Landroguez S (2020) The effect of organizational memory on organizational agility: testing the role of counter-knowledge and knowledge application. J Intellect Capital 21(3):459–479

    Article  Google Scholar 

  • Cer D, Yang Y, Kong S, Hua N, Limtiaco N, John RS, Constant N, Guajardo-Cespedes M, Yuan S, Tar C, Strope B, Kurzweil R (2018) Universal sentence encoder for English. In: Proceedings of the 2018 conference on Empirical Methods in Natural Language Processing, EMNLP 2018: System Demonstrations, Brussels, Belgium, 31 October–4 November 2018. Association for Computational Linguistics, pp 169–174

  • Chung CK, Pennebaker JW (2012) Linguistic inquiry and word count (liwc): pronounced luke . . . and other useful facts

  • Clark K, Luong M, Le QV, Manning CD (2020) ELECTRA: pre-training text encoders as discriminators rather than generators. In: 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, 26–30 April 2020

  • Congosto M, Basanta-Val P, Sanchez-Fernandez L (2017) T-Hoarder: a framework to process Twitter data streams. J Netw Comput Appl 83:28–39

    Article  Google Scholar 

  • Cresci S (2020) A decade of social bot detection. Commun ACM 63(10):72–83

    Article  Google Scholar 

  • Cruz FL, Troyano JA, Pontes B, Ortega FJ (2014) Building layered, multilingual sentiment lexicons at synset and lemma levels. Expert Syst Appl 41(13):5984–5994

    Article  Google Scholar 

  • Cücük D, Can F (2020) Stance detection: a survey. ACM Comput Surv 53(1):1–37

    Article  Google Scholar 

  • Das KA, Baruah A, Barbhuiya FA, Dey K (2020) Ensemble of ELECTRA for profiling fake news spreaders. In: Cappellato L, Eickhoff C, Ferro N, Névéol A (eds) Working notes of CLEF 2020—Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, 22–25 September 2020, volume 2696 of CEUR Workshop Proceedings. CEUR-WS.org

  • Davis CA, Varol O, Ferrara E, Flammini A, Menczer F (2016) Botornot: a system to evaluate social bots. In: Proceedings of the 25th international conference on World Wide Web, WWW 2016, Montreal, Canada, 11–15 April 2016, Companion volume, pp 273–274

  • Devlin J, Chang M, Lee K, Toutanova K (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, 2–7 June 2019, volume 1 (Long and Short Papers), pp 4171–4186

  • Espinosa D, Gómez-Adorno H, Sidorov G (2019) Bots and gender profiling using character bigrams notebook for PAN at CLEF 2019. In: CEUR-WS, Conference of 20th working notes of Conference and Labs of the Evaluation Forum, CLEF, vol 2380

  • Espinosa DY, Gómez-Adorno H, Sidorov G (2020a) Profiling fake news spreaders using character and words n-grams. In: Working notes of CLEF 2020—Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, 22–25 September 2020, volume 2696 of CEUR Workshop Proceedings. CEUR-WS.org

  • Espinosa MS, Centeno R, Rodrigo Á (2020b) Analyzing user profiles for detection of fake news spreaders on twitter. In: Working notes of CLEF 2020—Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, 22–25 September 2020, volume 2696 of CEUR Workshop Proceedings. CEUR-WS.org

  • Fagni T, Tesconi M (2019) Profiling twitter users using autogenerated features invariant to data distribution notebook for PAN at CLEF 2019. In: CEUR-WS, Conference of 20th working notes of Conference and Labs of the Evaluation Forum, CLEF, vol 2380

  • Fernández JL, Ramírez JAL (2020) Approaches to the profiling fake news spreaders on twitter task in English and Spanish. In: Working notes of CLEF 2020—Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, 22–25 September 2020, volume 2696 of CEUR Workshop Proceedings. CEUR-WS.org

  • Fernquist J (2019) A four feature types approach for detecting bot and gender of twitter users notebook for PAN at CLEF 2019. In: CEUR-WS, Conference of 20th working notes of Conference and Labs of the Evaluation Forum, CLEF, vol 2380

  • Gallagher E, Suárez-Serrato P, Velazquez Richards E (2019) Socialbots whitewashing contested elections; a case study from Honduras. Adv Intell Syst Comput 797:547–552

    Google Scholar 

  • Gamallo P, Almatarneh S (2019) Naive-Bayesian classification for bot detection in twitter notebook for PAN at CLEF 2019. In: CEUR-WS, Conference of 20th working notes of Conference and Labs of the Evaluation Forum, CLEF, vol 2380

  • García D, Larriba Flor A (2017) Stance detection at IberEval 2017: a biased representation for a biased problem. In: CEUR-WS, Conference of 2nd workshop on Evaluation of Human Language Technologies for Iberian Languages, IberEval, vol 1881, pp 204–209

  • Germani F, Biller-Adorno N (2020) The anti-vaccination infodemic on social media: a behavioral analysis. Lancet Digit Health 2(10):504–505

    Article  Google Scholar 

  • Giachanou A, Ghanem B (2019) Bot and gender detection using textual and stylistic information notebook for pan at CLEF 2019. In: CEUR-WS, Conference of 20th working notes of Conference and Labs of the Evaluation Forum, CLEF, vol 2380

  • Giachanou A, Rosso P (2020) The battle against online harmful information: the cases of fake news and hate speech. In: CIKM ’20: the 29th ACM International Conference on Information and Knowledge Management, Virtual Event, Ireland, 19–23 October 2020. ACM, pp 3503–3504

  • Giachanou A, Rosso P, Crestani F (2019) Leveraging emotional signals for credibility detection. In: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2019, Paris, France, 21–25 July 2019. ACM, pp 877–880

  • Giglou HB, Razmara J, Rahgouy M, Sanaei M (2020) Lsaconet: a combination of lexical and conceptual features for analysis of fake news spreaders on twitter. In: Working notes of CLEF 2020—Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, 22–25 September 2020, volume 2696 of CEUR Workshop Proceedings. CEUR-WS.org

  • Gishamer F (2019) Using hashtags and pos-tags for author profiling notebook for PAN at CLEF 2019. In: CEUR-WS, Conference of 20th working notes of Conference and Labs of the Evaluation Forum, CLEF, vol 2380

  • González J-A, Pla F, Hurtado L (2017) ELiRF-UPV at IberEval 2017: stance and gender detection in tweets. In: CEUR-WS, Conference of 2nd workshop on Evaluation of Human Language Technologies for Iberian Languages, IberEval, vol 1881, pp 193–198

  • González J, Hurtado L, Pla F (2018) ELiRF-UPV at MultiStanceCat 2018. In: Proceedings of the third workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval) colocated with 34th Conference of the Spanish Society for Natural Language Processing (SEPLN), Sevilla, Spain, volume 2150 of CEUR Workshop Proceedings, pp 173–179

  • Goubin R, Lefeuvre D, Alhamzeh A, Mitrović J, Egyed-Zsigmond E, Ghemmogne Fossi L (2019) Bots and gender profiling using a multi-layer architecture notebook for PAN at CLEF 2019. In: CEUR-WS, Conference of 20th working notes of Conference and Labs of the Evaluation Forum, CLEF, vol 2380

  • Graells-Garrido E, Baeza-Yates R, Lalmas M (2020) Every colour you are: stance prediction and turnaround in controversial issues. In: 12th ACM Conference on Web Science, pp 174–183

  • Gómez V, Kappen H, Litvak N, Kaltenbrunner A (2013) A likelihood-based framework for the analysis of discussion threads. World Wide Web 16(5–6):645–675

    Article  Google Scholar 

  • HaCohen-Kerner Y, Manor N, Goldmeier M (2019) Bots and gender profiling of tweets using word and character N-grams notebook for PAN at CLEF

  • Halvani O, Marquardt P (2019) An unsophisticated neural bots and gender profiling system notebook for PAN at CLEF 2019. In: CEUR-WS, Conference of 20th working notes of Conference and Labs of the Evaluation Forum, CLEF, vol 2380

  • Hashemi A, Zarei MR, Moosavi MR, Taheri M (2020) Fake news spreader identification in twitter using ensemble modeling. notebook for PAN at CLEF 2020. In: Working notes of CLEF 2020—Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, 22–25 September 2020, volume 2696 of CEUR Workshop Proceedings. CEUR-WS.org

  • Howard J, Ruder S (2018) Universal language model fine-tuning for text classification. In: Proceedings of the 56th annual meeting of the Association for Computational Linguistics, ACL 2018, Melbourne, Australia, 15–20 July 2018, volume 1: Long Papers. Association for Computational Linguistics, pp 328–339

  • Ikae C, Savoy J (2020) Unine at PAN-CLEF 2020: profiling fake news spreaders on twitter. In: Working notes of CLEF 2020—Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, 22–25 September 2020, volume 2696 of CEUR Workshop Proceedings. CEUR-WS.org

  • Jimenez-Villar V, Sánchez-Junquera J, Montes-Y-Gómez M, Villaseñor-Pineda L, Ponzetto S (2019) Bots and gender profiling using masking techniques notebook for pan at clef 2019. In: CEUR-WS, Conference of 20th working notes of Conference and Labs of the Evaluation Forum, CLEF, vol 2380

  • Johansson F (2019) Supervised classification of twitter accounts based on textual content of tweets notebook for PAN at CLEF 2019. In: CEUR-WS, Conference of 20th working notes of Conference and Labs of the Evaluation Forum, CLEF, vol 2380

  • Khaund T, Al-Khateeb S, Tokdemir S, Agarwal N (2018) Analyzing social bots and their coordination during natural disasters. In: Conference of 11th International Conference on Social Computing, Behavioral-Cultural Modeling, and Prediction conference and Behavior Representation in Modeling and Simulation, SBP-BRiMS, in Lecture Notes in Computer Science, LNCS, vol 10899. Springer, pp 207–212

  • Kollanyi B, Howard PN, Woolley SC (2016) Bots and automation over twitter during the first U.S. election. Data Memo 2016.4. Oxford, UK: Project on Computational Propaganda

  • Koloski B, Pollak S, Skrlj B (2020) Multilingual detection of fake news spreaders via sparse matrix factorization. In: Working notes of CLEF 2020—Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, 22–25 September 2020, volume 2696 of CEUR Workshop Proceedings. CEUR-WS.org

  • Labadie R, Castro-Castro D, Bueno RO (2020) Fusing stylistic features with deep-learning methods for profiling fake news spreader. In: Working notes of CLEF 2020—Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, 22–25 September 2020, volume 2696 of CEUR Workshop Proceedings. CEUR-WS.org

  • Lai M, Cignarella A, Farías D (2017) ITACOS at IberEval2017: detecting stance in Catalan and Spanish tweets. In: CEUR-WS, Conference of 2nd workshop on Evaluation of Human Language Technologies for Iberian Languages, IberEval, vol 1881, pp 185–192

  • Lai M, Cignarella A, Hernández Farías D, Bosco C, Patti V, Rosso P (2020) Multilingual stance detection in social media political debates. Comput Speech Lang 63:101075

    Article  Google Scholar 

  • Lichouri M, Abbas M, Benaziz B (2020) Profiling fake news spreaders on twitter based on TFIDF features and morphological process. Notebook for PAN at CLEF 2020. In: Working Notes of CLEF 2020—Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, 22–25 September 2020, volume 2696 of CEUR Workshop Proceedings. CEUR-WS.org

  • Liu H, Singh P (2004) Conceptnet—a practical commonsense reasoning tool-kit. BT Technol J 22:211–226

    Article  Google Scholar 

  • Liu Y, Ott M, Goyal N, Du J, Joshi M, Chen D, Levy O, Lewis M, Zettlemoyer L, Stoyanov V (2019) Roberta: a robustly optimized BERT pretraining approach. CoRR arXiv:1907.11692

  • López Á, Martí P (2020) Profiling fake news spreaders on twitter. In: Working notes of CLEF 2020—Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, 22–25 September 2020, volume 2696 of CEUR Workshop Proceedings. CEUR-WS.org

  • López-Santillán R, González-Gurrola L, Montes-Y-Gómez M, Ramírez-Alonso G, Prieto-Ordaz O (2019) An evolutionary approach to build user representations for profiling of bots and humans in twitter notebook for PaN at CLEF 2019. In: CEUR-WS, Conference of 20th working notes of Conference and Labs of the Evaluation Forum, CLEF, vol 2380

  • Ma J, Gao W, Joty SR, Wong K (2020) An attention-based rumor detection model with tree-structured recursive neural networks. ACM Trans Intell Syst Technol (ACM-TIST) 11(4):42:1-42:28

    Google Scholar 

  • Magallón Rosa R (2019) Verificado Mexico 2018. Disinformation and fact-checking on electoral campaign [Verificado México (2018) Desinformación y fact-checking en campaña electoral]. Revista de Comunicacion 18(1):234–258

    Article  Google Scholar 

  • Majumder S, Das D (2020) Detecting fake news spreaders on twitter using universal sentence encoder. In: Working notes of CLEF 2020—Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, 22–25 September 2020, volume 2696 of CEUR Workshop Proceedings. CEUR-WS.org

  • Manna R, Pascucci A, Monti J (2020) Profiling fake news spreaders through stylometry and lexical features. unior NLP @pan2020. In: Working notes of CLEF 2020—Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, 22–25 September 2020, volume 2696 of CEUR Workshop Proceedings. CEUR-WS.org

  • Mendoza M, Poblete B, Castillo C (2010) Twitter under crisis: can we trust what we rt? In: Proceedings of the 1st workshop on Social Media Analytics, SOMA 2010, Washington, USA, 28 June 2010, pp 71–79

  • Mendoza M, Tesconi M, Cresci S (2020) Bots in social and interaction networks: detection and impact estimation. ACM Trans Inf Syst (TOIS) 39(1):1–32

    Article  Google Scholar 

  • Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J, (2013) Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems, 2013 Proceedings of a meeting held December 5–8, 2013. Lake Tahoe, Nevada, United States, pp 3111–3119

  • Mohammad S, Turney PD (2013) Crowdsourcing a word-emotion association lexicon. Comput Intell 29(3):436–465

    Article  MathSciNet  Google Scholar 

  • Molina-González MD, Martínez-Cámara E, Martín-Valdivia MT, Perea-Ortega JM (2013) Semantic orientation for polarity classification in Spanish reviews. Expert Syst Appl 40(18):7250–7257

    Article  Google Scholar 

  • Montañés R, Aznar R, Nogueras S, Segura P, Langarita R, Meléndez E, Peña P, Del Hoyo R (2018) Social media monitoring [Monitorizacion de Social Media]. Procesamiento de Lenguaje Natural 61:177–180

    Google Scholar 

  • Oliveira R, De Andrade C, Figuerêdo J, Rocha-Junior J, Calumby R, Da Conceição Silva I, Da Silva Neto A (2019) Bot and gender identification: textual analysis of tweets notebook for PAN at CLEF 2019. In: CEUR-WS, Conference of 20th working notes of Conference and Labs of the Evaluation Forum, CLEF, vol 2380

  • Onose C, Nedelcu C-M, Cercel D-C, Trausan-Matu S (2019) A hierarchical attention network for bots and gender profiling notebook for PaN at CLEF 2019. In: CEUR-WS, Conference of 20th working notes of Conference and Labs of the Evaluation Forum, CLEF, vol 2380

  • Pardo FMR, Giachanou A, Ghanem B, Rosso P (2020) Overview of the 8th author profiling task at PAN 2020: profiling fake news spreaders on twitter. In: Working notes of CLEF 2020—Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, 22–25 September 2020, volume 2696 of CEUR Workshop Proceedings

  • Pastor-Galindo J, Zago M, Nespoli P, Bernal SL, Celdrán AH, Pérez MG, Valiente JAR, Pérez GM, Mármol FG (2020a) Spotting political social bots in twitter: a use case of the 2019 Spanish general election. IEEE Trans Netw Serv Manag 17(4):2156–2170

    Article  Google Scholar 

  • Pastor-Galindo J, Zago M, Nespoli P, Bernal SL, Celdrán AH, Pérez MG, Valiente JAR, Pérez GM, Mármol FG (2020b) Twitter social bots: the 2019 Spanish general election data. Data Brief 32:106047

    Article  Google Scholar 

  • Pennington J, Socher R, Manning CD (2014) Glove: global vectors for word representation. In: Proceedings of the 2014 conference on Empirical Methods in Natural Language Processing, EMNLP 2014, October 25-29, 2014, Doha, Qatar, A meeting of SIGDAT, a Special Interest Group of the ACL, pp 1532–1543

  • Petrik J, Chuda D (2019) Bots and gender profiling with convolutional hierarchical recurrent neural network notebook for PAN at CLEF 2019. In: CEUR-WS, Conference of 20th working notes of Conference and Labs of the Evaluation Forum, CLEF, vol 2380

  • Pimentel B, Portugal R (2020) Fake news in Spanish: towards the building of a corpus based on Twitter. Commun Comput Inf Sci (CCIS) 1070:333–339

    Google Scholar 

  • Pinnaparaju N, Indurthi V, Varma V (2020) Identifying fake news spreaders in social media. In: Working notes of CLEF 2020—Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, 22–25 September 2020, volume 2696 of CEUR Workshop Proceedings. CEUR-WS.org

  • Pizarro J (2019) Using N-grams to detect Bots on Twitter Notebook for PAN at CLEF 2019. In: CEUR-WS, Conference of 20th Working Notes of Conference and Labs of the Evaluation Forum, CLEF, vol 2380

  • Pizarro J (2020) Using n-grams to detect fake news spreaders on twitter. In: Working notes of CLEF 2020—Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, 22–25 September 2020, volume 2696 of CEUR Workshop Proceedings. CEUR-WS.org

  • Polignano M, De Pinto M, Lops P, Semeraro G (2019) Identification of bot accounts in Twitter using 2D CNNs on user-generated contents notebook for PAN at CLEF 2019. In: CEUR-WS, Conference of 20th working notes of Conference and Labs of the Evaluation Forum, CLEF, vol 2380

  • Posadas-Durán J-P, Gomez-Adorno H, Sidorov G, Escobar J (2019) Detection of fake news in a new corpus for the Spanish language. J Intell Fuzzy Syst 36(5):4868–4876

    Google Scholar 

  • Przybyła P (2019) Detecting bot accounts on twitter by measuring message predictability notebook for PAN at CLEF 2019. In: CEUR-WS, Conference of 20th working notes of Conference and Labs of the Evaluation Forum, CLEF, vol 2380

  • Rangel F, Rosso P (2019) Overview of the 7th author profiling task at Pan 2019: bots and gender profiling in twitter. In: CEUR-WS, Conference of 20th working notes of Conference and Labs of the Evaluation Forum, CLEF, vol 2380

  • Russo I (2020) Sadness and fear: classification of fake news spreaders content on twitter. In: Working notes of CLEF 2020—Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, 22–25 September 2020, volume 2696 of CEUR Workshop Proceedings. CEUR-WS.org

  • Salazar ME, Tenorio AG, Naranjo ZL (2020) Evaluation of the precision of the binary classification models for the identification of true or false news in Costa Rica. Revista Iberica de Sistemas e Tecnologias de Informacao (RISTI) 2020(E38):156–170

    Google Scholar 

  • Saralegi X, Vicente IS (2013) Elhuyar at tweet-norm 2013. In: Proceedings of the tweet normalization workshop co-located with 29th conference of the Spanish Society for Natural Language Processing (SEPLN 2013), Madrid, Spain, 20 September 2013, pp 64–68

  • Segura-Bedmar I (2018) LABDA’s early steps toward multimodal stance detection. In: Proceedings of the third workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval) colocated with 34th Conference of the Spanish Society for Natural Language Processing (SEPLN), Sevilla, Spain, volume 2150 of CEUR Workshop Proceedings, pp 180–186

  • Shashirekha HL, Balouchzahi F (2020) Ulmfit for twitter fake news spreader profiling. In: Working notes of CLEF 2020—Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, 22–25 September 2020, volume 2696 of CEUR Workshop Proceedings. CEUR-WS.org

  • Shashirekha HL, Anusha MD, Prakash NS (2020) Ensemble model for profiling fake news spreaders on twitter. In: Working notes of CLEF 2020—Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, 22–25 September 2020, volume 2696 of CEUR Workshop Proceedings. CEUR-WS.org

  • Shrestha A, Spezzano F, Joy A (2020) Detecting fake news spreaders in social networks via linguistic and personality features. In: Working notes of CLEF 2020—Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, 22–25 September 2020, volume 2696 of CEUR Workshop Proceedings. CEUR-WS.org

  • Speer R, Chin J, Havasi C (2016) Conceptnet 5.5: an open multilingual graph of general knowledge. CoRR, arXiv:1612.03975

  • Srinivasarao M, Manu S (2019) Bots and gender profiling using character and word N-grams notebook for PAN at CLEF 2019. In: Conference of 20th working notes of CLEF Conference and Labs of the Evaluation Forum, vol 2380

  • Suárez-Serrato P, Richards E. Velázquez, Yazdani M (2018) Socialbots supporting human rights. In: AIES—Proceedings AAAI/ACM Conference on AI, Ethics, and Society. Association for Computing Machinery, Inc, Conference of 1st AAAI/ACM—AI, Ethics, and Society, AIES, pp 290–296

  • Swami S, Khandelwal A, Shrivastava M, Akhtar S (2017) LTRC IIITH at IBEREVAL 2017: stance and gender detection in tweets on catalan independence. In: CEUR-WS, Conference of 2nd workshop on Evaluation of Human Language Technologies for Iberian Languages, IberEval, vol 1881, pp 199–203

  • Swire-Thompson B, Lazer D (2020) Public health and online misinformation: challenges and recommendations. Annu Rev Public Health 41(1):433–451

    Article  Google Scholar 

  • Sánchez-Casado N, Cegarra-Navarro J, Tomaseti-Solano E (2015) Linking social networks to utilitarian benefits through counter-knowledge. Online Inf Rev 39(2):179–196

    Article  Google Scholar 

  • Taulé M, Martí M, Rangel F, Rosso P, Bosco C, Patti V (2017) Overview of the task on stance and gender detection in tweets on catalan independence at IberEval 2017. In: CEUR-WS, Conference of 2nd workshop on Evaluation of Human Language Technologies for Iberian Languages, IberEval, vol 1881, pp 157–177

  • Taulé M, Rangel F, Martí M Antònia, Rosso P (2018) Overview of the task on multimodal stance detection in Tweets on catalan #1Oct referendum. In: CEUR-WS, Conference of 3rd workshop on Evaluation of Human Language Technologies for Iberian Languages, IberEval, vol 2150, pp 149–166

  • Tiedemann J (2012) Parallel data, tools and interfaces in OPUS. In: Proceedings of the Eighth International Conference on Language Resources and Evaluation, LREC 2012, Istanbul, Turkey, 23–25 May 2012. European Language Resources Association (ELRA), pp 2214–2218

  • Valarezo-Cambizaca L-M, Rodríguez-Hidalgo C (2019) Innovation in journalism as an antidote to fake news [La innovación en el periodismo como antídoto ante las fake news]. RISTI Revista Iberica de Sistemas e Tecnologias de Informacao E20:24–35

    Google Scholar 

  • Valencia A Valencia, Adorno H, Rhodes C, Pineda G (2019) Bots and gender identification based on stylometry of tweet minimal structure and n-grams model notebook for PAN at CLEF 2019. In: CEUR-WS, Conference of 20th working notes of Conference and Labs of the Evaluation Forum, CLEF, vol 2380

  • Van Halteren H (2019) Bot and gender recognition on tweets using feature count deviations Notebook for PAN at CLEF 2019. In: CEUR-WS, Conference of 20th working notes of Conference and Labs of the Evaluation Forum, CLEF, vol 2380

  • Varol O, Ferrara E, Davis CA, Menczer F, Flammini A (2017) Online human-bot interactions: detection, estimation, and characterization. In: Proceedings of the eleventh International Conference on Web and Social Media, ICWSM 2017, Montréal, Québec, Canada, 15–18 May 2017, pp 280–289

  • Velazquez Richards E, Gallagher E, Suárez-Serrato P (2019) Boostnet: bootstrapping detection of socialbots, and a case study from Guatemala. In: Conference of 33rd National Forum of Statistics, FNE 2018 and 13th Latin-American Congress of Statistical Societies, CLATSE, vol 301. Springer, pp 145–154

  • Villatoro-Tello E, Ramírez-de-la-Rosa G, Kumar S, Parida S, Motlícek P (2020) Idiap and UAM participation at MEX-A3T evaluation campaign. In: Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2020) co-located with 36th Conference of the Spanish Society for Natural Language Processing (SEPLN 2020), Málaga, Spain, 23 September 2020, volume 2664 of CEUR Workshop Proceedings. CEUR-WS.org, pp 252–257

  • Vinayakumar R, Kumar S Sachin, Premjith B, Prabaharan P, Soman K (2017) Deep stance and gender detection in tweets on Catalan independence@IberEval 2017. In: CEUR-WS, Conference of 2nd Workshop on Evaluation of Human Language Technologies for Iberian Languages, IberEval, vol 1881, pp 222–229

  • Vogel I, Jiang P (2019) Bot and gender identification in Twitter using word and character n-grams notebook for PAN at CLEF 2019. In: CEUR-WS, Conference of 20th working notes of Conference and Labs of the Evaluation Forum, CLEF, vol 2380

  • Vogel I, Meghana M (2020) Fake news spreader detection on twitter using character n-grams. In: Working notes of CLEF 2020—Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, 22–25 September 2020, volume 2696 of CEUR Workshop Proceedings. CEUR-WS.org

  • Volkova S, Bell E (2017) Identifying effective signals to predict deleted and suspended accounts on Twitter across languages. In: Proceedings of the 11th International Conference on Web and Social Media, ICWSM. AAAI Press, pp 290–298

  • Vosoughi S, Roy D, Aral S (2018) The spread of true and false news online. Science 359(6380):1146–1151

  • Wojatzki M, Zesch T (2017) Neural, non-neural and hybrid stance detection in tweets on Catalan independence. In: CEUR-WS, Conference of 2nd workshop on Evaluation of Human Language Technologies for Iberian Languages, IberEval, vol 1881, pp 178–184

  • Yang Y, Cer D, Ahmad A, Guo M, Law J, Constant N, Ábrego GH, Yuan S, Tar C, Sung Y, Strope B, Kurzweil R (2020) Multilingual universal sentence encoder for semantic retrieval. In: Proceedings of the 58th annual meeting of the Association for Computational Linguistics: System Demonstrations, ACL 2020, Online, 5–10 July 2020. Association for Computational Linguistics, pp 87–94

  • Zaizar-Gutiérrez D, Fajardo-Delgado D, Carmona M Á Á (2020) Itcg’s participation at MEX-A3T 2020: aggressive identification and fake news detection based on textual features for Mexican Spanish. In: Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2020) co-located with 36th Conference of the Spanish Society for Natural Language Processing (SEPLN 2020), Málaga, Spain, 23 September 2020, volume 2664 of CEUR Workshop Proceedings. CEUR-WS.org, pp 258–264

  • Zhang X, Ghorbani AA (2020) An overview of online fake news: characterization, detection, and discussion. Inf Process Manag 57(2):102025

  • Zhou X, Zafarani R (2020) A survey of fake news: fundamental theories, detection methods, and opportunities. ACM Comput Surv (CSUR) 53(5):1–40

  • Zotova E, Agerri R, Nuñez M, Rigau G (2020) Multilingual stance detection: the Catalonia independence corpus

  • Zubiaga A, Kochkina E, Liakata M, Procter R, Lukasik M (2016) Stance classification in rumours as a sequential task exploiting the tree structure of social media conversations. In: COLING 2016, 26th international conference on Computational Linguistics, Proceedings of the Conference: Technical Papers, 11–16 December 2016, Osaka, Japan, pp 2438–2448

  • Zubiaga A, Aker A, Bontcheva K, Liakata M, Procter R (2018) Detection and resolution of rumours in social media: a survey. ACM Comput Surv 51(2):32:1–32:36

Acknowledgements

Mr. Mendoza acknowledges funding from the Millennium Institute for Foundational Research on Data. Mr. Mendoza was also funded by ANID PIA/APOYO AFB180002 and ANID FONDECYT 1200211.

Author information

Corresponding author

Correspondence to Marcelo Mendoza.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

1.1 Review planning

The review planning step starts by defining the search keywords used to retrieve the first body of literature. These keywords were combined using logical AND and OR connectors to control the coverage of the documents matched by the search system. The search strings cover the three variants of the problem that are the object of this study: stance, rumors, and bots. To locate the results of the Spanish-speaking community, we added the keyword Spanish to each of these terms. We also included Twitter as a keyword, since it is the social media platform on which the most cited studies in English concentrate (Zhang and Ghorbani 2020). To avoid restricting the search results to English publications, we also used these search strings in Spanish. The set of search strings used in the review is shown in Table 1.

Table 1 Search strings used in the SLR. A total of 4506 results were retrieved using these search strings

The search in Scopus was restricted to works published since 2009, which rules out works on rumors unrelated to the explosion of this phenomenon in social media. The works were also restricted to two areas of knowledge: Computer Science and Engineering. In this way, the retrieved papers include works on automatic detection methods, which are the focus of this study.
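
The way search terms can be combined with logical connectors is sketched below. This is only an illustration: the term lists, the `build_query` helper, and the exact way the AND and OR connectors are nested are assumptions made for the example, and the resulting strings do not reproduce the search strings of Table 1.

```python
from typing import List

# Hypothetical term lists; the strings actually used are those of Table 1.
TASK_TERMS = ["stance", "rumors", "bots"]
CONTEXT_TERMS = ["Spanish", "Twitter"]


def build_query(task: str, contexts: List[str]) -> str:
    """Join one task term (AND) with an OR-group of context terms."""
    context_group = " OR ".join(contexts)
    return f"{task} AND ({context_group})"


# One query per task variant studied in the survey.
queries = [build_query(task, CONTEXT_TERMS) for task in TASK_TERMS]
for query in queries:
    print(query)
```

Widening the OR-group increases coverage, while each additional AND term narrows it, which is the trade-off the connectors are meant to control.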

The review planning process also defines inclusion/exclusion criteria. These criteria are subsequently used in the literature screening stage, during which the content of the works retrieved during the search phase is reviewed. The works that meet the inclusion criteria and do not match any exclusion criterion are included in the final body of literature. We first define a list of four exclusion criteria:

  • Exclusion criterion 1 (ExCr1): When an article appears in more than one search, it is considered only once. Accordingly, articles repeated in the search results are eliminated, as well as versions of the same work published in different media (duplication by media).

  • Exclusion criterion 2 (ExCr2): Articles written in a language other than Spanish or English are not considered.

  • Exclusion criterion 3: Reviews (ExCr3-a), editorials (ExCr3-b), notes and errata (ExCr3-c), and conference reviews (ExCr3-d) are not considered.

  • Exclusion criterion 4 (ExCr4): Articles whose title or abstract does not refer to the subject of the study (semantic mismatch) are discarded.

The inclusion criteria consider two items:

  • Inclusion criterion 1 (InCr1): Three sections of the work are reviewed: abstract, introduction, and conclusion. We verify whether the work focuses on solving any of the tasks that are the object of this study in Spanish.

  • Inclusion criterion 2 (InCr2): If applying inclusion criterion 1 yields no conclusive evidence, the full article is read. If the work does not address any of the tasks in the Spanish language, the paper is discarded.
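
The criteria above can be read as a two-step filter, sketched here in Python. This is one possible reading and not the authors' tooling: the `Paper` record, its fields, and the helper functions are hypothetical, and the judgments stored in each field (for example `on_topic` or `sections_check`) would be made manually by a reviewer.

```python
from dataclasses import dataclass
from typing import List, Optional, Set

EXCLUDED_TYPES = {"review", "editorial", "note", "erratum", "conference review"}
ACCEPTED_LANGUAGES = {"es", "en"}


@dataclass
class Paper:
    key: str                         # identifier shared by duplicates of a work
    language: str                    # e.g. "es" or "en"
    doc_type: str                    # e.g. "article", "review"
    on_topic: bool                   # title/abstract refer to the study
    sections_check: Optional[bool]   # InCr1 verdict; None means inconclusive
    full_text_check: bool = False    # InCr2 verdict, used only if inconclusive


def passes_exclusion(paper: Paper, seen: Set[str]) -> bool:
    if paper.key in seen:                          # ExCr1: duplicate
        return False
    seen.add(paper.key)
    if paper.language not in ACCEPTED_LANGUAGES:   # ExCr2: language
        return False
    if paper.doc_type in EXCLUDED_TYPES:           # ExCr3: document type
        return False
    return paper.on_topic                          # ExCr4: semantic mismatch


def passes_inclusion(paper: Paper) -> bool:
    if paper.sections_check is not None:           # InCr1 was conclusive
        return paper.sections_check
    return paper.full_text_check                   # InCr2: full article read


def screen(papers: List[Paper]) -> List[Paper]:
    seen: Set[str] = set()
    return [p for p in papers if passes_exclusion(p, seen) and passes_inclusion(p)]


papers = [
    Paper("a", "en", "article", True, True),
    Paper("a", "en", "article", True, True),        # duplicate of "a"
    Paper("b", "fr", "article", True, True),        # unsupported language
    Paper("c", "es", "review", True, True),         # excluded document type
    Paper("d", "en", "article", False, None),       # semantic mismatch
    Paper("e", "es", "article", True, None, True),  # settled by full reading
    Paper("f", "en", "article", True, False),       # task not addressed in Spanish
]
print([p.key for p in screen(papers)])  # ['a', 'e']
```

Applying the exclusion criteria first keeps the more expensive manual reading required by the inclusion criteria limited to the documents that survive the cheaper checks.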

1.2 Literature search and screening

The search for papers was carried out during 2020. By applying the search strings to Scopus, we retrieved a total of 4506 documents. This first body of literature was examined by applying the exclusion and inclusion criteria defined in this study. Figure 7 shows how many documents were eliminated after applying the criteria. The reduction of the initial set is substantial. A total of 4360 documents were eliminated using the exclusion criteria. The remaining 146 documents were analyzed using the inclusion criteria. The first inclusion criterion was met by 102 documents, of which 67 also met the second inclusion criterion. As a result, the first body of literature comprises 67 documents.

Fig. 7

Exclusion/inclusion criteria applied to the documents detected in this SLR. The study considered two stages, the first based on the documents identified using search strings and the second based on the articles that cited the first body of literature. A total of 94 documents met the exclusion/inclusion criteria. Finally, after reviewing the references of the selected documents, three more papers that satisfied the exclusion/inclusion criteria were added to the survey

The second body of literature was created by analyzing the works that cite the first body of literature. These citing works include related work relevant to those articles, which provides an important source of papers connected to the survey subject that were not detected using the search strings. A total of 194 documents were identified in this process, which were reduced to 69 after applying the exclusion criteria and to 27 after applying the inclusion criteria.

Both stages of the systematic review made it possible to identify a total of 94 documents. We then conducted an exhaustive review of the references of these documents, looking for works related to the subject of this survey that had not been detected in the previous two stages. In this last process, three more papers were identified, which passed the exclusion criteria and matched both inclusion criteria. In total, the SLR allowed the identification of 97 works related to the subject of this survey. The total number of papers per task is shown in Fig. 8.
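
As a quick consistency check, the counts reported for each stage can be recomputed in a few lines of Python. The variable names are ours; all figures are taken from the text above.

```python
# Stage 1: documents retrieved from Scopus with the search strings.
stage1_retrieved = 4506
stage1_excluded = 4360
stage1_after_exclusion = stage1_retrieved - stage1_excluded
stage1_selected = 67   # met both inclusion criteria

# Stage 2: works citing the first body of literature.
stage2_selected = 27   # remaining after exclusion and inclusion criteria

# Stage 3: additional papers found in the reference lists.
stage3_selected = 3

both_stages = stage1_selected + stage2_selected
total = both_stages + stage3_selected
print(stage1_after_exclusion, both_stages, total)  # 146 94 97
```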

Fig. 8

Papers per task

Table 2 Labels used to deploy the citation network (see Fig. 5) of the work surveyed in this study

1.3 Acronyms

  • Systematic literature review: SLR

  • Bag-of-Words: BOW

  • Part-of-Speech: POS

  • Term Frequency–Inverse Document Frequency: TF-IDF

  • Latent Semantic Analysis: LSA

  • Universal Language Model Fine-Tuning: ULMFiT

  • Singular Value Decomposition: SVD

  • Recurrent Neural Network: RNN

  • Bidirectional Encoder Representations from Transformers: BERT

  • Supervised Autoencoder: SAE

  • Pointwise Mutual Information: PMI

  • Affective Norms for English Words: AFINN

  • Linguistic Inquiry and Word Count: LIWC

  • Named Entity Recognition: NER

  • Global Vectors for word representation: GloVe

  • Support Vector Machines: SVM

  • Random Forests: RF

  • Logistic Regression: LR

  • Convolutional Neural Networks: CNN

  • Long Short-Term Memory: LSTM

  • Adaptive Boosting: ADABOOST

  • Feed-Forward Neural Networks: FFNN

  • Multinomial Naive Bayes: MNB

  • Document frequency selection: DF

  • Frequently co-occurring entropy: FCE

  • Information Gain: IG

  • Whale optimization: WO

  • Genetic algorithms: GA

  • Particle swarm optimization: PSO

  • Hierarchical Attention Networks: HAN

  • Bidirectional LSTM: Bi-LSTM

  • Gated Recurrent Unit: GRU

  • Spanish Billion Word Corpus and Embeddings: SBWCE

  • The Catalonia Independence Corpus: CIC

About this article

Cite this article

Providel, E., Mendoza, M. Misleading information in Spanish: a survey. Soc. Netw. Anal. Min. 11, 36 (2021). https://doi.org/10.1007/s13278-021-00746-y

  • DOI: https://doi.org/10.1007/s13278-021-00746-y
