Fake News Detection via English-to-Spanish Translation: Is It Really Useful?

  • Conference paper
Social Computing and Social Media: Experience Design and Social Network Analysis (HCII 2021)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12774)

Abstract

Social networks are used every day to report daily events, although the information published on them often turns out to be fake news. Detecting fake news has become a research topic that can be approached using deep learning. However, most current research on the topic addresses only the English language. When working on fake news detection in other languages, such as Spanish, one barrier is the scarcity of labeled datasets. Hence, we explore whether it is worthwhile to translate an English dataset into Spanish using Statistical Machine Translation. We use the translated dataset to evaluate the accuracy of several deep learning architectures and compare the results obtained on the translated and original datasets in fake news classification. Our results suggest that the approach is feasible, although it requires high-quality translation techniques, such as neural machine translation models.
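The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names `accuracy` and `accuracy_gap` are hypothetical, and the labels and predictions are toy values invented for illustration (1 = fake, 0 = real), not results from the paper. It only shows how the accuracy of a classifier on original English text might be compared with its accuracy on the Spanish translation of the same items.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of items whose predicted label matches the gold label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def accuracy_gap(
    y_true: Sequence[int],
    pred_original: Sequence[int],
    pred_translated: Sequence[int],
) -> float:
    """Accuracy on the original data minus accuracy on the translated data."""
    return accuracy(y_true, pred_original) - accuracy(y_true, pred_translated)


# Toy data (assumption, for illustration only): the same eight news items,
# classified once from the English text and once from the Spanish translation.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
pred_en = [1, 0, 1, 1, 0, 0, 0, 0]  # predictions on the original English text
pred_es = [1, 0, 0, 1, 0, 1, 0, 0]  # predictions on the translated Spanish text

if __name__ == "__main__":
    print("accuracy (original):  ", accuracy(labels, pred_en))
    print("accuracy (translated):", accuracy(labels, pred_es))
    print("gap:                  ", accuracy_gap(labels, pred_en, pred_es))
```

A positive gap would indicate that translation cost some classification accuracy; in practice the two models would be trained and evaluated on matching splits so that the difference reflects translation quality rather than data differences.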

Notes

  1. http://opus.nlpl.eu/.

  2. https://www.statmt.org/europarl/.

  3. https://github.com/jpposadas/FakeNewsCorpusSpanish.

  4. https://github.com/dcaled/FTR-18.

  5. http://opus.nlpl.eu/News-Commentary.php.

Acknowledgements

Mr. Mendoza acknowledges funding from the Millennium Institute for Foundational Research on Data. Mr. Mendoza was also funded by ANID PIA/APOYO AFB180002 and ANID FONDECYT 1200211.

Corresponding author

Correspondence to Marcelo Mendoza.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Ruíz, S., Providel, E., Mendoza, M. (2021). Fake News Detection via English-to-Spanish Translation: Is It Really Useful? In: Meiselwitz, G. (ed.) Social Computing and Social Media: Experience Design and Social Network Analysis. HCII 2021. Lecture Notes in Computer Science, vol 12774. Springer, Cham. https://doi.org/10.1007/978-3-030-77626-8_9

  • DOI: https://doi.org/10.1007/978-3-030-77626-8_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-77625-1

  • Online ISBN: 978-3-030-77626-8

  • eBook Packages: Computer Science, Computer Science (R0)
