Abstract
Artificial intelligence (AI) technology is advancing steadily and becoming increasingly visible in daily life. A prominent example is ChatGPT (Chat Generative Pre-trained Transformer), a chatbot with a conversational AI interface developed by OpenAI. ChatGPT is considered one of the most advanced AI applications and has attracted significant attention worldwide. In this context, this paper investigates how AI-generated data affects fake news detection, evaluating the task on two political fake news datasets. To this end, we create two ChatGPT-generated datasets from the two fake news datasets. We extract features using three different embedding methods and train models on the original training set, then compare their performance on the original news and on ChatGPT-generated news. Likewise, we train models on the ChatGPT-generated training set and perform the same comparison. The findings show that ChatGPT can poison data and mislead fake news detection systems trained on real-life news, and that systems trained on ChatGPT-generated data lose their ability to detect fake news in real-life scenarios.
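The cross-source comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the abstract does not name the three embedding methods or the classifiers, so a whitespace bag-of-words representation and a small Naive Bayes model stand in for them, and every name here (`NaiveBayes`, `cross_evaluate`, the toy sentences) is hypothetical. The example texts are placeholders and are not drawn from either dataset.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer; a stand-in for a real embedding step."""
    return text.lower().split()


class NaiveBayes:
    """Tiny multinomial Naive Bayes over bag-of-words counts (placeholder model)."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.counts[label].update(tokenize(text))
        self.vocab = set().union(*self.counts.values())
        self.totals = {c: sum(self.counts[c].values()) for c in self.classes}
        self.log_prior = {
            c: math.log(labels.count(c) / len(labels)) for c in self.classes
        }
        return self

    def predict(self, texts):
        return [max(self.classes, key=lambda c: self._score(t, c)) for t in texts]

    def _score(self, text, c):
        # Laplace-smoothed log-likelihood of the tokens plus the class log prior.
        denom = self.totals[c] + len(self.vocab)
        return self.log_prior[c] + sum(
            math.log((self.counts[c][tok] + 1) / denom) for tok in tokenize(text)
        )


def accuracy(model, examples):
    """Fraction of (text, label) pairs the model labels correctly."""
    texts, labels = zip(*examples)
    preds = model.predict(texts)
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def cross_evaluate(train_sets, test_sets):
    """Train one model per training source and score it on every test source."""
    results = {}
    for train_name, train in train_sets.items():
        texts, labels = zip(*train)
        model = NaiveBayes().fit(texts, labels)
        for test_name, test in test_sets.items():
            results[(train_name, test_name)] = accuracy(model, test)
    return results


# Placeholder data only: invented sentences, not samples from the paper's datasets.
train_sets = {
    "original": [
        ("shocking secret plot exposed by insiders", "fake"),
        ("miracle claim stuns officials", "fake"),
        ("senate passes budget bill", "real"),
        ("governor signs education bill", "real"),
    ],
    "generated": [
        ("reports allege a hidden scheme was revealed", "fake"),
        ("an astonishing assertion surprised authorities", "fake"),
        ("lawmakers approved the annual spending plan", "real"),
        ("the governor enacted a schooling measure", "real"),
    ],
}
test_sets = {
    "original": [
        ("secret plot stuns senate", "fake"),
        ("governor passes budget", "real"),
    ],
    "generated": [
        ("a hidden scheme surprised lawmakers", "fake"),
        ("the governor approved the spending plan", "real"),
    ],
}

results = cross_evaluate(train_sets, test_sets)
for (train_name, test_name), acc in sorted(results.items()):
    print(f"train={train_name:9s} test={test_name:9s} accuracy={acc:.2f}")
```

The four (train, test) pairs mirror the comparison in the abstract: a model trained on original news is scored on original and generated news, and a model trained on generated news is scored on both as well. Swapping in a real embedding method and classifier would only change `tokenize` and `NaiveBayes`; the evaluation loop stays the same.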
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Li, B., Ju, J., Wang, C., Pan, S. (2023). How Does ChatGPT Affect Fake News Detection Systems? In: Yang, X., et al. Advanced Data Mining and Applications. ADMA 2023. Lecture Notes in Computer Science, vol 14177. Springer, Cham. https://doi.org/10.1007/978-3-031-46664-9_38
DOI: https://doi.org/10.1007/978-3-031-46664-9_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46663-2
Online ISBN: 978-3-031-46664-9
eBook Packages: Computer Science, Computer Science (R0)