Abstract
Large language models (LLMs) are central to modern AI systems and excel at natural language processing tasks. They blur the line between human-written and machine-generated text and are widely used by professional writers across domains, including news article generation. This raises novel concerns about misuse and the production of fake content, making the detection of LLM-written articles a pressing challenge. In this work, we aim to recognize two kinds of LLM-written news: articles generated entirely by LLMs and articles paraphrased from existing news sources. We propose a neural network model that combines linguistic features with BERT contextual embeddings for LLM-written news article detection. In conjunction with the proposed model, we also build a news article corpus based on the BBC dataset, generating and paraphrasing articles through multi-agent cooperation with ChatGPT. Our model achieves 96.57% accuracy and a 96.44% macro F1 score, outperforming existing models and indicating its potential to help readers identify LLM-written news articles. To assess robustness, we construct another corpus from the BBC dataset using a different language model, Claude, and show that our detector maintains strong results. Furthermore, we apply our model to generated-text detection in the medical domain, where it also delivers promising performance.
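The abstract describes a hybrid architecture that fuses hand-crafted linguistic features with BERT contextual embeddings. Below is a minimal sketch of such a fusion classifier, assuming a standard bert-base encoder, an illustrative 16-dimensional linguistic feature vector (e.g. type-token ratio, readability, sentence-length statistics), and a three-way label set (human-written, fully LLM-generated, LLM-paraphrased); the feature set, dimensions, and head layout are assumptions for illustration, not the authors' released implementation.

```python
# Minimal sketch (illustrative, not the paper's code) of a hybrid detector that
# concatenates BERT contextual embeddings with hand-crafted linguistic features.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

class HybridDetector(nn.Module):
    def __init__(self, num_linguistic_features: int = 16, num_classes: int = 3):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        hidden = self.bert.config.hidden_size  # 768 for bert-base
        # Fusion head: BERT [CLS] vector concatenated with linguistic features.
        self.classifier = nn.Sequential(
            nn.Linear(hidden + num_linguistic_features, 256),
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Linear(256, num_classes),  # assumed labels: human / LLM-generated / LLM-paraphrased
        )

    def forward(self, input_ids, attention_mask, linguistic_features):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        cls_vec = out.last_hidden_state[:, 0]  # [CLS] contextual embedding
        fused = torch.cat([cls_vec, linguistic_features], dim=-1)
        return self.classifier(fused)

# Example forward pass on one article with a placeholder linguistic feature vector.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
enc = tokenizer("Example news article text.", return_tensors="pt",
                truncation=True, max_length=512)
model = HybridDetector()
logits = model(enc["input_ids"], enc["attention_mask"], torch.zeros(1, 16))
print(logits.shape)  # torch.Size([1, 3])
```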



Data availability
No datasets were generated or analyzed during the current study.
Funding
This work was partially supported by the National Science and Technology Council (NSTC), Taiwan, under Grant Number 112-2622-E-029-009.
Author information
Contributions
CS Lin contributed to concept development, methodology, investigation, data collection, experiment design and writing.
Ethics declarations
Conflict of interest
The authors declare that they have no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Lin, CS. A hybrid model for the detection of multi-agent written news articles based on linguistic features and BERT. J Supercomput 81, 381 (2025). https://doi.org/10.1007/s11227-024-06882-4