Survey of fake news detection using machine intelligence approach

https://doi.org/10.1016/j.datak.2022.102118

Abstract

With information spreading extensively across digital platforms, it is of utmost importance that everyone be able to distinguish real information from fake. Fake news is a major problem in our society: we cannot tell whether a news item is fake or real without knowledge or proof concerning it. Because this has become a serious problem, we set out to create a solution. We built a small model that helps detect fake news, working with articles collected from the internet, each of which we labeled as either fake or true. We trained our models on these articles and compared the results of different machine learning algorithms: Passive Aggressive Classifier, Naïve Bayes, Logistic Regression, Decision Tree, Long Short-Term Memory (LSTM), and Bidirectional Encoder Representations from Transformers (BERT). In our experiments, the Decision Tree algorithm achieved 99.6% accuracy and the LSTM obtained 99.8% recall for fake news detection. The Passive Aggressive Classifier performs excellently on a large data set.

Introduction

With the advent of globalization and the rapid development of online platforms (including Facebook and Twitter), an avenue for information exchange has opened up that has never before been seen in human history [1], [2]. The propagation of false news also has a tremendous impact on the rest of the world, and it too spreads via social media platforms such as Facebook and Twitter [3], [4]. Our ability to make judgments is largely determined by the knowledge we absorb; our viewpoint is influenced by the information we consume. There is mounting evidence that people have responded irrationally to news that afterward proved to be false [5], [6].

This comprehensive machine learning (ML) based research article on identifying false news is concerned with both fake and genuine news [7]. Using scikit-learn (the sklearn module), a free Python-based ML library, and a term frequency-inverse document frequency (TF-IDF) vectorizer, we can quantify how important each token in our dataset is. We then initialize the ML models and fit them on the tokens. We consider the following ML models: Passive Aggressive Classifier, Naïve Bayes, Logistic Regression, Decision Tree, LSTM, and BERT.
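A minimal sketch of the workflow described above, assuming scikit-learn is installed: a TF-IDF vectorizer turns each article into weighted token features, and several of the classical classifiers named in the text are fitted on them. The toy corpus, the labels, the helper names `build_models` and `train_all`, and all hyperparameters are illustrative assumptions and are not taken from the paper. LSTM and BERT are left out because they require a deep learning framework.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression, PassiveAggressiveClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy corpus, invented for illustration (not the paper's data).
texts = [
    "Scientists confirm miracle pill cures every disease overnight",
    "Celebrity secretly replaced by robot, insiders claim",
    "Aliens endorse candidate in shocking leaked video",
    "Government hides proof that the moon is hollow",
    "City council approves budget for new public library",
    "Central bank leaves interest rates unchanged this quarter",
    "Local team wins regional championship after extra time",
    "University publishes annual report on student enrollment",
]
labels = ["FAKE", "FAKE", "FAKE", "FAKE", "REAL", "REAL", "REAL", "REAL"]


def build_models():
    """Return fresh, unfitted classifiers keyed by a display name."""
    return {
        "Passive Aggressive": PassiveAggressiveClassifier(max_iter=1000, random_state=0),
        "Naive Bayes": MultinomialNB(),
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Decision Tree": DecisionTreeClassifier(random_state=0),
    }


def train_all(train_texts, train_labels):
    """Fit one TF-IDF + classifier pipeline per model and return them."""
    fitted = {}
    for name, clf in build_models().items():
        # Each pipeline learns its own vocabulary and IDF weights from the
        # training texts, then fits the classifier on the TF-IDF matrix.
        pipeline = make_pipeline(TfidfVectorizer(), clf)
        pipeline.fit(train_texts, train_labels)
        fitted[name] = pipeline
    return fitted


fitted_models = train_all(texts, labels)

# Classify a new, user-supplied headline with every fitted pipeline.
user_input = "Leaked video shows robot replacing city council"
for name, pipeline in fitted_models.items():
    print(f"{name}: {pipeline.predict([user_input])[0]}")
```

Wrapping the vectorizer and the classifier in one pipeline means raw text can be passed straight to `predict`, so new input is always transformed with the vocabulary learned during training. On a corpus this small the predictions carry no weight; the sketch only shows the mechanics.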

In the end, the accuracy score and the confusion matrix tell us how well our model fares. This gives us an approximate measure of our model, indicating whether it is performing properly. After that, we can accept user input and determine whether it is fake or real [8], [9], [10].

The remainder of the paper is organized as follows. The different types of algorithms are explained in the related work section. The problem of fake news is noted early on. The solution to this problem and the algorithms used are presented in the evaluation section, and the final results are given in the results and discussion section. Finally, the conclusion is presented.

Related work

This section deals with the basics of the existing algorithms.

Evaluation

Social media sites are powerful and beneficial for discussing issues that are vital to society. Their use also needs to follow ethics. It is our responsibility to ensure that the news being spread is correct.

Results and discussion

Each time we trained the model, the fake news detection algorithm returned an accurate score. When we train the model on a larger set of data, we obtain the best test results. An approximate value is simple to calculate: we take the mean of the generated vector and compare it with our training data to see whether positive outcomes predominate, which indicates whether the news is true or false.

According to accuracy,

Conclusions

We categorized the news in a word set and discovered the proper key terms to create an appropriate model. We used different ML algorithms to train our model. A confusion matrix is used to measure performance and to make correct decisions concerning articles labeled as real or fake.
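To illustrate how a confusion matrix yields the accuracy and recall figures reported for such models, the following sketch counts the four outcomes by hand and derives the metrics from them. The label lists are invented for illustration, treating "FAKE" as the positive class is an assumption, and the function names are hypothetical; none of this reproduces the paper's results.

```python
def confusion_counts(y_true, y_pred, positive="FAKE"):
    """Return (tp, fp, fn, tn), treating `positive` as the positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # fake article correctly flagged
        elif pred == positive:
            fp += 1  # real article wrongly flagged as fake
        elif truth == positive:
            fn += 1  # fake article missed
        else:
            tn += 1  # real article correctly passed
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Share of all predictions that are correct."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def recall(tp, fn):
    """Share of truly fake articles that were detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def precision(tp, fp):
    """Share of articles flagged as fake that really are fake."""
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical ground truth and predictions (assumed, not from the paper).
y_true = ["FAKE", "FAKE", "FAKE", "REAL", "REAL", "REAL", "REAL", "FAKE"]
y_pred = ["FAKE", "FAKE", "REAL", "REAL", "REAL", "FAKE", "REAL", "FAKE"]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print("confusion matrix [[tp, fn], [fp, tn]]:", [[tp, fn], [fp, tn]])
print("accuracy:", accuracy(tp, fp, fn, tn))   # 0.75
print("recall:", recall(tp, fn))               # 0.75
print("precision:", precision(tp, fp))         # 0.75
```

Accuracy and recall can diverge sharply when the classes are imbalanced, which is one reason to report both alongside the full matrix.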

There are several outstanding challenges in the identification of fake news that researchers must address. Identifying essential aspects involved in the distribution of news, for example, can help to minimize

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

We downloaded the data from Kaggle and from ISOT [ISDDC 2017] [27]. We are grateful for the procedures available on GitHub, which guided various parts of the code. Through them we were able to learn about these current developments and put them to use in this project.

The BERT model is very large because of its training structure. Given its size and the number of weights that need to be updated, training is slow. We therefore used a small data set for it, which we have uploaded on

Aishika Pal was born in 1999 in West Bengal, India. She is completing her undergraduate B-Tech degree in the field of Information Technology from Dr. B C Roy Engineering College, Durgapur. Her research interests include Cyber Security, Machine Learning and Artificial Intelligence.

References (28)

  • Aldwairi M. et al., Detecting Fake News in Social Media Networks (2018)
  • Marr B., Coronavirus fake news: how Facebook, Twitter, and Instagram are tackling the problem
  • Shu K. et al., Fake news detection on social media: a data mining perspective
  • N. Ruchansky, S. Seo, Y. Liu, CSI: A Hybrid Deep Model for Fake News Detection, in: Proceedings of the 2017 ACM on...
  • Okoro E.M. et al., A hybrid approach to fake news detection on social media, Niger. J. Technol. (2018)
  • Kwon Sejeong et al., Prominent features of rumor propagation in online social media
  • Pérez-Rosas V. et al., Automatic detection of fake news (2017)
  • Vosoughi S. et al., The spread of true and false news online, Science (2018)
  • Hua J. et al., Corona virus (covid-19) infodemic and emerging issues through a data lens: the case of China, Int. J. Environ. Res. Public Health (2020)
  • Wadhwa L., Detecting fake political news online (2019)
  • M. Dranik, V. Mesyura, Fake News Detection Using Naive Bayes Classifier. http://dx.doi.org/10.1109/UKRCON.2017.8100379....
  • Tacchini E. et al., Automated fake news detection in social networks (2017)
  • Ahmad I. et al., Fake News Detection using Machine Learning Ensemble Methods (2020)
  • Douglas A., News consumption and the new electronic media, Int. J. Press/Polit. (2006)
Pranav was born in 2000 in Bihar, India. He is completing his undergraduate B-Tech degree in the field of Information Technology from Dr. B C Roy Engineering College, Durgapur. His research interests include Cryptography, Artificial Intelligence, Cyber Security and Data Science.

Moumita Pradhan (Dr.) was born in West Bengal, India. She received her B. Tech in Computer Science and Engineering in 2005 and her M. Tech from NIT Durgapur in 2009. She completed her Ph.D. in Computer Science and Engineering in 2019 from NIT Durgapur, India. Her fields of research are soft computing, swarm intelligence, Machine Learning, Artificial Intelligence, economic load dispatch, and hydrothermal scheduling.
