DOI: 10.1145/3529190.3534755
research-article

Social Media vs. News Platforms: A Cross-analysis for Fake News Detection Using Web Scraping and NLP

Published: 11 July 2022

Abstract

Social media platforms are now widely used and have become a popular medium for disseminating news across the globe. While some of these platforms are considered reliable sources of news, others publish information with little validation. The spread of fake news on social media affects people’s behavior and negatively influences their decisions, which became more evident than ever during the COVID-19 outbreak. This has created a demand for research into sophisticated approaches for assessing the integrity of news worldwide. The main objective of this paper is to outline our proposed experimental methodology for detecting and assessing fake news using Data Mining and Natural Language Processing. The method verifies the authenticity of news disseminated on social networks by dividing the process into four stages: news aggregation, publication collection, data analysis, and matching results.
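The last two stages named above, data analysis and matching results, can be illustrated with a minimal sketch. The abstract does not say which text representation or similarity measure the authors use, so the bag-of-words vectors and cosine similarity below are assumptions, and the function names and sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors (Counter objects)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text matches nothing
    return dot / (norm_a * norm_b)


def best_match(post, articles):
    """Return (index, score) of the article most similar to the post.

    Assumes `articles` is a non-empty list of strings.
    """
    post_vec = Counter(tokenize(post))
    scores = [cosine_similarity(post_vec, Counter(tokenize(a))) for a in articles]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Hypothetical inputs: headlines aggregated from news platforms and one
# social media post whose claim is to be checked against them.
articles = [
    "Health agency approves new vaccine for adults",
    "City council votes to expand public transport",
]
post = "new vaccine approved for adults by health agency"

index, score = best_match(post, articles)
print(f"closest article: {index}, similarity: {score:.2f}")
```

A low best score would suggest that no aggregated article supports the post. Any cut-off value for that decision would have to be chosen empirically and is not given in the abstract.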

Cited By

  • (2024) DeepNews: enhancing fake news detection using generative round network (GRN). International Journal of Information Technology 16:7 (4289-4298). DOI: 10.1007/s41870-024-02017-3. Online publication date: 23-Jun-2024

Published In

PETRA '22: Proceedings of the 15th International Conference on PErvasive Technologies Related to Assistive Environments
June 2022
704 pages
ISBN: 9781450396318
DOI: 10.1145/3529190

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

  1. Fake News Detection
  2. Natural Language Processing
  3. Social Media
  4. Web Crawling
  5. Web Scraping

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

PETRA '22
