ABSTRACT
The arrival of the internet in the late twentieth century, followed by social media in the twenty-first, greatly increased the hazards of misinformation, disinformation, propaganda, and hoaxes. New styles of news writing have emerged that insert bias subtly, without turning the report into outright fake news. Factually correct news is often manipulated to benefit a person, a group of individuals, a political party, or other interests, or altered to shift its sentiment or prominence. Sanitizing such news content before presenting it to the reader is a challenging task. In this paper, we deal with problematic English news sentences, which we define as Septic sentences. We identify Septic sentences and their Septic phrases using machine learning algorithms. Sanitization is the process of converting a Septic sentence into a Pure sentence, and we illustrate it with the help of paraphrasing. The model is able to Sanitize 76% of Septic sentences.
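The identify-then-paraphrase pipeline described above can be pictured with a minimal sketch. Everything in it is an assumption made for illustration, not the paper's method: the `SEPTIC_PHRASES` lexicon stands in for the trained machine learning classifier, `toy_paraphrase` stands in for a neural paraphraser, and counting a sentence as Sanitized when its paraphrase is no longer flagged is only one plausible way to arrive at a figure such as the reported 76%.

```python
import re
from typing import Callable, Iterable, List

# Hypothetical lexicon mapping loaded phrases to neutral replacements.
# Assumption: the paper uses trained ML models, not a fixed word list.
SEPTIC_PHRASES = {
    "so-called": "",
    "slammed": "criticized",
    "shocking": "",
    "regime": "government",
}


def _pattern(phrase: str) -> str:
    """Whole-word regex for a phrase."""
    return r"\b" + re.escape(phrase) + r"\b"


def find_septic_phrases(sentence: str) -> List[str]:
    """Return the lexicon phrases that occur in the sentence."""
    return [
        p for p in SEPTIC_PHRASES
        if re.search(_pattern(p), sentence, flags=re.IGNORECASE)
    ]


def is_septic(sentence: str) -> bool:
    """A sentence is treated as Septic if it contains any flagged phrase."""
    return bool(find_septic_phrases(sentence))


def toy_paraphrase(sentence: str) -> str:
    """Stand-in paraphraser: swap each flagged phrase for its neutral form."""
    for phrase, neutral in SEPTIC_PHRASES.items():
        sentence = re.sub(_pattern(phrase), neutral, sentence, flags=re.IGNORECASE)
    # Collapse the extra spaces left behind by deleted phrases.
    return " ".join(sentence.split())


def sanitization_rate(
    sentences: Iterable[str], paraphrase: Callable[[str], str]
) -> float:
    """Fraction of Septic sentences whose paraphrase is no longer Septic."""
    septic = [s for s in sentences if is_septic(s)]
    if not septic:
        return 0.0
    cleaned = sum(1 for s in septic if not is_septic(paraphrase(s)))
    return cleaned / len(septic)


if __name__ == "__main__":
    corpus = [
        "The minister slammed the shocking proposal.",
        "The council approved the budget on Monday.",
    ]
    for s in corpus:
        print(s, "->", toy_paraphrase(s) if is_septic(s) else "(already Pure)")
    print("rate:", sanitization_rate(corpus, toy_paraphrase))
```

Passing the paraphraser in as a function keeps the rate computation independent of how the paraphrase is produced, so a real paraphrasing model could be substituted for `toy_paraphrase` without changing the rest of the sketch.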
Index Terms
- Sanitization of Sepsis News Sentences with the help of Paraphrasing