Abstract
Natural language processing is one of the most active areas of study and has produced a large body of research in the modern era. Multilingual toxic comment classification can greatly benefit social media, where comments, tweets, and similar posts can be analyzed once the topic is known to the system. This would help prevent false commenting and support better interpretation and analysis of miscommunication that occurs on social media platforms over a given issue. In this work, multilingual toxic comment classification is framed as the analysis of a hypothesis sentence with respect to a given premise. The classification has three categories: the hypothesis can be an entailment of the premise, neutral to it, or contradictory to it. Natural language inference, one of the most widely studied problems in natural language processing, determines how two statements, the premise and the hypothesis, are related to each other. The paper therefore evaluates several models, namely CBOW, ESIM, BiLSTM, and a fine-tuned XLM-RoBERTa model, for predicting the relationship between two statements. The prediction determines whether the given hypothesis stands in an entailment, neutral, or contradictory relation to the given premise. The paper presents a study of various algorithms that can be used to solve the natural language inference problem. After analysis, the paper also proposes a model that obtained an accuracy of 95.35%, with ROC scores of 0.9629 for the entailment relationship, 0.97076 for the neutral relationship, and 0.9797 for the contradictory relationship between the sentence pairs.
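The abstract reports one accuracy figure and a separate ROC score for each of the three relations. As a minimal sketch of how such per-class scores could be computed, the Python below assumes a one-vs-rest reading of the ROC score (the abstract does not state how the scores were obtained) and uses made-up toy probabilities. The names `accuracy`, `binary_auc`, `one_vs_rest_auc`, and the label order are illustrative and do not come from the paper.

```python
from typing import Dict, Sequence

# Assumed label order; the paper does not specify an encoding.
LABELS = ("entailment", "neutral", "contradiction")


def accuracy(y_true: Sequence[int], probs: Sequence[Sequence[float]]) -> float:
    """Fraction of pairs whose highest-scoring class equals the gold label."""
    hits = 0
    for gold, p in zip(y_true, probs):
        predicted = max(range(len(p)), key=p.__getitem__)
        hits += int(predicted == gold)
    return hits / len(y_true)


def binary_auc(is_positive: Sequence[bool], scores: Sequence[float]) -> float:
    """ROC AUC as the chance that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for s, flag in zip(scores, is_positive) if flag]
    neg = [s for s, flag in zip(scores, is_positive) if not flag]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(
    y_true: Sequence[int], probs: Sequence[Sequence[float]]
) -> Dict[str, float]:
    """One AUC per relation: that class against the other two combined."""
    return {
        name: binary_auc([g == k for g in y_true], [p[k] for p in probs])
        for k, name in enumerate(LABELS)
    }


# Toy gold labels and class probabilities (hypothetical, not paper data).
y_true = [0, 1, 2, 0, 1, 2]
probs = [
    [0.8, 0.1, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.2, 0.7],
    [0.4, 0.5, 0.1],  # gold is entailment, predicted neutral
    [0.3, 0.4, 0.3],
    [0.2, 0.1, 0.7],
]

acc = accuracy(y_true, probs)
aucs = one_vs_rest_auc(y_true, probs)
print(f"accuracy = {acc:.4f}")
for name, value in aucs.items():
    print(f"ROC AUC ({name} vs rest) = {value:.4f}")
```

Scoring each relation against the other two combined is one common way to obtain a separate ROC value per class in a three-way task, which matches the shape of the numbers quoted in the abstract; other averaging schemes would give a single combined figure instead.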
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Jhunthra, S., Garg, H., Gupta, V. (2023). Identifying the Relationship Between Hypothesis and Premise. In: Santosh, K., Goyal, A., Aouada, D., Makkar, A., Chiang, YY., Singh, S.K. (eds) Recent Trends in Image Processing and Pattern Recognition. RTIP2R 2022. Communications in Computer and Information Science, vol 1704. Springer, Cham. https://doi.org/10.1007/978-3-031-23599-3_29
DOI: https://doi.org/10.1007/978-3-031-23599-3_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-23598-6
Online ISBN: 978-3-031-23599-3
eBook Packages: Computer Science, Computer Science (R0)