Abstract
Detecting suicidal ideation on communication platforms such as social media is critical for suicide prevention, because these platforms are frequently used for emotional expression and can reflect significant behavior changes. Many machine learning and deep learning techniques have been employed to address this problem, using embedding methods such as Count Vector, Term Frequency-Inverse Document Frequency, Bidirectional Encoder Representations from Transformers, and Multilingual Universal Sentence Encoder, all of which generate high-dimensional vectors. Feeding word embeddings directly into models can introduce noise and outliers, which may reduce predictive accuracy. Feature selection that optimizes the dimensionality of word embedding vectors has therefore emerged as a promising research direction. This study proposes a feature selection method called Propose Best Feature Selection (PBFS), which combines Grey Wolf Optimization, Recursive Feature Elimination, and Stepwise Feature Selection. It uses a Voting Classifier to identify and retain the most significant features, reducing dimensionality. The optimized features are then fed into a stacked ensemble hybrid model in which a Bi-Directional Gated Recurrent Unit with Attention and a Convolutional Neural Network act as base classifiers and Extreme Gradient Boosting acts as the meta-classifier. The model achieves an accuracy of 98% on the Reddit dataset and 97% on the Twitter (X) dataset, outperforming similar methods in the field. This work focuses on textual data; future efforts may expand to multimodal analysis, incorporating image-based emotional cues. Scalability to large datasets and real-time applications remains a key limitation.
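The abstract does not spell out how the outputs of the three selectors are combined, so the following is only a minimal sketch of one plausible reading: each selector produces a boolean mask over the embedding dimensions, and a feature is kept when a majority of selectors chose it. The function names, the hard-coded masks, and the strict-majority threshold are illustrative assumptions, not the authors' implementation; in practice the masks would come from actual Grey Wolf Optimization, Recursive Feature Elimination, and stepwise selection runs.

```python
from typing import List, Optional, Sequence


def majority_vote_mask(
    masks: Sequence[Sequence[bool]], min_votes: Optional[int] = None
) -> List[bool]:
    """Keep a feature when at least `min_votes` selectors chose it.

    Assumption: simple vote counting; the paper's voting step may differ.
    """
    if not masks:
        raise ValueError("need at least one selector mask")
    n_features = len(masks[0])
    if any(len(m) != n_features for m in masks):
        raise ValueError("all masks must cover the same features")
    if min_votes is None:
        min_votes = len(masks) // 2 + 1  # strict majority of selectors
    # Count, per feature, how many selectors marked it as selected.
    votes = [sum(bool(m[j]) for m in masks) for j in range(n_features)]
    return [v >= min_votes for v in votes]


def apply_mask(row: Sequence[float], mask: Sequence[bool]) -> List[float]:
    """Drop the embedding dimensions that were voted out."""
    return [x for x, keep in zip(row, mask) if keep]


# Hypothetical selector outputs over a toy 6-dimensional embedding.
gwo_mask = [True, False, True, True, False, False]
rfe_mask = [True, True, False, True, False, False]
stepwise_mask = [False, True, True, True, False, True]

selected = majority_vote_mask([gwo_mask, rfe_mask, stepwise_mask])
reduced = apply_mask([0.1, 0.2, 0.3, 0.4, 0.5, 0.6], selected)
```

With three selectors the default threshold is two votes, so a dimension survives if any two methods agree on it; passing `min_votes=3` would instead keep only the features all three methods select.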















Data Availability
The datasets are available on Kaggle and GitHub. Kaggle (Reddit dataset): https://www.kaggle.com/datasets/nikhileswarkomati/suicide-watch GitHub (Twitter dataset): https://github.com/laxmimerit/twitter-suicidal-intention-dataset
Funding
No funding was received for conducting this study.
Author information
Contributions
Shiv Shankar Prasad Shukla and Maheshwari Prasad Singh conceived the idea of using Voting Classifier optimization (PBFS) for optimal feature selection, together with an advanced embedding method such as BERT and a hybrid model classifier, to detect behaviour changes from social media posts. Shiv Shankar Prasad Shukla carried out the experiments and wrote the initial draft. Maheshwari Prasad Singh corrected the initial draft. All authors reviewed the manuscript.
Ethics declarations
Competing interests
The authors have no competing interests to declare that are relevant to the content of this article.
Ethical Approval
Not applicable
About this article
Cite this article
Shukla, S.S.P., Singh, M.P. Enhancing suicidal ideation detection through advanced feature selection and stacked deep learning models. Appl Intell 55, 303 (2025). https://doi.org/10.1007/s10489-025-06256-0
DOI: https://doi.org/10.1007/s10489-025-06256-0