Abstract
As accumulated data and Internet usage grow, social media platforms such as Twitter have become a fundamental tool for accessing all kinds of information. Processing and preparing Twitter data, and eliminating unnecessary information from it, is therefore rapidly gaining importance. In particular, it is very important to analyze this information and make it available in emergencies such as disasters. This study analyzes in detail the earthquake of magnitude Mw = 6.8 on the Richter scale that occurred on January 24, 2020, in Elazig province, Turkey. Tweets under twelve hashtags are clustered separately using the Social Spider Optimization (SSO) algorithm with some modifications. The sum of intra-cluster distances (SICD) is used to measure the performance of the proposed clustering algorithm. In addition, SICD, which works by assigning a new solution to its nearest node, is formulated as an integer programming model and solved with the GUROBI solver package on the test data sets. The optimal results obtained this way are compared with the results of the proposed SSO. In the study, the modified SSO finds center tweets with optimal results. Moreover, the results of the proposed SSO algorithm are compared with those of K-means, the most popular clustering technique, and the proposed SSO algorithm gives better results. From these results, the general situation of society after an earthquake is deduced in order to provide moral and material support.
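The SICD measure named in the abstract can be illustrated with a minimal sketch. It assumes Euclidean distance and small numeric vectors standing in for tweet representations; the abstract does not state the distance metric or the feature representation, and the function name `sicd` and the toy data are hypothetical, not taken from the study.

```python
import math


def sicd(points, centers):
    """Sum of intra-cluster distances under nearest-center assignment.

    Assumption: Euclidean distance. The study's actual metric is not
    stated in the abstract.
    """
    total = 0.0
    labels = []
    for p in points:
        # Distance from this point to every candidate center.
        dists = [math.dist(p, c) for c in centers]
        # Assign the point to its nearest center.
        nearest = min(range(len(centers)), key=dists.__getitem__)
        labels.append(nearest)
        total += dists[nearest]
    return total, labels


# Toy data: four 2-D points, with two of them chosen as cluster centers.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 4.0), (4.0, 5.0)]
centers = [(0.0, 0.0), (4.0, 4.0)]
total, labels = sicd(points, centers)
print(total, labels)
```

A lower SICD for the same number of clusters indicates tighter clusters, so a value like this could be used to compare the centers returned by different clustering methods on the same data.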










Notes
UCI Machine Learning Repository, http://archive.ics.uci.edu/ml/datasets.php.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Eligüzel, N., Çetinkaya, C. & Dereli, T. A state-of-art optimization method for analyzing the tweets of earthquake-prone region. Neural Comput & Applic 33, 14687–14705 (2021). https://doi.org/10.1007/s00521-021-06109-0
DOI: https://doi.org/10.1007/s00521-021-06109-0