Abstract
In this paper, we propose a fusion of two architectures, the self-organizing map and the granular self-organizing map (SOM + GSOM), for the microblog summarization task, in which a set of relevant tweets is extracted from the available set of tweets. SOM is used to reduce the available set of tweets to a smaller subset, and GSOM is used to extract the relevant tweets from that subset. A SOM + SOM fusion is also implemented to illustrate the effectiveness of GSOM over SOM in the second stage. Moreover, a SOM-only version is evaluated to illustrate the benefit of fusion in our proposed approaches. Because similarity/dissimilarity measures play a major role in any summarization system, several measures between tweets are explored: word mover's distance, cosine distance and Euclidean distance. The results are evaluated on four datasets related to disaster events using ROUGE measures. Experimental results demonstrate that our best proposed approach (SOM + GSOM) obtains improvements of 17% and 5.9% in terms of ROUGE-2 and ROUGE-L scores, respectively, over existing techniques. The results are also validated using a statistical significance t-test.
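The two-stage idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes tweets are already represented as numeric vectors, uses a plain one-dimensional SOM in both stages (so it corresponds to the SOM + SOM baseline, since the granular SOM is not specified here), and all function names, parameters and toy vectors are hypothetical.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 1.0
    return 1.0 - dot / (na * nb)


def train_som(vectors, n_neurons, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a 1-D SOM (a chain of neurons) on the given vectors."""
    rng = random.Random(seed)
    # Initialise each neuron's weights from a randomly chosen input vector.
    weights = [list(rng.choice(vectors)) for _ in range(n_neurons)]
    for epoch in range(epochs):
        # Linearly decay the learning rate and the neighbourhood width.
        frac = 1.0 - epoch / epochs
        lr = lr0 * frac
        sigma = max(sigma0 * frac, 1e-3)
        for v in rng.sample(vectors, len(vectors)):
            # Best-matching unit: the neuron closest to the input.
            bmu = min(range(n_neurons), key=lambda j: euclidean(weights[j], v))
            for j in range(n_neurons):
                # Gaussian neighbourhood over the chain index.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
                weights[j] = [w + lr * h * (x - w) for w, x in zip(weights[j], v)]
    return weights


def select_representatives(vectors, weights, dist=euclidean):
    """Return sorted indices of the input vectors closest to each neuron."""
    chosen = set()
    for w in weights:
        chosen.add(min(range(len(vectors)), key=lambda i: dist(vectors[i], w)))
    return sorted(chosen)


# Toy "tweet vectors" (hypothetical; real inputs would be text embeddings).
tweets = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [0.5, 0.5], [0.45, 0.55]]

# Stage 1: reduce the tweet set to at most 4 representatives.
stage1 = select_representatives(tweets, train_som(tweets, n_neurons=4))
reduced = [tweets[i] for i in stage1]

# Stage 2: a second map on the reduced set picks the final summary tweets.
stage2 = select_representatives(reduced, train_som(reduced, n_neurons=2, seed=1))
summary = [stage1[i] for i in stage2]  # indices into the original tweet list
```

Passing `dist=cosine_distance` to `select_representatives` swaps the measure used to pick representatives. Word mover's distance is omitted because it requires word embeddings and an optimal-transport solver.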
Acknowledgements
Dr. Sriparna Saha would like to acknowledge the support of SERB Women in Excellence Award-SB/WEA-08/2017 for conducting this research.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Communicated by V. Loia.
Cite this article
Saini, N., Saha, S., Mansoori, S. et al. Fusion of self-organizing map and granular self-organizing map for microblog summarization. Soft Comput 24, 18699–18711 (2020). https://doi.org/10.1007/s00500-020-05104-2
DOI: https://doi.org/10.1007/s00500-020-05104-2