A semantic graph-based keyword extraction model using ranking method on big social data

Wireless Networks

Abstract

Identifying influential nodes is essential for controlling Online Social Networks. It is a crucial task with many real-time applications, such as information retrieval and recommendation, automatic keyword indexing, viral marketing, automatic classification, and filtering. Existing graph-based methods for keyword extraction consider only numeric measures such as centrality, degree, and betweenness. This research proposes an effective method, the Semantic graph-based Keyword Extraction Method (SKEM), which extracts keywords from Twitter using ranking methods. In the proposed model, exhaustive preprocessing is carried out and a semantic graph is constructed. Numeric graph metrics are then used to weight the nodes of the semantic graph. The PageRank algorithm is applied to rank the nodes, and the top ten nodes, which are found to be highly relevant, are taken to represent the most influential nodes. Combining semantic and numeric graph metrics improves the quality of the extracted keywords, while the extensive preprocessing improves the quality of the input and reduces its noise. The keywords extracted by the proposed model are more relevant and meaningful. The performance of SKEM is validated on real-time tweets obtained through the Twitter API. The experimental results confirm that the proposed method achieves high performance in terms of precision, recall, and F-measure.
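
The pipeline summarized in the abstract (preprocess, build a graph, weight the nodes with a graph metric, rank with PageRank, keep the top ten) can be illustrated with a minimal Python sketch. This is not the authors' SKEM implementation: the abstract does not specify how the semantic graph is built or which metrics weight the nodes. The sketch assumes a plain word co-occurrence graph in place of the semantic graph, uses each node's degree (number of distinct neighbours) as its weight, and feeds that weight into PageRank as the teleport distribution. All function names, the stopword list, and the window size are hypothetical.

```python
from collections import defaultdict

# Tiny illustrative stopword list (assumption; a real system would use a fuller one).
STOPWORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "on",
             "and", "for", "from", "rt"}

def preprocess(tweet):
    """Lowercase, drop URLs and mentions, strip punctuation and stopwords."""
    tokens = []
    for raw in tweet.lower().split():
        if raw.startswith(("http", "@")):
            continue
        word = "".join(ch for ch in raw if ch.isalnum())
        if word and word not in STOPWORDS:
            tokens.append(word)
    return tokens

def build_graph(tweets, window=2):
    """Undirected co-occurrence graph: adj[u][v] counts how often u and v
    appear within `window` tokens of each other (stand-in for a semantic graph)."""
    adj = defaultdict(lambda: defaultdict(float))
    for tweet in tweets:
        tokens = preprocess(tweet)
        for i, u in enumerate(tokens):
            for v in tokens[i + 1 : i + 1 + window]:
                if u != v:
                    adj[u][v] += 1.0
                    adj[v][u] += 1.0
    return adj

def pagerank(adj, node_weight, damping=0.85, iterations=50):
    """Power-iteration PageRank whose teleport vector is proportional to node_weight."""
    nodes = list(adj)
    total_weight = sum(node_weight[n] for n in nodes)
    teleport = {n: node_weight[n] / total_weight for n in nodes}
    out_sum = {n: sum(adj[n].values()) for n in nodes}
    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            # Rank flowing into n from each neighbour m, split by edge weight.
            incoming = sum(rank[m] * adj[m][n] / out_sum[m] for m in adj[n])
            new_rank[n] = (1 - damping) * teleport[n] + damping * incoming
        rank = new_rank
    return rank

def extract_keywords(tweets, top_k=10):
    """Return the top_k (word, score) pairs, highest score first."""
    adj = build_graph(tweets)
    degree = {n: float(len(adj[n])) for n in adj}  # numeric graph metric used as node weight
    rank = pagerank(adj, degree)
    ordered = sorted(rank.items(), key=lambda kv: (-kv[1], kv[0]))
    return ordered[:top_k]

sample_tweets = [
    "Graph ranking finds keywords",
    "Graph ranking for tweets",
    "Semantic graph model",
    "Graph keywords from tweets",
]
keywords = extract_keywords(sample_tweets)
```

In this sketch the node weights influence the ranking only through the teleport step; weighting the edges or combining several centrality measures would be equally plausible readings of the abstract.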



Author information

Corresponding author

Correspondence to V. Subramaniyaswamy.


Cite this article

Devika, R., Subramaniyaswamy, V. A semantic graph-based keyword extraction model using ranking method on big social data. Wireless Netw 27, 5447–5459 (2021). https://doi.org/10.1007/s11276-019-02128-x
