Robustness and Sensitivity of Network-Based Topic Detection

  • Conference paper
Complex Networks and Their Applications XI (COMPLEX NETWORKS 2016 2022)

Part of the book series: Studies in Computational Intelligence (SCI, volume 1078)

Abstract

In textual analysis, network-based procedures for topic detection are gaining attention as an alternative to classical topic models. These procedures rest on the idea that documents can be represented as word co-occurrence networks, in which topics are defined as groups of strongly connected words. Although many works have used network-based procedures for topic detection, there is no systematic analysis of how different design choices, such as the construction of the word co-occurrence matrix and the selection of the community detection algorithm, affect the detected topics. In this work, we present the results obtained by analysing a widely used corpus of news articles, showing how and to what extent the choices made during the design phase affect the results.
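The idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy documents, the function names `cooccurrence_counts` and `word_groups`, and the `min_weight` threshold are all assumptions made for illustration, and connected components of the thresholded network are used as a deliberately crude stand-in for a real community detection algorithm.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(docs):
    """For each unordered word pair, count the documents containing both words."""
    counts = Counter()
    for doc in docs:
        # Sorting makes every pair key canonical: (smaller, larger).
        words = sorted(set(doc.lower().split()))
        counts.update(combinations(words, 2))
    return counts


def word_groups(counts, min_weight):
    """Keep edges with weight >= min_weight and return connected components.

    Connected components are only a stand-in for community detection;
    they keep the sketch short and dependency-free.
    """
    adjacency = {}
    for (u, v), weight in counts.items():
        if weight >= min_weight:
            adjacency.setdefault(u, set()).add(v)
            adjacency.setdefault(v, set()).add(u)

    groups, seen = [], set()
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        groups.append(sorted(component))
    return groups


# Hypothetical toy corpus, not the news corpus analysed in the paper.
docs = [
    "stock market shares fall",
    "stock market shares rise",
    "football match goal",
    "football match referee",
    "market match",
]

counts = cooccurrence_counts(docs)
loose = word_groups(counts, min_weight=1)   # every co-occurrence is an edge
strict = word_groups(counts, min_weight=2)  # only repeated co-occurrences

print(len(loose), strict)
```

With `min_weight=1` a single bridging document merges all words into one group, while `min_weight=2` separates a finance group from a football group. Even in this toy setting, one choice in how the co-occurrence network is built changes the detected topics.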

Acknowledgements

The authors acknowledge the financial support provided by the “Dipartimenti Eccellenti 2018–2022” ministerial funds. This work has also been partly funded by eSSENCE, an e-Science collaboration funded as a strategic research area of Sweden, and by EU CEF grant number 2394203 (NORDIS—NORdic observatory for digital media and information DISorder).

Corresponding author

Correspondence to Carla Galluccio.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Galluccio, C., Magnani, M., Vega, D., Ragozini, G., Petrucci, A. (2023). Robustness and Sensitivity of Network-Based Topic Detection. In: Cherifi, H., Mantegna, R.N., Rocha, L.M., Cherifi, C., Micciche, S. (eds) Complex Networks and Their Applications XI. COMPLEX NETWORKS 2016 2022. Studies in Computational Intelligence, vol 1078. Springer, Cham. https://doi.org/10.1007/978-3-031-21131-7_20

  • DOI: https://doi.org/10.1007/978-3-031-21131-7_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-21130-0

  • Online ISBN: 978-3-031-21131-7

  • eBook Packages: Engineering, Engineering (R0)
