Abstract
Traditional hot topic detection algorithms do not perform optimally on microblogs because of inherent weaknesses in three areas: constructing a short-text representation model, running the core algorithm on a large corpus in limited time, and evaluating algorithm quality while hot topics are being detected. In this paper, a novel method for detecting hot topics in microblogs is presented. The approach uses a probabilistic correlation-based representation measure to obtain a dense, low-dimensional microblog representation matrix. We then treat clustering as an optimization problem and introduce discrete particle swarm optimization (DPSO) to simplify the clustering process used to detect topics. Furthermore, a clustering quality evaluation criterion is adopted as the optimization objective function for topic detection, so that the quality of the result can be evaluated after each iteration. Experimental results on corpora containing more than 148,000 tweets show that our algorithm is an effective hot topic detection method for microblogs.
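The idea of treating clustering as a discrete optimization problem can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the update rule or the quality criterion, so the copy-from-best update, the parameters `p_personal`, `p_global`, and `p_mutate`, and the negative within-cluster scatter used as `quality` are all assumptions chosen for brevity. Each particle is a vector of cluster labels, and the objective is re-evaluated after every move.

```python
import random


def quality(points, labels, k):
    """Stand-in objective (assumption): negative mean squared distance
    of each point to its cluster centroid. Higher is better."""
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, c in zip(points, labels):
        counts[c] += 1
        for d in range(dim):
            sums[c][d] += p[d]
    # Empty clusters get no centroid; they are never looked up below.
    centroids = [
        [s / counts[c] for s in sums[c]] if counts[c] else None
        for c in range(k)
    ]
    total = 0.0
    for p, c in zip(points, labels):
        total += sum((a - b) ** 2 for a, b in zip(p, centroids[c]))
    return -total / len(points)


def dpso_cluster(points, k, n_particles=20, n_iters=100,
                 p_personal=0.3, p_global=0.4, p_mutate=0.02, seed=0):
    """Hypothetical discrete PSO: each particle is a list of cluster labels."""
    rng = random.Random(seed)
    n = len(points)
    swarm = [[rng.randrange(k) for _ in range(n)] for _ in range(n_particles)]
    pbest = [list(s) for s in swarm]
    pbest_fit = [quality(points, s, k) for s in swarm]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = list(pbest[g]), pbest_fit[g]

    for _ in range(n_iters):
        for i, particle in enumerate(swarm):
            for j in range(n):
                r = rng.random()
                if r < p_global:
                    # Move toward the best labelling found by the swarm.
                    particle[j] = gbest[j]
                elif r < p_global + p_personal:
                    # Move toward this particle's own best labelling.
                    particle[j] = pbest[i][j]
                if rng.random() < p_mutate:
                    # Occasional random reassignment keeps diversity.
                    particle[j] = rng.randrange(k)
            # The objective doubles as a per-iteration quality check.
            fit = quality(points, particle, k)
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = list(particle), fit
                if fit > gbest_fit:
                    gbest, gbest_fit = list(particle), fit
    return gbest, gbest_fit


if __name__ == "__main__":
    # Toy 2-D vectors standing in for dense microblog representations.
    docs = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
            (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
    labels, score = dpso_cluster(docs, k=2)
    print(labels, score)
```

Because the fitness function is the clustering quality measure itself, the best score seen so far is available at every iteration without a separate evaluation pass, which is the property the abstract highlights.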
Acknowledgement
This work is supported by the National Natural Science Foundation of China (No. 61363058), the Youth Science and Technology Support Program of Gansu Province (145RJZA232, 145RJYA259), the 2016 undergraduate innovation capacity enhancement program, and the 2016 annual public record open space Fund Project 1505JTCA007.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Ma, H., Ji, Y., Li, X., Zhou, R. (2016). A Microblog Hot Topic Detection Algorithm Based on Discrete Particle Swarm Optimization. In: Booth, R., Zhang, ML. (eds) PRICAI 2016: Trends in Artificial Intelligence. PRICAI 2016. Lecture Notes in Computer Science, vol 9810. Springer, Cham. https://doi.org/10.1007/978-3-319-42911-3_23
DOI: https://doi.org/10.1007/978-3-319-42911-3_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-42910-6
Online ISBN: 978-3-319-42911-3
eBook Packages: Computer Science (R0)