Abstract
Accurately modelling and forecasting the popularity of emerging topics can help researchers allocate resources and effort to promising research directions. While existing forecasting approaches enjoy varying levels of success, most suffer from at least one of three limitations: a limited scope, because topic terms must be mined from only a few documents; low generalizability, because topics are arbitrarily assigned a binary label of “emerging” or not; or reliance on the historical features of an emerging topic or field of study as the inputs for forecasting its future popularity, which disregards the “cold start” effect. In this paper we propose a forecasting algorithm that addresses all three limitations in three steps. First, we leverage the field of study taxonomy present in most academic databases to obtain a neighbourhood of trending fields within the discipline of the field of study of interest. Then, dynamic time warping is used to measure the similarity between each neighbour’s trending pattern and that of the field of study of interest. Lastly, we conduct multivariate forecasting using an LSTM model, with the historical popularity scores of similar trending neighbours as input. Experimental results on 5 emerging fields of study showcase the “cold start” phenomenon and show that the proposed algorithm reduces RMSE, MAE, and MAPE by half for 4 emerging topics. This validates the claimed limitations of existing methods and provides insight into the dependency structure between emerging topics and their historical features.
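The neighbour-selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the textbook dynamic time warping recurrence with an absolute-difference cost, and the function names (`dtw_distance`, `most_similar_neighbours`), the field names, and the popularity scores are all hypothetical.

```python
import numpy as np


def dtw_distance(a, b):
    """Classic dynamic time warping distance with absolute-difference cost.

    Runs in O(len(a) * len(b)) time; no warping window is applied (assumption).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    # cost[i, j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i, j] = step + min(
                cost[i - 1, j],      # a[i-1] aligned with an already used b[j-1]
                cost[i, j - 1],      # b[j-1] aligned with an already used a[i-1]
                cost[i - 1, j - 1],  # both series advance together
            )
    return float(cost[n, m])


def most_similar_neighbours(target, neighbours, k=2):
    """Return the names of the k neighbours closest to `target` under DTW."""
    ranked = sorted(
        neighbours, key=lambda name: dtw_distance(target, neighbours[name])
    )
    return ranked[:k]


# Hypothetical yearly popularity scores, for illustration only.
target = [1, 2, 4, 8]
neighbours = {
    "field_a": [1, 1, 2, 4, 8],  # same growth pattern, delayed by one step
    "field_b": [8, 4, 2, 1],     # opposite trend
    "field_c": [1, 2, 4, 9],     # near-identical pattern
}
selected = most_similar_neighbours(target, neighbours, k=2)
print(selected)
```

Under these assumptions, the popularity histories of the selected neighbours would then serve as additional input channels for the multivariate LSTM forecaster, alongside whatever short history the emerging field itself has.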
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Chi, Y., Wong, R., Shepherd, J. (2022). Popularity Forecasting for Emerging Research Topics at Its Early Stage of Evolution. In: Chen, W., Yao, L., Cai, T., Pan, S., Shen, T., Li, X. (eds) Advanced Data Mining and Applications. ADMA 2022. Lecture Notes in Computer Science(), vol 13725. Springer, Cham. https://doi.org/10.1007/978-3-031-22064-7_22
DOI: https://doi.org/10.1007/978-3-031-22064-7_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-22063-0
Online ISBN: 978-3-031-22064-7
eBook Packages: Computer Science (R0)