Emerging topic identification from app reviews via adaptive online biterm topic modeling

Emerging topic identification in app reviews based on an adaptive online biterm topic model

  • Research Article
Frontiers of Information Technology & Electronic Engineering

Abstract

Emerging topics in app reviews highlight the issues (e.g., software bugs) that concern users during certain periods. Identifying emerging topics accurately and promptly can help developers update apps more effectively. Methods for identifying emerging topics in app reviews based on topic models or clustering have been proposed in the literature. However, their accuracy suffers because reviews are short and offer limited information. To solve this problem, an improved emerging topic identification (IETI) approach is proposed in this work. Specifically, we adopt natural language processing techniques to reduce noisy data, and identify emerging topics in app reviews using the adaptive online biterm topic model. We then interpret the meaning of the emerging topics through relevant phrases and sentences. We adopt the official app changelogs as ground truth and evaluate IETI on six common apps. The experimental results indicate that IETI is more accurate than the baseline in identifying emerging topics, improving the F1 score by 0.126 for phrase labels and 0.061 for sentence labels. Finally, we release the code of IETI on GitHub (https://github.com/wanizhou/IETI).
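
Biterm topic models are usually described as working on unordered word pairs ("biterms") that co-occur in the same short text, which is why they suit reviews that are too short for document-level word statistics. The following is a minimal sketch of that biterm extraction step only, not the IETI implementation from the repository above; the function name `extract_biterms` and the sample reviews are hypothetical, and the reviews are assumed to be already tokenized and cleaned.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered word pair (biterm) from one short text."""
    # Sort each pair so ("login", "crash") and ("crash", "login")
    # are counted as the same biterm.
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


# Hypothetical, already preprocessed reviews (illustration only).
reviews = [
    ["app", "crash", "login"],
    ["login", "crash", "update"],
]

# Corpus-level biterm counts: the input a biterm topic model is fitted on.
biterm_counts = Counter(
    biterm for review in reviews for biterm in extract_biterms(review)
)

for biterm, count in biterm_counts.most_common():
    print(biterm, count)
```

A review with n tokens yields n(n-1)/2 biterms, so even very short reviews contribute several co-occurrence observations to the corpus-level counts.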

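Abstract (translated from Chinese)

Emerging topics in app reviews highlight the topics (e.g., software bugs) that concern users during a given period. Identifying emerging topics accurately and promptly helps developers update apps more effectively. Existing work identifies emerging topics in app reviews using topic models or clustering methods. However, because review texts are short and carry limited information, the accuracy of emerging topic identification is low. To address this problem, an improved emerging topic identification method (IETI) is proposed. First, natural language processing techniques are used to reduce noisy data in the review texts; then the adaptive online biterm topic model is used to identify emerging topics in the reviews. Finally, relevant phrases and sentences from the emerging topics are used to interpret their meaning. Official changelogs serve as the evaluation standard for emerging topics, and six common apps are selected to evaluate IETI. The experimental results show that IETI outperforms traditional methods in identifying emerging topics, with F1 score gains of 0.126 for phrase labels and 0.061 for sentence labels. We release the code of IETI on GitHub (https://github.com/wanizhou/IETI).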

Author information

Contributions

Wan ZHOU and Yong WANG designed the research. Wan ZHOU processed the data and drafted the paper. Yong WANG, Cuiyun GAO, and Fei YANG helped organize the paper. Wan ZHOU and Yong WANG revised and finalized the paper.

Corresponding author

Correspondence to Yong Wang (王勇).

Additional information

Compliance with ethics guidelines

Wan ZHOU, Yong WANG, Cuiyun GAO, and Fei YANG declare that they have no conflict of interest.

Project supported by the Anhui Provincial Natural Science Foundation of China (No. 1908085MF183), the National Natural Science Foundation of China (Nos. 62002084 and 61976005), the Training Program for Young and Middle-Aged Top Talents of Anhui Polytechnic University, China (No. 201812), the Zhejiang Provincial Natural Science Foundation of China (No. LQ21F020004), the State Key Laboratory for Novel Software Technology (Nanjing University) Research Program, China (No. KFKT2019B23), the Open Research Fund of Anhui Key Laboratory of Detection Technology and Energy Saving Devices, Anhui Polytechnic University, China (No. DTESD2020B03), and the Stable Support Plan for Colleges and Universities in Shenzhen, China (No. GXWD20201230155427003-20200730101839009).

About this article

Cite this article

Zhou, W., Wang, Y., Gao, C. et al. Emerging topic identification from app reviews via adaptive online biterm topic modeling. Front Inform Technol Electron Eng 23, 678–691 (2022). https://doi.org/10.1631/FITEE.2100465

  • DOI: https://doi.org/10.1631/FITEE.2100465
