Elsevier

Information Sciences

Volume 187, 15 March 2012, Pages 15-32

A time-varying propagation model of hot topic on BBS sites and Blog networks

https://doi.org/10.1016/j.ins.2011.09.025

Abstract

Modeling the propagation of hot online topics is a prerequisite for predicting their trends. We propose a time-varying hot topic propagation model for online discussion, based on the collective behavior of users in different social subgroups on blog networks and bulletin board system (BBS) sites. By analyzing the stability of the model’s equilibrium, we identify a threshold that acts as the watershed for the trend of a hot online topic, and we derive two theorems that give sufficient conditions under which the topic will die out or remain uniformly weakly persistent. Building on the model and theorems, we propose two methods for predicting the trend of hot online topics. Method (I) serves mainly theoretical research: it predicts the long-term trend of a single-peak hot online topic from the thresholds in the theorems. Method (II) is designed for application: with the help of Method (I), it predicts the number of users writing or commenting on article posts about multi-peak and single-peak hot online topics over the following two days. Experiments with both methods are performed on widely discussed topics on the Sina Blog, the well-known Liang Quan Qi Mei (LQQM) BBS, and the Xi’an Jiaotong University (BMY) BBS in China. The experimental results show that our methods predict the trend of hot online topics efficiently for both theoretical and practical purposes, and reduce the computational complexity. Hence, our model can serve as a basis for predicting trends in hot online topic propagation.

Introduction

With the rapid development of Internet technology, particularly the emergence of Web 2.0 sites [20] such as blog networks, BBS sites and social networks, people are not only exchanging information online but also expressing their ideas and opinions openly. Blogs and BBS sites are two kinds of Web 2.0 sites that are extremely popular in China, much like Twitter and Facebook in Europe and the US. A BBS site essentially serves as an electronic information center and an emerging medium, offering public information releases, chat, mail and other services as an alternative to a physical bulletin board. In 2005, there were at least 123 BBS sites at 78 universities and colleges in China, with several million registered users. Popular BBS sites in China include the LQQM BBS established at Tsing University and the BMY BBS established at Xi’an Jiaotong University. The Sina blog in China has almost 230 million users worldwide. In recent years, many people have taken to expressing their opinions about hot topics on blogs and BBS sites. For example, “08 Olympic” and “Melamine contaminated milk powder incident” are both hot topics on the BMY BBS and LQQM BBS, whereas “Roh Moo-Hyun’s death” and “Influenza A[H1N1]” are hot topics on the Sina blog.

Online topics undeniably play a significant role in public life. To understand how different hot online topics evolve, it is necessary to study their propagation mechanism.

In theory, if we could analyze the behavior of every individual user participating in online discussion, we might be able to explain clearly the trend in all users’ behavior toward a hot online topic. However, information about users’ behavior is embedded in a vast volume of network data; for example, mining web information from the Sina blog produces several million MB of data per day because the site is updated daily, so this approach is very time-consuming and not feasible. Alternatively, we can model the propagation of hot online topics by studying the collective behavior of user groups. Understanding users’ collective behavior helps us to establish the propagation process of hot online topics and to tackle the disorientation problem in modeling.

In this paper, we propose a time-varying hot topic propagation model (THTPM for short) on BBS sites or blogs to describe the collective behavior of users in different online social subgroups. The greatest merit of our model is that it reflects users’ state transition process efficiently and does not depend on any empirical parameters. To obtain the threshold by which we can prejudge the trend of a hot topic discussion, we analyze the stability of the equilibrium of THTPM and derive two theorems. They state that when the threshold Q < 1 the hot topic will die out, and when Q > 1 the hot topic will remain uniformly weakly persistent. Furthermore, based on our model we propose Method (I) and Method (II) to predict the trend of hot online topics, for theoretical and practical purposes respectively. Method (I) predicts the trend of single-peak hot topics, mainly for theoretical purposes. In real Internet settings, however, there are many multi-peak hot topics, so we design Method (II) to predict the number of discussers (users writing or commenting on article posts) over the following two days, mainly for multi-peak hot online topics, although it can also predict the trend of single-peak hot topics.

Our main contribution is that we put forward, for the first time, a novel time-varying state model to depict the dissemination mechanism of hot online topics. We also bring forward the first universal trend prediction theory for hot topics based on moving time windows, which can be applied effectively to distinct hot topics. Our results show that the proposed prediction method is valid for both single-peak and multi-peak hot topics.

The rest of the paper is organized as follows. Section 2 discusses related work. Section 3 formulates the problem. Section 4 defines the dynamic propagation model of hot online topics. In Section 5, we analyze the stability of the equilibrium of the topic in system (2.1). Experiments are discussed in Section 6. Section 7 presents the conclusions and future work.

Section snippets

Related work

To the best of our knowledge, scholarly interest in social media analysis has increased over the past several years with the growing use of tools such as weblogs. Many earlier studies addressed different aspects of the following five stages in blog networks, from topic data acquisition to propagation modeling of hot online topics. First, semantic analysis was used to develop data mining and topic detection techniques for studying online topics at their inception. Zheng [29] proposed a document

Problem formulations

We now formalize the problem of modeling the propagation of hot online topics.

To begin, we denote the target hot topic by TD, the other topics in the same category as the target hot topic by TR, and the topic category by T.

User subset D = {dk(t)} (1 ≤ k ≤ m, m < n) denotes the Discussed Group, where dk(t) represents an individual who participates in writing or commenting on article posts about the target hot online topic TD at time t (for convenience, we regard dk(t) as a discusser of the topic);

Model definitions

We now introduce the time-varying state equations that describe the transmission mechanism of hot online topics.

From Fig. 3.1, it is reasonable to suppose that each user in the exited group E(t) loses interest in writing or commenting on article posts about any topic in the same topic category T as the target hot topic TD. Based on the previous discussion and Fig. 3.1, we then establish the following state equations:

$$\frac{dR}{dt} = A - \beta R D - d\,R, \qquad \frac{dD}{dt} = B D + \beta R D - \gamma D, \qquad \frac{dE}{dt} = \gamma D$$

We describe all parameters as follows.
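The parameter descriptions are not reproduced in this excerpt, so the following is only a minimal sketch of how the three state equations could be integrated numerically. It assumes that A, β, d, B and γ are constants (the paper's model lets them vary with time), reads the symbol d in the first equation as a rate constant, and uses a classical Runge-Kutta step as the integrator; the function names and all numeric values are hypothetical choices made for illustration, not taken from the paper.

```python
# Sketch only: constant-parameter version of the three state equations.
# All numeric values below are hypothetical and chosen for illustration.

def rhs(state, params):
    """Right-hand side of dR/dt, dD/dt, dE/dt for constant parameters."""
    R, D, E = state
    A, beta, d, B, gamma = params
    dR = A - beta * R * D - d * R
    dD = B * D + beta * R * D - gamma * D
    dE = gamma * D
    return (dR, dD, dE)


def rk4_step(state, params, h):
    """Advance the state by one classical Runge-Kutta step of size h."""
    def shifted(k, scale):
        return tuple(s + scale * ki for s, ki in zip(state, k))

    k1 = rhs(state, params)
    k2 = rhs(shifted(k1, 0.5 * h), params)
    k3 = rhs(shifted(k2, 0.5 * h), params)
    k4 = rhs(shifted(k3, h), params)
    return tuple(
        s + h / 6.0 * (a + 2.0 * b + 2.0 * c + e)
        for s, a, b, c, e in zip(state, k1, k2, k3, k4)
    )


def simulate(state0, params, h, steps):
    """Return the list of (R, D, E) states, including the initial one."""
    trajectory = [tuple(state0)]
    for _ in range(steps):
        trajectory.append(rk4_step(trajectory[-1], params, h))
    return trajectory


# Hypothetical parameters (A, beta, d, B, gamma) and initial state (R, D, E).
params = (1.0, 0.002, 0.01, 0.0, 0.5)
trajectory = simulate((100.0, 10.0, 0.0), params, h=0.1, steps=500)
R_end, D_end, E_end = trajectory[-1]
print(f"after t=50: R={R_end:.3f}, D={D_end:.6f}, E={E_end:.3f}")
```

With these illustrative values the growth rate of D, B + βR − γ, stays negative, so the discussed group shrinks toward zero; raising β to 0.01 while keeping everything else fixed makes D grow at first. A time-varying version would replace the constant tuple with functions of t evaluated inside `rhs`.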

Analysis of hot online topics propagation model

To develop a proper method for predicting the trend of hot online topics with our time-varying dynamic model, we need to understand the stability of the model’s equilibrium. We therefore first analyze the stability of the equilibrium of the constant dynamic propagation model (CDPM), in which all the parameters are constant. This helps us to understand the stability of the equilibrium of our time-varying hot topic propagation model (THTPM). The
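As a hedged illustration of what an equilibrium analysis of the constant-parameter model involves, the derivation below starts from the three state equations of the model definition and treats A, β, d, B and γ as positive constants. The symbols R* and Q_0 are names introduced here for illustration only; Q_0 is not claimed to coincide with the threshold Q of the paper's theorems, whose definition is not reproduced in this excerpt.

```latex
% State equations with constant parameters (d is a rate constant):
%   dR/dt = A - \beta R D - d R,  dD/dt = B D + \beta R D - \gamma D,  dE/dt = \gamma D
% E does not feed back into R or D, so the first two equations suffice.

% Topic-free equilibrium: set D = 0 and dR/dt = 0.
A - d\,R^{*} = 0 \quad\Longrightarrow\quad R^{*} = \frac{A}{d}

% Linearize dD/dt around (R^{*}, 0), keeping terms of first order in D:
\frac{dD}{dt} \approx \left(B + \beta R^{*} - \gamma\right) D
  = \gamma \left(Q_{0} - 1\right) D,
\qquad Q_{0} := \frac{B + \beta A / d}{\gamma}

% Q_0 < 1: a small discussed group D decays near the topic-free equilibrium.
% Q_0 > 1: a small discussed group D grows near the topic-free equilibrium.
```

This local argument only describes behavior close to the topic-free equilibrium; global statements such as extinction or uniform weak persistence require the fuller analysis that the paper carries out.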

Experiments

For both theoretical and practical purposes, we devise two methods. Method (I) validates the model mainly for theoretical purposes and verifies the validity of our theorems: we illustrate how to establish the threshold Q or Q and how to prejudge the trend of a hot online topic by the threshold. Method (II) is designed to overcome the shortcomings of Method (I) so that it can be applied in practice to predict the trend of single-peak as well as multi-peak hot topics.

Conclusions and future work

The propagation of hot online topics is correlated with the collective behavior of different user groups on a blog network or BBS site. Thus, by constructing a dynamic propagation model consisting of time-varying state equations for the different user groups, much as in epidemic modeling, we can approximate the actual hot-topic propagation process on a blog network or BBS site. This time-varying dynamic model (THTPM) is the first proposed to describe the propagation of hot online topics. Furthermore, Theorem 5.1,

References (30)

  • O. Diekmann et al., Mathematical Epidemiology of Infectious Diseases (2000)
  • F. Ginter et al., Combining hidden Markov models and latent semantic analysis for topic segmentation and labeling: Method and clinical application, International Journal of Medical Informatics (2009)
  • D. Gruhl, R. Guha, D. Liben-Nowell, A. Tomkins, Information diffusion through blogspace, in: Proceedings of the 13th...
  • H. Hethcote, The mathematics of infectious diseases, SIAM Review (2000)
  • T. Hironori et al., Getting insights from the voices of customers: Conversation mining at a contact center, Information Sciences (2009)