Change point detection for burst analysis from an observed information diffusion sequence of tweets

Journal of Intelligent Information Systems

Abstract

We propose a method for detecting the period in which a burst of information diffusion took place from an observed diffusion sequence over a social network, and we report the results of applying it to real Twitter data. We assume a generic information diffusion model in which the time delay associated with diffusion follows an exponential distribution and a burst is directly reflected in changes in the time-delay parameter of that distribution. The shape of the parameter’s change is approximated by a step function, and the problem of detecting the change points and finding the values of the parameter is formulated as an optimization problem of maximizing the likelihood of generating the observed diffusion sequence. The time complexity of the search is almost proportional to the number of observed data points, and the search has been shown to be very efficient. We first demonstrate on synthetic data that the proposed method can detect the burst and that it performs better than one of the representative state-of-the-art methods, confirming that the proposed method covers a wider range of change patterns. We then extend the evaluation on synthetic data to show that the method is efficient and effective in comparison with a naive exhaustive search and a simple greedy method. Finally, we apply the method to real Twitter data from the 2011 Tōhoku earthquake and tsunami and reconfirm its efficiency and effectiveness. Two interesting discoveries are that a burst period detected by the proposed method tends to contain a massive number of homogeneous tweets on a specific topic even if the observed diffusion sequence consists of heterogeneous tweets on various topics, and that assuming the information diffusion path to be a line-shaped tree can give a good approximation of the maximum likelihood estimator when the actual diffusion path is not known.
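To make the likelihood formulation concrete, the following is a minimal sketch of the simplest case: a single change point in a sequence of timestamps. It is not the search procedure proposed in the paper. It assumes that each delay is measured from the immediately preceding observation (one reading of the line-shaped tree approximation), that delays within a segment are i.i.d. exponential, and that timestamps are sorted. The names `segment_loglik`, `best_single_change_point`, and `ChangePoint` are hypothetical and chosen for illustration only.

```python
import math
from itertools import accumulate
from typing import NamedTuple, Sequence


class ChangePoint(NamedTuple):
    index: int          # number of delays assigned to the first segment
    time: float         # timestamp at which the rate is taken to change
    rate_before: float  # maximum likelihood rate before the change
    rate_after: float   # maximum likelihood rate after the change
    gain: float         # log-likelihood gain over the no-change model


def segment_loglik(n: int, total: float) -> float:
    """Maximized log-likelihood of n exponential delays that sum to total.

    For rate r the log-likelihood is n*log(r) - r*total, maximized at r = n/total.
    """
    if n <= 0 or total <= 0.0:
        raise ValueError("segment needs at least one delay and a positive total")
    rate = n / total
    return n * math.log(rate) - rate * total


def best_single_change_point(times: Sequence[float]) -> ChangePoint:
    """Scan every split of the delay sequence and keep the most likely one."""
    # Assumption: delay i is the gap between consecutive observations.
    delays = [b - a for a, b in zip(times, times[1:])]
    n = len(delays)
    if n < 2:
        raise ValueError("need at least three timestamps")
    # Prefix sums let each candidate split be scored in constant time.
    prefix = [0.0] + list(accumulate(delays))
    base = segment_loglik(n, prefix[n])
    best_ll, best_k = -math.inf, None
    for k in range(1, n):
        left, right = prefix[k], prefix[n] - prefix[k]
        if left <= 0.0 or right <= 0.0:
            continue  # skip degenerate segments made only of zero delays
        ll = segment_loglik(k, left) + segment_loglik(n - k, right)
        if ll > best_ll:
            best_ll, best_k = ll, k
    if best_k is None:
        raise ValueError("no valid split found")
    k = best_k
    return ChangePoint(
        index=k,
        time=times[k],
        rate_before=k / prefix[k],
        rate_after=(n - k) / (prefix[n] - prefix[k]),
        gain=best_ll - base,
    )


# Illustrative input: slow arrivals (gap 10) followed by a burst (gap 1).
example_times = [0, 10, 20, 30, 40, 50, 51, 52, 53, 54, 55]
print(best_single_change_point(example_times))
```

With prefix sums the scan is linear in the number of observations. Splitting never lowers the maximized likelihood, so in practice the gain would have to be compared against a threshold or a model-selection penalty before a change is accepted. Handling several change points, as a step function requires, needs a search strategy on top of this scoring, which this sketch does not attempt.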

Notes

  1. https://twitter.com/

  2. https://www.facebook.com/

  3. The observed sequence \({\cal C}\) does not tell which parent activated which child. Without this assumption, we would have to introduce hidden variables.

  4. NHK is Japan’s public broadcaster.

  5. The Great Hanshin-Awaji Earthquake occurred on January 17, 1995, in the Kobe area, and 6,434 people lost their lives.

Author information


Correspondence to Kazumi Saito.

Additional information

The Twitter data used in this paper were provided by Prof. Fujio Toriumi of Tokyo University and Prof. Kazuhiro Kazama of Wakayama University. This work was partly supported by the Asian Office of Aerospace Research and Development, Air Force Office of Scientific Research, under Grant No. AOARD-13-4042, and by a JSPS Grant-in-Aid for Scientific Research (C) (No. 23500194).

About this article

Cite this article

Saito, K., Ohara, K., Kimura, M. et al. Change point detection for burst analysis from an observed information diffusion sequence of tweets. J Intell Inf Syst 44, 243–269 (2015). https://doi.org/10.1007/s10844-013-0283-2

  • DOI: https://doi.org/10.1007/s10844-013-0283-2
