Syndromic surveillance of Flu on Twitter using weakly supervised temporal topic models

Published in Data Mining and Knowledge Discovery

Abstract

Surveillance of epidemic outbreaks and spread from social media is an important tool for governments and public health authorities. Machine learning techniques for nowcasting the Flu have made significant inroads into correlating social media trends with case counts and the prevalence of epidemics in a population. However, there is a disconnect between data-driven methods for forecasting Flu incidence and epidemiological models that adopt a state-based understanding of transitions, which can lead to sub-optimal predictions. Furthermore, models of epidemiological activity and of social activity such as on Twitter predict different shapes and have important differences. In this paper, we propose two temporal topic models (one unsupervised model and one improved weakly supervised model) to capture the hidden states of a user from his tweets and to aggregate states in a geographical region for better estimation of trends. We show that our approaches help fill the gap between phenomenological methods for disease surveillance and epidemiological models. We validate our approaches by modeling the Flu using Twitter in multiple countries of South America. We demonstrate that our models consistently outperform plain vocabulary assessment in Flu case-count predictions, and at the same time yield better Flu-peak predictions than competitors. We also show that our fine-grained modeling can reconcile some contrasting behaviors between epidemiological and social models.


Notes

  1. Code and vocabulary can be found here: http://people.cs.vt.edu/liangzhe/code/hfstm-a.html.

  2. http://datasift.com/.

  3. http://www.basistech.com.

  4. http://www.google.org/flutrends.


Acknowledgments

This material is based upon work supported by the National Science Foundation under Grant No. IIS-1353346, by the Maryland Procurement Office under Contract H98230-14-C-0127, by the Intelligence Advanced Research Projects Activity (IARPA) via Department of Interior National Business Center (DoI/NBC) Contract Number D12PC000337, and by the VT College of Engineering. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the respective funding agencies.

Author information

Correspondence to Liangzhe Chen.

Responsible editor: Charu Aggarwal.

Appendix

1.1 HFSTM-A-FIT

In this appendix, we show the equations we designed for HFSTM-A-FIT. Note that the outline of the HFSTM-FIT algorithm is similar to that of HFSTM-A-FIT; one can derive the equations for HFSTM-FIT from the content shown below.

Let K, T, N, and U be the number of states, the number of tweets per user, the number of words per tweet, and the total number of users, respectively. Let \(O=<O_1,O_2,\ldots ,O_T>\) and \(S=<S_1,S_2,\ldots ,S_T>\) be the observed sequence of tweets and the hidden state sequence, respectively, for a particular user.

Here is a list of the symbols we will use:

  1. \(\epsilon \): the prior for the binary state-switching variable, which determines whether the state of a tweet is drawn from the transition probability matrix or simply copied from the state of the previous tweet (a number in (0, 1])

  2. \(\pi \): the initial state probabilities (size \(1\times K\))

  3. \(\eta \): the transition probability matrix (size \(K\times K\))

  4. \(\phi \): the word distribution for each state (size \(K\times W\), where W is the total number of keywords over all states)

  5. \(w_{tn}\): the nth word in the tth tweet

  6. \(\lambda \): the background switch variable

  7. c: the topic switch variable

  8. y: the observed aspect value

For HFSTM-A, as mentioned in Sect. 3.3, the values of \(\lambda \) and c are biased by the observed aspect value y. We write \(\lambda \) instead of \(\lambda _{y}\) in the following for brevity, but remember that the \(\lambda \) and c values in the equations are actually calculated using:

$$\begin{aligned} \lambda _{y_i=0}&=\lambda \\ \lambda _{y_i=1}&=\lambda +b \,\times \,(1-\lambda )\\ c_{y_i=0}&=c-a \,\times \,c\\ c_{y_i=1}&=c+a \,\times \,(1-c) \end{aligned}$$
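As a minimal sketch of this adjustment (illustrative only; the function name and argument order are ours, with a and b the aspect bias strengths from the model):

```python
def adjust_switch_priors(lam, c, a, b, y):
    """Bias the background switch prior lam and the topic/state switch
    prior c by the observed aspect value y, following the rules above."""
    if y == 1:
        # an aspect-positive tweet makes non-background, state-generated words more likely
        return lam + b * (1 - lam), c + a * (1 - c)
    return lam, c - a * c
```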

We want to learn all the parameters given the tweet sequence. For compact notation we use \(H=(\epsilon ,\pi ,\eta ,\phi ,\lambda ,c)\). In HFSTM-A-FIT, we use the forward-backward procedure, for which we define the forward variable \(A_t(i)\) and the backward variable \(B_t(i)\) as follows.

$$\begin{aligned} A_t(i)&= P(O_1,O_2,\ldots ,O_t, S_t =i|H)\\ B_t(i)&= P(O_{t+1},\ldots ,O_T|S_t =i,H) \end{aligned}$$

Let \(\gamma _t(i)\) be the probability of being in state i for the tth tweet, given the observed tweet sequence O and the other model parameters. For each user the size of \(\gamma \) is \(2K\times T\) (the first K states are reached after a transition, and the second K states are copies of the previous state). This probability can be expressed in terms of the forward and backward probabilities.

$$\begin{aligned} \gamma _t(i)&= P(S_t =i|O,H)\\&= \frac{A_t(i)B_t(i)}{P(O|H)}\\&= \frac{A_t(i)B_t(i)}{\sum _{i=1}^{2K} A_t(i)B_t(i)}\\ \end{aligned}$$
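The ratio above is a direct array computation; a minimal NumPy sketch (names ours), with A and B holding the forward and backward variables for one user over the 2K expanded states:

```python
import numpy as np

def posterior_states(A, B):
    """gamma[t, i] = P(S_t = i | O, H): elementwise A_t(i) * B_t(i),
    normalized over the 2K expanded states for each tweet t.
    A and B have shape (T, 2K)."""
    unnorm = A * B
    return unnorm / unnorm.sum(axis=1, keepdims=True)
```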

We have two switch variables in the model, l and x. If \(l=1\), the word is generated by states or topics; if \(l=0\), it is generated by the background distribution. If \(x=0\), the word is generated by topics; if \(x=1\), by states.

For \(l_i=1\), \(w_i\) is generated by either a state or a topic:

$$\begin{aligned}&P(l_i=1| \lambda ,c,H,w) =\frac{ P(l_i=1|\lambda ,c,H)P(w|l_i=1,\lambda ,c,H)}{ P(w|\lambda ,c,H)}\\&= \textstyle \frac{ \lambda P(w_i|\lambda ,c,H,l_i=1,w_{-i})P(w_{-i}|\lambda ,c,H,l_i=1)}{ P(w_i|\lambda ,c,H,w_{-i})P(w_{-i}|\lambda ,c,H)}\\&= \textstyle \frac{ \lambda \sum _{x_i} [P(w_i|\lambda ,c,H,l_i=1,x_i,w_{-i})P(x_i|\lambda ,c,H,l_i=1,w_{-i})]}{ \sum _{l_i}[P(w_i|\lambda ,c,H,l_i,w_{-i})P(l_i|\lambda ,c,H,w_{-i})]}\\&= \textstyle \frac{ \lambda [(\sum _{topic} \phi _{topic} (w_i)P(topic|x_i=0,l_i=1,\lambda ,c,H,w_{-i}))(1-c)+(\sum _{state} \phi _{state}(w_i)\gamma _i(state))c]}{ \lambda [(\sum _{topic} \phi _{topic} (w_i)P(topic|...))(1-c)+(\sum _{state} \phi _{state}(w_i)\gamma _i(state))c]+(1-\lambda )\phi _{Bak}(w_i)} \end{aligned}$$

For \(l_i=0\), \(w_i\) is generated by background.

$$\begin{aligned}&P(l_i=0| \lambda ,c,H,w) =\textstyle \frac{ P(l_i=0|\lambda ,c,H)P(w|l_i=0,\lambda ,c,H)}{ P(w|\lambda ,c,H)}\\&= \textstyle \frac{ (1-\lambda )\phi _{Bak}(w_i)}{ \lambda [(\sum _{topic} \phi _{topic} (w_i)P(topic|...))(1-c)+(\sum _{state} \phi _{state}(w_i)\gamma _i(state))c]+(1-\lambda )\phi _{Bak}(w_i)} \end{aligned}$$

For \(x_i=0\), \(w_i\) is generated by topics.

$$\begin{aligned}&P(x_i=0| \lambda ,c,H,w) =\frac{ P(x_i=0|\lambda ,c,H)P(w|x_i=0,\lambda ,c,H)}{ P(w|\lambda ,c,H)}\\&= \frac{ (1-c) P(w_i|\lambda ,c,H,x_i=0,w_{-i})P(w_{-i}|\lambda ,c,H,x_i=0)}{ P(w_i|\lambda ,c,H,w_{-i})P(w_{-i}|\lambda ,c,H)}\\&= \frac{ (1-c) \sum _{l_i} [P(w_i|\lambda ,c,H,x_i=0,l_i,w_{-i})P(l_i|\lambda ,c,H,x_i=0,w_{-i})]}{ \sum _{x_i}P(w_i|\lambda ,c,H,w_{-i},x_i)P(x_i|\lambda ,c,H,w_{-i})}\\&=\textstyle \frac{ (1-c) [(\sum _{topic} \phi _{topic} (w_i)P(topic|x_i=0,l_i=1,\lambda ,c,H,w_{-i}))\lambda +\phi _{Bak}(w_i)(1-\lambda )]}{ (1-c) [(\sum _{top} \phi _{top} (w_i)P(top|...))\lambda +\phi _{Bak}(w_i)(1-\lambda )]+c[(\sum _{sta}\phi _{sta}(w_i)\gamma _i(sta))\lambda +\phi _{Bak}(w_i)(1-\lambda )]} \end{aligned}$$

For \(x_i=1\), \(w_i\) is generated by states.

$$\begin{aligned}&P(x_i=1| \lambda ,c,H,w) =\frac{ P(x_i=1|\lambda ,c,H)P(w|x_i=1,\lambda ,c,H)}{ P(w|\lambda ,c,H)}\\&= \textstyle \frac{ c[(\sum _{sta}\phi _{sta}(w_i)\gamma _i(sta))\lambda +\phi _{Bak}(w_i)(1-\lambda )]}{ (1-c) [(\sum _{top} \phi _{top} (w_i)P(top|...))\lambda +\phi _{Bak}(w_i)(1-\lambda )]+c[(\sum _{sta}\phi _{sta}(w_i)\gamma _i(sta))\lambda +\phi _{Bak}(w_i)(1-\lambda )]} \end{aligned}$$

Forward variable: We now expand the forward variable in more detail. The initialization is as follows:

For \(1\le i\le K\):

$$\begin{aligned} A_1(i)&=P(O_1,S_1=i|H)\\&=P(O_1|S_1=i,H)P(S_1=i|H)\\&=\pi _i\prod _{n=1}^NP(w_{1n}|S_1=i,H)\\&=\pi _i\prod _{n=1}^N\left\{ (1-\lambda )\phi _{Bak}(w_{1n})+\lambda \left[ (1-c)\sum _{top}\phi _{top}(w_{1n})P(top|\ldots )+c\phi ^i(w_{1n})\right] \right\} \end{aligned}$$

For \(K+1\le i\le 2K\): \(A_1(i)=0\)

Induction is as follows:

For \(1\le j\le K\):

$$\begin{aligned} A_t(j)&=P(O_1,O_2,\ldots ,O_t,S_t=j|H)\\&=\left( \sum _i^{2K}A_{t-1}(i)\epsilon \eta _{ij}\right) P(O_t|S_t=j,H)\\&= \left( \sum _i^{2K}A_{t-1}(i)\epsilon \eta _{ij}\right) \prod _{n=1}^N\left\{ (1-\lambda )\phi _{Bak}(w_{tn})\right. \\&\left. \quad +\lambda \left[ (1-c)\sum _{top}\phi _{top}(w_{tn})P(top|\ldots )+c\phi ^j(w_{tn})\right] \right\} \end{aligned}$$

For \(K+1\le j\le 2K\):

$$\begin{aligned}&A_t(j)=P(O_1,O_2,\ldots ,O_t,S_t=j|H)\\&=(A_{t-1}(j)+A_{t-1}(j-K))(1-\epsilon )\prod _{n=1}^N\left\{ (1-\lambda )\phi _{Bak}(w_{tn})\right. \\&\left. \quad +\lambda \left[ (1-c)\sum _{top}\phi _{top}(w_{tn})P(top|\ldots )+c\phi ^{j-K}(w_{tn})\right] \right\} \end{aligned}$$
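The two induction cases can be combined into one vectorized sweep. A hedged NumPy sketch (names ours; `emis[t, j]` is assumed to precompute the per-tweet emission product over words shown above, which depends only on the underlying state j):

```python
import numpy as np

def forward_pass(emis, pi, eta, eps):
    """Forward recursion over the 2K expanded states.
    States 0..K-1 are reached via a transition (weight eps * eta);
    states K..2K-1 copy the previous underlying state (weight 1 - eps)."""
    T, K = emis.shape
    A = np.zeros((T, 2 * K))
    A[0, :K] = pi * emis[0]                          # copies are impossible at t = 1
    for t in range(1, T):
        combined = A[t - 1, :K] + A[t - 1, K:]       # states i and i+K share underlying state i
        A[t, :K] = (combined @ eta) * eps * emis[t]  # transition into each new state j
        A[t, K:] = combined * (1 - eps) * emis[t]    # copy the previous state
    return A
```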

Backward variable: The initialization of the backward variable is as follows:

For \(1\le i\le 2K\):

$$\begin{aligned} B_T(i)=1 \end{aligned}$$

Induction is as follows:

For \(1\le i\le K\):

$$\begin{aligned}&B_t(i)=P(O_{t+1},\ldots ,O_T|S_t=i,H)\\&= \left( \sum _j^{K} \epsilon \eta _{ij} P(O_{t+1}| S_{t+1} =j,H) B_{t+1}(j)\right) \\&\quad + (1-\epsilon )P(O_{t+1}| S_{t+1} =i+K,H) B_{t+1}(i+K)\\&=\left( \sum _j^{K} \epsilon \eta _{ij} \prod _{n=1}^N\left\{ (1-\lambda )\phi _{Bak}(w_{(t+1)n})+\lambda \left[ (1-c)\sum _{top}\phi _{top}(w_{(t+1)n})P(top|\ldots )\right. \right. \right. \\&\left. \left. \left. \quad +\,c\phi ^j(w_{(t+1)n})\right] \right\} B_{t+1}(j)\right) + (1-\epsilon )\prod _{n=1}^N\left\{ (1-\lambda )\phi _{Bak}(w_{(t+1)n}) \right. \\&\left. \quad +\,\lambda \left[ (1-c)\sum _{top}\phi _{top}(w_{(t+1)n})P(top|\ldots )+c\phi ^i(w_{(t+1)n})\right] \right\} B_{t+1}(i+K)\\ \end{aligned}$$

For \(K+1 \le i \le 2K\):

$$\begin{aligned}&B_{t}(i) = P(O_{t+1},\ldots ,O_T|S_t =i,H)\\&\quad = \left( \sum _j^{K} \epsilon \eta _{ij} P(O_{t+1}| S_{t+1} =j,H) B_{t+1}(j)\right) + (1-\epsilon )P(O_{t+1}| S_{t+1} =i,H) B_{t+1}(i)\\&\quad = \left( \sum _j^{K} \epsilon \eta _{ij} \prod _{n=1}^N\left\{ (1-\lambda )\phi _{Bak}(w_{(t+1)n})+\lambda \left[ (1-c)\sum _{top}\phi _{top}(w_{(t+1)n})P(top|\ldots )\right. \right. \right. \\&\left. \left. \left. \qquad +\,c\phi ^j(w_{(t+1)n})\right] \right\} B_{t+1}(j) \right) + (1-\epsilon )\prod _{n=1}^N\left\{ (1-\lambda )\phi _{Bak}(w_{(t+1)n})\right. \\&\left. \qquad +\,\lambda \left[ (1-c)\sum _{top}\phi _{top}(w_{(t+1)n})P(top|\ldots )+c\phi ^{i-K}(w_{(t+1)n})\right] \right\} B_{t+1}(i) \end{aligned}$$
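A matching NumPy sketch under the same conventions as the forward recursion (names ours; `emis[t, j]` is again the per-tweet emission product for underlying state j):

```python
import numpy as np

def backward_pass(emis, eta, eps):
    """Backward recursion over the 2K expanded states:
    B[t, i] = P(O_{t+1}, ..., O_T | S_t = i, H)."""
    T, K = emis.shape
    B = np.ones((T, 2 * K))                                # B_T(i) = 1
    for t in range(T - 2, -1, -1):
        jump = eps * (eta @ (emis[t + 1] * B[t + 1, :K]))  # transition to a new state j
        stay = (1 - eps) * emis[t + 1] * B[t + 1, K:]      # copy the underlying state
        B[t, :K] = jump + stay
        B[t, K:] = jump + stay  # both halves share the same continuation
    return B
```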

Define z as follows:

$$\begin{aligned} z_{t,n}(i)&=P(T_{tn}=i|l_{tn}=1,x_{tn}=0,w_{tn},H)\\&=\frac{ P(w_{tn}|T_{tn}=i,H,l_{tn}=1,x_{tn}=0)P(T_{tn}=i|l_{tn}=1,x_{tn}=0,H)}{ P(w_{tn}|l_{tn}=1,x_{tn}=0,H)}\\&=\frac{ \phi _{top=i}(w_{tn})P(T_{tn}=i|l_{tn}=1,x_{tn}=0,H)}{ \sum _{i}[\phi _{top=i}(w_{tn})P(T_{tn}=i|l_{tn}=1,x_{tn}=0,H)]} \end{aligned}$$

Let \(\xi _t(i,j)\) be the probability of being in state \(S_i\) at time t, and state \(S_j\) at time \(t+1\), given O and other model parameters.

$$\begin{aligned} \xi _t(i,j)&=P(S_t=i,S_{t+1}=j|O,H)\\&=\frac{P(S_t=i,S_{t+1}=j,O|H)}{P(O|H)} \end{aligned}$$

To express \(\xi _t(i,j)\), we define the following terms.

For \(1\le i\le 2K\) and \(1\le j\le K\):

$$\begin{aligned}&T_1=A_t(i)\epsilon \eta _{ij}P(O_{t+1}|S_{t+1}=j,H)B_{t+1}(j)\\&=A_t(i)\epsilon \eta _{ij}\prod _{n=1}^N\left\{ (1-\lambda )\phi _{Bak}(w_{(t+1)n})\right. \\&\left. \quad +\lambda \left[ (1-c)\sum _{top}\phi _{top}(w_{(t+1)n})P(top|\ldots )+c\phi ^j(w_{(t+1)n})\right] \right\} B_{t+1}(j) \end{aligned}$$

For \(1\le i\le K\) and \(K+1\le j\le 2K\):

$$\begin{aligned}&T_2=A_t(i)(1-\epsilon )P(O_{t+1}|S_{t+1}=i+K,H)B_{t+1}(i+K)\\&=A_t(i)(1-\epsilon )\prod _{n=1}^N\left\{ (1-\lambda )\phi _{Bak}(w_{(t+1)n})\right. \\&\left. \quad +\lambda \left[ (1-c)\sum _{top}\phi _{top}(w_{(t+1)n})P(top|\ldots )+c\phi ^i(w_{(t+1)n})\right] \right\} B_{t+1}(i+K) \end{aligned}$$

For \(K+1\le i\le 2K\) and \(K+1\le j\le 2K\):

$$\begin{aligned}&T_3=A_t(i)(1-\epsilon )P(O_{t+1}|S_{t+1}=i,H)B_{t+1}(i)\\&=A_t(i)(1-\epsilon )\prod _{n=1}^N\left\{ (1-\lambda )\phi _{Bak}(w_{(t+1)n})\right. \\&\left. \quad +\lambda \left[ (1-c)\sum _{top}\phi _{top}(w_{(t+1)n})P(top|\ldots )+c\phi ^{i-K}(w_{(t+1)n})\right] \right\} B_{t+1}(i) \end{aligned}$$

Correspondingly, we have the following \(\xi \) values for the three ranges of i and j above, respectively:

$$\begin{aligned} \xi _t(i,j)&= \frac{ T_1 }{ \sum _i \sum _j (T_1 + T_2 + T_3) }\\ \xi _t(i,j)&= \frac{ T_2 }{ \sum _i \sum _j (T_1 + T_2 + T_3) }\\ \xi _t(i,j)&= \frac{ T_3 }{ \sum _i \sum _j (T_1 + T_2 + T_3) } \end{aligned}$$

Estimation of parameters:

We use the following equations to estimate the parameter values in the M-step.

For estimating \(\epsilon \):

$$\begin{aligned} \epsilon&= \frac{ \sum _{u=1}^U \sum _{t=1}^T \sum _{i=1}^{2K} \sum _{j=1}^K \xi (i,j) }{ \sum _{u=1}^U \sum _{t=1}^T \sum _{i=1}^{2K} \sum _{j=1}^{2K} \xi (i,j) } \end{aligned}$$

For estimating \(\pi \):

$$\begin{aligned} \pi _i&= \frac{ \sum _{u=1}^U \gamma _1(i) }{ \sum _{u=1}^U \sum _{i=1}^K \gamma _1(i) } \quad \text {for}\, 1 \le \,\hbox {i}\, \le \,\hbox {K} \end{aligned}$$

For estimating \(\eta \):

$$\begin{aligned} \eta _{ij}&= \frac{ \sum _{u=1}^U \sum _{t=1}^T \left( \xi _t(i,j) + \xi _t(i+K,j)\right) }{ \sum _{u=1}^U \sum _{t=1}^T \sum _{j=1}^K \left( \xi _t(i,j) + \xi _t(i+K,j)\right) } \quad \text {for}\, 1\le \,\hbox {i} \,\le \,\hbox {K}, \,1\le \,\hbox {j}\,\le \,\hbox {K} \end{aligned}$$
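A hedged sketch of this update (names ours), assuming xi is stored as an array holding only the transition-move entries, i.e. xi[t, i, j] for all 2K source states i and the K destination states j:

```python
import numpy as np

def update_eta(xi):
    """Re-estimate the K x K transition matrix: rows i and i+K of xi refer
    to the same underlying state, so their expected counts are pooled,
    then each row is normalized. xi has shape (T-1, 2K, K)."""
    _, twoK, K = xi.shape
    counts = xi[:, :K, :].sum(axis=0) + xi[:, K:, :].sum(axis=0)
    return counts / counts.sum(axis=1, keepdims=True)
```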

For estimating \(\lambda \):

$$\begin{aligned} \lambda =\frac{\sum _u\sum _t \frac{1}{N_t}\sum _{n=1}^{N_t}P(l_{tn}=1|\lambda ,c,H,w)}{UT} \end{aligned}$$

For estimating c:

$$\begin{aligned} c =\frac{\sum _u\sum _t \frac{1}{N_t}\sum _{n=1}^{N_t}P(l_{tn}=1|\lambda ,c,H,w)P(x_{tn}=1|\lambda ,c,H,w)}{\sum _u\sum _t \frac{1}{N_t}\sum _{n=1}^{N_t}P(l_{tn}=1|\lambda ,c,H,w)} \end{aligned}$$
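A sketch of these two updates together (names ours; `p_l1[u][t]` and `p_x1[u][t]` are assumed to be arrays of the per-word posteriors \(P(l_{tn}=1|\ldots )\) and \(P(x_{tn}=1|\ldots )\) for tweet t of user u, so the per-tweet mean implements the \(1/N_t\) factor and the tweet count plays the role of UT):

```python
import numpy as np

def update_lambda_c(p_l1, p_x1):
    """Re-estimate lambda (expected fraction of non-background words) and c
    (expected fraction of state-generated words among non-background ones),
    averaging within each tweet first as in the update rules above."""
    lam_num = n_tweets = c_num = c_den = 0.0
    for user_l, user_x in zip(p_l1, p_x1):
        for l, x in zip(user_l, user_x):
            lam_num += l.mean()          # (1/N_t) * sum_n P(l_tn = 1)
            c_num += (l * x).mean()      # (1/N_t) * sum_n P(l_tn = 1) P(x_tn = 1)
            c_den += l.mean()
            n_tweets += 1
    return lam_num / n_tweets, c_num / c_den
```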

For estimating \(\phi \):

$$\begin{aligned} \phi ^i(w)&=\textstyle \frac{\sum _{u=1}^U \sum _{t=1}^T \sum _{n:\,w_{tn}=w} P(l_{tn}=1|\lambda ,c,H,O)P(x_{tn}=1|\lambda ,c,H,O)\left( \gamma _t(i) + \gamma _t(i+K)\right) }{\sum _{u=1}^U \sum _{t=1}^T \sum _{w=1}^W \sum _{n:\,w_{tn}=w} P(l_{tn}=1|\lambda ,c,H,O)P(x_{tn}=1|\lambda ,c,H,O)\left( \gamma _t(i) + \gamma _t(i+K)\right) } \quad \text {for}\, 1\le i \le K\\ \phi _{Bak}(w)&= \frac{\sum _{u=1}^U \sum _{t=1}^T \sum _{n:\,w_{tn}=w} P(l_{tn}=0|\lambda ,c,H,O) }{\sum _{u=1}^U \sum _{t=1}^T \sum _{w=1}^W \sum _{n:\,w_{tn}=w} P(l_{tn}=0|\lambda ,c,H,O) }\\ \phi _{Topic}(w)&= \textstyle \frac{ \sum _{u=1}^U \sum _{t=1}^T \sum _{n:\,w_{tn}=w} P(l_{tn}=1|\lambda ,c,H,O)P(x_{tn}=0|\lambda ,c,H,O)z_{t,n}(Topic) }{ \sum _{u=1}^U \sum _{t=1}^T \sum _{w=1}^W \sum _{n:\,w_{tn}=w} P(l_{tn}=1|\lambda ,c,H,O)P(x_{tn}=0|\lambda ,c,H,O)z_{t,n}(Topic) }\\ P(T_{tn}=i|l_{tn}&=1,x_{tn}=0,H)=\textstyle \frac{ \sum _{u=1}^U\sum _{t=1}^T\sum _{n=1}^{N_t}P(l_{tn}=1|\lambda ,c,H,O)P(x_{tn}=0|\lambda ,c,H,O)z_{t,n}(i)}{ \sum _{u=1}^U\sum _{t=1}^T\sum _{n=1}^{N_t}P(l_{tn}=1|\lambda ,c,H,O)P(x_{tn}=0|\lambda ,c,H,O)} \end{aligned}$$


Cite this article

Chen, L., Tozammel Hossain, K.S.M., Butler, P. et al. Syndromic surveillance of Flu on Twitter using weakly supervised temporal topic models. Data Min Knowl Disc 30, 681–710 (2016). https://doi.org/10.1007/s10618-015-0434-x
