
Contextual bandits with hidden contexts: a focused data capture from social media streams


Abstract

This paper addresses the problem of real-time data capture from social media. Due to various limitations, it is not possible to collect all the data produced by social networks such as Twitter. Therefore, to gather enough relevant information for a predefined need, it is necessary to focus on a subset of the information sources. In this work, we consider user-centered data capture and treat each account of a social network as a source that can be followed at each iteration of a data capture process. This process, whose aim is to maximize the cumulative utility of the captured information for the specified need, is constrained at each time step by the number of users that can be monitored simultaneously. Selecting a subset of accounts to listen to over time is a sequential decision problem under constraints, which we formalize as a bandit problem with multiple selections. We propose a contextual UCB-like approach that uses the activity of each user during the current step to predict their future behavior. Beyond capturing variations in usefulness, considering contexts also improves the efficiency of the process by leveraging structure in the search space. However, existing contextual bandit approaches do not fit our setting, where most contexts are hidden from the agent. We therefore propose a new algorithm, called HiddenLinUCB, which deals with this missing information via variational inference. Experiments demonstrate the strong performance of this approach compared to existing methods on data capture tasks from social networks.
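To fix ideas, the following minimal sketch (hypothetical names, not the authors' released code) illustrates the generic loop the abstract describes: at every step, each candidate account receives an optimistic score combining a predicted reward and an exploration bonus, and the k highest-scoring accounts are listened to.

```python
import numpy as np

def select_accounts(predicted_reward, exploration_bonus, k):
    """Pick the k accounts with the highest optimistic (UCB-style) score."""
    scores = predicted_reward + exploration_bonus
    return np.argsort(scores)[-k:]

# Toy example: 5 candidate accounts, budget of k = 2 accounts per step.
rng = np.random.default_rng(0)
chosen = select_accounts(rng.random(5), 0.1 * rng.random(5), k=2)
print(chosen)
```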


Notes

  1. For instance, on Twitter, up to 7000 messages can be posted per second.

  2. In this paper, we focus on Twitter but several other media could have been considered, such as Facebook which proposes similar real-time APIs with restrictions. The full documentation of the Twitter APIs is available at: https://dev.twitter.com/streaming/public.

  3. \(\Phi ^{-1}\) corresponds to the Normal inverse cumulative distribution function.

  4. At step t, for any action i, \(q^*_{\tau _{i}}\) being a Gamma distribution with shape \(a_{i,t-1}\) and rate \(b_{i,t-1}\), the expectation \(\mathbb {E}[\tau _{i}]\) equals \(\dfrac{a_{i,t-1}}{b_{i,t-1}}\) (see the short check below).
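A quick numerical check of this expectation (a minimal sketch with arbitrary values; SciPy parameterizes the Gamma by a scale, so the rate \(b\) enters as scale=1/b):

```python
from scipy.stats import gamma

a, b = 3.0, 2.0                        # shape and rate of q*_tau
print(gamma(a, scale=1.0 / b).mean())  # 1.5, i.e. a / b
```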


Acknowledgements

This research work has been carried out in the framework of the Technological Research Institute SystemX, and has therefore been funded with public funds within the scope of the French Program "Investissements d'Avenir".

Author information


Corresponding author

Correspondence to Sylvain Lamprier.

Additional information

Responsible editor: Kristian Kersting.


Appendix

1.1 Proof for Proposition 3

Following the variational principle described above, the optimal variational distribution \(q_{\beta }^{\star }(\beta )\) for \(\beta \) satisfies:

$$\begin{aligned} \log q_{\beta }^{\star }(\beta )&=\,\mathbb {E}_{\beta ^{\backslash }}\Bigg [ \sum \limits _{i=1}^K\sum \limits _{s\in \mathcal {A}_{i,t-1}\cup \mathcal {B}_{i,t-1}} \log p(r_{i,s} | \beta , x_{i,s}) + \log p(\beta ) \Bigg ]+C\\&=\mathbb {E}_{\beta ^{\backslash }} \Bigg [ - \dfrac{1}{2} \sum \limits _{i=1}^{K} \sum \limits _{s\in \mathcal {A}_{i,t-1}\cup \mathcal {B}_{i,t-1}} (r_{i,s}-x_{i,s}^{\top }\beta )^{2} - \dfrac{1}{2} \beta ^{\top }\beta \Bigg ]+ C \\&= \mathbb {E}_{\beta ^{\backslash }} \Bigg [ - \dfrac{1}{2} \beta ^{\top }\left( I+\sum \limits _{i=1}^{K} \sum \limits _{s\in \mathcal {A}_{i,t-1}\cup \mathcal {B}_{i,t-1}}x_{i,s}x_{i,s}^{\top } \right) \beta \\&\quad + \beta ^{\top }\left( \sum \limits _{i=1}^{K} \sum \limits _{s\in \mathcal {A}_{i,t-1}\cup \mathcal {B}_{i,t-1}}x_{i,s}r_{i,s} \right) \Bigg ] +C \end{aligned}$$

where \(\mathbb {E}_{\beta ^{\backslash }}\) stands for the expectation over all terms but \(\beta \) and C denotes terms which do not depend on \(\beta \).

By considering the independence of factors as defined in the formulation (18), and the definitions of \(V_{t-1}\) and \(\hat{\beta }_{t-1}\) in the proposition, we get:

$$\begin{aligned} \log q_{\beta }^{\star }(\beta )&= -\dfrac{1}{2}(\beta -\hat{\beta }_{t-1})^{\top }V_{t-1}(\beta -\hat{\beta }_{t-1}) +C \end{aligned}$$

which corresponds to a Gaussian \(\mathcal {N}(\hat{\beta }_{t-1},V_{t-1}^{-1})\).
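Reading the precision and linear term off the derivation above, the update of \(q_{\beta }^{\star }\) has the form of a ridge-regression posterior. A minimal sketch (hypothetical names; the exact \(V_{t-1}\) and \(\hat{\beta }_{t-1}\) are those defined in Proposition 3), where expectations over hidden contexts are replaced by their current variational means:

```python
import numpy as np

def beta_posterior(contexts, rewards):
    """Variational Gaussian posterior over beta: q*_beta = N(beta_hat, V^{-1}).

    contexts: (n, d) array of (expected) context vectors x_{i,s}
    rewards:  (n,) array of rewards r_{i,s}
    """
    d = contexts.shape[1]
    V = np.eye(d) + contexts.T @ contexts                # I + sum_s x x^T
    beta_hat = np.linalg.solve(V, contexts.T @ rewards)  # V^{-1} sum_s x r
    return V, beta_hat
```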

1.2 Proof for Proposition 4

Following the variational principle described above, the optimal variational distribution \(q_{x_{i,s}}^{\star }(x_{i,s})\) for every action i and step \(s<t\) satisfies:

$$\begin{aligned} \log q_{x_{i,s}}^{\star }(x_{i,s})&=\mathbb {E}_{x_{i,s}^{\backslash }}[\log p(r_{i,s} | \beta ,x_{i,s})+\log p(x_{i,s}|\mu _{i},\tau _{i})]+C\\&=-\dfrac{1}{2}\mathbb {E}_{x_{i,s}^{\backslash }}\left[ (r_{i,s}-x_{i,s}^{\top }\beta )^{2} +\tau _{i}(x_{i,s}-\mu _{i})^{\top }(x_{i,s}-\mu _{i})\right] +C\\&=-\dfrac{1}{2}\mathbb {E}_{x_{i,s}^{\backslash }}\left[ x_{i,s}^{\top }(\beta \beta ^{\top }+\tau _{i}I)x_{i,s}-2x_{i,s}^{\top }(\beta r_{i,s}+\mu _{i}\tau _{i})\right] +C\\ \end{aligned}$$

where \(\mathbb {E}_{x_{i,s}^{\backslash }}\) stands for the expectation over all terms but \(x_{i,s}\) and C denotes terms which do not depend on \(x_{i,s}\).

By considering the independence of factors as defined in the formulation (18), and the definitions of \(W_{i,s}^{-1}\) and \(\hat{x}_{i,s}\) in the proposition, we get:

$$\begin{aligned} \log q_{x_{i,s}}^{\star }(x_{i,s})&=-\dfrac{1}{2}(x_{i,s}-\hat{x}_{i,s})^{\top }W_{i,s}(x_{i,s}-\hat{x}_{i,s}) +C \end{aligned}$$

which corresponds to a Gaussian \(\mathcal {N}(\hat{x}_{i,s},W_{i,s}^{-1})\).
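The quadratic and linear terms in \(x_{i,s}\) above directly give the precision and mean of \(q_{x_{i,s}}^{\star }\). A minimal sketch (hypothetical names), assuming the expectations are taken under the current variational factors, with \(\mathbb {E}[\beta \beta ^{\top }]=V_{t-1}^{-1}+\hat{\beta }_{t-1}\hat{\beta }_{t-1}^{\top }\):

```python
import numpy as np

def hidden_context_posterior(beta_hat, V, mu_hat, tau_mean, reward):
    """Variational Gaussian posterior over a hidden context x_{i,s}.

    Precision and mean read off the derivation above:
      W     = E[beta beta^T] + E[tau_i] I
      x_hat = W^{-1} (E[beta] r_{i,s} + E[tau_i] E[mu_i])
    """
    d = beta_hat.shape[0]
    E_bbT = np.linalg.inv(V) + np.outer(beta_hat, beta_hat)
    W = E_bbT + tau_mean * np.eye(d)
    x_hat = np.linalg.solve(W, beta_hat * reward + tau_mean * mu_hat)
    return W, x_hat
```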

1.3 Proof for Proposition 5

Following the variational principle described above, the optimal variational distribution \(q_{\mu _{i}}^{\star }(\mu _{i})\) for every action i at step t satisfies:

$$\begin{aligned} \log q_{\mu _{i}}^{\star }(\mu _{i})&=\mathbb {E}_{\mu _{i}^{\backslash }}\Bigg [\log p((x_{i,s})_{s\in \mathcal {B}_{i,t-1} \cup \mathcal {C}_{i,t-1}}|\mu _{i},\tau _{i}) +\log p(\mu _{i}|\tau _{i}) \Bigg ]+C\\&=\,\mathbb {E}_{\mu _{i}^{\backslash }} \Bigg [- \dfrac{\tau _{i}}{2} \left( \sum \limits _{\begin{array}{c} s\in \mathcal {B}_{i,t-1} \\ \cup \mathcal {C}_{i,t-1} \end{array}}(x_{i,s}-\mu _{i})^{\top }(x_{i,s}-\mu _{i}) + \mu _{i}^{\top }\mu _{i} \right) \Bigg ] +C\\&=\mathbb {E}_{\mu _{i}^{\backslash }}\Bigg [- \dfrac{\tau _i}{2} \mu _{i}^{\top }\mu _{i}(1+n_{i,t-1}) + \tau _i \mu _{i}^{\top } \sum \limits _{\begin{array}{c} s\in \mathcal {B}_{i,t-1} \\ \cup \mathcal {C}_{i,t-1} \end{array}}x_{i,s} \Bigg ]+C \end{aligned}$$

where \(\mathbb {E}_{\mu _{i}^{\backslash }}\) stands for the expectation over all terms but \(\mu _{i}\), C denotes terms which do not depend on \(\mu _{i}\) and \(n_{i,t-1}=|\mathcal {B}_{i,t-1}|+|\mathcal {C}_{i,t-1}|\).

By considering the independence of factors as defined in the formulation (18), and the definitions of \(\Sigma _{i,t-1}^{-1}\) and \(\hat{\mu }_{i,t-1}\) in the proposition, we get:

$$\begin{aligned} \log q_{\mu _{i}}^{\star }(\mu _{i})&= -\dfrac{1}{2}(\mu _{i}-\hat{\mu }_{i,t-1})^{\top }\Sigma _{i,t-1}(\mu _{i}-\hat{\mu }_{i,t-1}) +C \end{aligned}$$

which corresponds to a Gaussian \(\mathcal {N}(\hat{\mu }_{i,t-1},\Sigma _{i,t-1}^{-1})\).
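From the quadratic and linear terms in \(\mu _{i}\) above, the posterior precision is \((1+n_{i,t-1})\mathbb {E}[\tau _{i}]I\) and the mean is the sum of the (expected) contexts shrunk by the prior. A minimal sketch (hypothetical names):

```python
import numpy as np

def mu_posterior(x_hats, tau_mean):
    """Variational Gaussian posterior over mu_i: q*_mu = N(mu_hat, Sigma^{-1}).

    x_hats: (n, d) array of expected contexts E[x_{i,s}], s in B ∪ C, so that
      Sigma  = (1 + n) E[tau_i] I
      mu_hat = sum_s E[x_{i,s}] / (1 + n)
    """
    n, d = x_hats.shape
    Sigma = (1 + n) * tau_mean * np.eye(d)
    mu_hat = x_hats.sum(axis=0) / (1 + n)
    return Sigma, mu_hat
```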

1.4 Proof for Proposition 6

Following the variational principle described above, the optimal variational distribution \(q_{\tau _{i}}^{\star }(\tau _{i})\) for every action i at step t satisfies:

$$\begin{aligned} \log q_{\tau _{i}}^{\star }(\tau _{i})&=\mathbb {E}_{\tau _{i}^{\backslash }} \Bigg [ \log p((x_{i,s})_{s\in \mathcal {B}_{i,t-1} \cup \mathcal {C}_{i,t-1}}|\mu _{i},\tau _{i})+ \log p(\mu _{i}|\tau _{i})+\log p(\tau _{i}) \Bigg ] +C \end{aligned}$$

where \(\mathbb {E}_{\tau _{i}^{\backslash }}\) stands for the expectation over all terms but \(\tau _{i}\) and C denotes terms which do not depend on \(\tau _{i}\).

Recall that \(p(\tau _{i})\) corresponds to a Gamma distribution with parameters \(a_{0}\) and \(b_{0}\), whose density is given by: \(p(\tau _{i})=\frac{b_{0}^{a_{0}}\tau _{i}^{a_{0}-1}e^{-\tau _{i}b_{0}}}{\Gamma (a_{0})}\), where \(\Gamma \) stands for the Gamma function. Also, since the density of a Gaussian of mean m and covariance matrix \(\tau ^{-1} I\) is given by \(f(x;m,\tau ^{-1} I)=\frac{\tau ^{d/2}}{(2\pi )^{d/2}}e^{-\tau /2(x-m)^{\top }(x-m)}\), we get:

$$\begin{aligned} \log q_{\tau _{i}}^{\star }(\tau _{i})&= \mathbb {E}_{\tau _{i}^{\backslash }} \Bigg [ -\dfrac{1}{2}\sum \limits _{s\in \mathcal {B}_{i,t-1} \cup \mathcal {C}_{i,t-1}}\tau _{i}(x_{i,s}-\mu _{i})^{\top }(x_{i,s}-\mu _{i}) \\&\quad -\dfrac{1}{2}\tau _{i}\mu _{i}^{\top }\mu _{i} +\sum \limits _{s\in \mathcal {B}_{i,t-1} \cup \mathcal {C}_{i,t-1}} \dfrac{d}{2}\log \tau _{i} \\&\quad + \dfrac{d}{2}\log \tau _{i} +(a_{0}-1)\log \tau _{i} -b_{0}\tau _{i} \Bigg ] +C \\&= \mathbb {E}_{\tau _{i}^{\backslash }} \Bigg [ -\dfrac{\tau _{i}}{2} (n_{i,t-1}+1) \mu _{i}^{\top }\mu _{i} + \tau _{i} \mu _{i}^{\top }\sum \limits _{s\in \mathcal {B}_{i,t-1} \cup \mathcal {C}_{i,t-1}}x_{i,s} \\&\quad -\dfrac{\tau _{i}}{2} \sum \limits _{s\in \mathcal {B}_{i,t-1} \cup \mathcal {C}_{i,t-1}}x_{i,s}^{\top }x_{i,s} \\&\quad + (a_{0}-1)\log \tau _{i} -b_{0}\tau _{i}+\dfrac{d(n_{i,t-1}+1)}{2} \log \tau _{i} \Bigg ] + C \end{aligned}$$

By considering the independence of factors as defined in the formulation (18), and the definitions of \(a_{i,t-1}\) and \(b_{i,t-1}\) given in the proposition, we get:

$$\begin{aligned} \log q_{\tau _{i}}^{\star }(\tau _{i})&= (a_{i,t-1}-1 )\log \tau _{i}-b_{i,t-1}\tau _{i}+C \end{aligned}$$

which corresponds to a Gamma distribution with parameters \(a_{i,t-1}\) and \(b_{i,t-1}\).
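Collecting the coefficients of \(\log \tau _{i}\) and \(-\tau _{i}\) above gives the Gamma parameters. A minimal sketch (hypothetical names; the exact \(a_{i,t-1}\), \(b_{i,t-1}\) are those of Proposition 6), assuming second moments are expanded under the current variational factors:

```python
import numpy as np

def tau_posterior(a0, b0, mu_hat, Sigma, x_hats, x_trace_covs):
    """Variational Gamma posterior over tau_i, from the coefficients above:
      a = a0 + d (n + 1) / 2
      b = b0 + 0.5 (n + 1) E[mu^T mu] - E[mu]^T sum_s E[x_{i,s}]
             + 0.5 sum_s E[x_{i,s}^T x_{i,s}]

    x_hats:       (n, d) expected contexts E[x_{i,s}], s in B ∪ C
    x_trace_covs: (n,) traces of the context covariances (0 for observed contexts)
    """
    n, d = x_hats.shape
    E_mu_sq = mu_hat @ mu_hat + np.trace(np.linalg.inv(Sigma))
    E_x_sq = (x_hats ** 2).sum() + x_trace_covs.sum()
    a = a0 + d * (n + 1) / 2.0
    b = b0 + 0.5 * (n + 1) * E_mu_sq - mu_hat @ x_hats.sum(axis=0) + 0.5 * E_x_sq
    return a, b  # E[tau_i] = a / b (cf. Note 4)
```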

1.5 Proof for Proposition 8

From Proposition 3, at step t, \(\beta \) follows a Gaussian distribution with mean \(\hat{\beta }_{t-1}\) and covariance matrix \(V_{t-1}^{-1}\). Therefore, \(\beta ^{\top }\hat{\mu }_{i,t-1}\) follows a Gaussian distribution with mean \(\hat{\beta }_{t-1}^{\top }\hat{\mu }_{i,t-1}\) and variance \(\hat{\mu }_{i,t-1}^{\top }V_{t-1}^{-1}\hat{\mu }_{i,t-1}\), hence we directly have:  \(\mathbb {P}\Big (|\beta ^{\top }\hat{\mu }_{i,t-1}-\hat{\beta }_{t-1}^{\top }\hat{\mu }_{i,t-1} | \le \alpha _{1} \sqrt{\hat{\mu }_{i,t-1}^{\top }V_{t-1}^{-1}\hat{\mu }_{i,t-1}}\Big )= 1-\delta _{1}\).
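The half-width of this confidence interval is easy to compute in practice. A minimal sketch, assuming (consistently with Note 3) that \(\alpha _{1}=\Phi ^{-1}(1-\delta _{1}/2)\); the exact constant is the one defined in the paper:

```python
import numpy as np
from scipy.stats import norm

def beta_mu_confidence_width(mu_hat, V, delta1):
    """Half-width of the (1 - delta1) interval around beta_hat^T mu_hat.

    Assumes alpha_1 = Phi^{-1}(1 - delta1 / 2), Phi^{-1} being the Normal
    inverse CDF (cf. Note 3).
    """
    alpha1 = norm.ppf(1.0 - delta1 / 2.0)
    return alpha1 * np.sqrt(mu_hat @ np.linalg.solve(V, mu_hat))
```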

1.6 Proof for Proposition 9

From Proposition 5, at any step t, \(\mu _{i}\) follows a Gaussian law with mean \(\hat{\mu }_{i,t-1}\) and covariance matrix \(\Sigma _{i,t-1}^{-1}\). Thus, the random variable \(\mu _{i}-\hat{\mu }_{i,t-1}\) follows a Gaussian law with zero mean and covariance matrix \(\Sigma _{i,t-1}^{-1}\). We also know from this proposition that \(\Sigma _{i,t-1}^{-1}=((1+n_{i,t-1})E[\tau _{i}])^{-1}I\), so \(\Sigma _{i,t-1}^{-1}\) is diagonal with equal components on the diagonal. Let us temporarily denote \(\Sigma _{i,t-1}^{-1}\) as \(\sigma _{i,t}^{2} I\). Then, each component of the vector \(\dfrac{\mu _{i}-\hat{\mu }_{i,t-1}}{\sigma _{i,t}}\) follows a standard Gaussian \(\mathcal {N}(0,1)\), and therefore \(\dfrac{||\mu _{i}-\hat{\mu }_{i,t-1}||^{2}}{\sigma _{i,t}^2}\) follows a \(\chi ^{2}\) law with d degrees of freedom. Let \(\Psi \) denote the cumulative distribution function of the \(\chi ^{2}\) law with d degrees of freedom. We have:

$$\begin{aligned} \mathbb {P}\left( \dfrac{||\mu _{i}-\hat{\mu }_{i,t-1}||^{2}}{\sigma _{i,t}^{2}} \le \eta \right) = \Psi (\eta ) \end{aligned}$$

Equivalently, we have:

$$\begin{aligned} \mathbb {P}\left( ||\mu _{i}-\hat{\mu }_{i,t-1}|| \le \sqrt{\eta }\sigma _{i,t} \right) = \Psi (\eta ) \end{aligned}$$

Now, by the Cauchy–Schwarz inequality, \(|\beta ^{\top }(\mu _{i}-\hat{\mu }_{i,t-1}) | \le ||\beta || \, ||\mu _{i}-\hat{\mu }_{i,t-1}|| \le S||\mu _{i}-\hat{\mu }_{i,t-1}||\), since S is an upper bound on \(||\beta ||\). Therefore, we get:

$$\begin{aligned} \mathbb {P}\left( |\beta ^{\top }(\mu _{i}-\hat{\mu }_{i,t-1}) | \le S\sqrt{\eta }\sigma _{i,t} \right) \ge \Psi (\eta ) \end{aligned}$$

Setting \(\Psi (\eta )=1-\delta _{2}\), i.e., \(\eta =\Psi ^{-1}(1-\delta _{2})\), this becomes:

$$\begin{aligned} \mathbb {P}\left( |\beta ^{\top }(\mu _{i}-\hat{\mu }_{i,t-1}) | \le S\sigma _{i,t}\sqrt{\Psi ^{-1}(1-\delta _{2})} \right) \ge 1-\delta _{2} \end{aligned}$$

where \(\Psi ^{-1}\) stands for the inverse cumulative distribution function of a \(\chi ^{2}\) law with d degrees of freedom. Also, from Proposition 6, we know that \(\sigma _{i,t}=\sqrt{\dfrac{1}{(1+n_{i,t-1})E[\tau _{i}]}}=\sqrt{\dfrac{b_{i,t-1}}{a_{i,t-1}(1+n_{i,t-1})}}\), which proves the proposition.
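The resulting deviation bound is straightforward to evaluate. A minimal sketch of the quantity \(S\sigma _{i,t}\sqrt{\Psi ^{-1}(1-\delta _{2})}\) derived above (function name hypothetical):

```python
import numpy as np
from scipy.stats import chi2

def mu_deviation_bound(S, a, b, n, d, delta2):
    """Bound on |beta^T (mu_i - mu_hat_i)| holding with probability >= 1 - delta2.

    Uses sigma_{i,t} = sqrt(b / (a (1 + n))) and the inverse CDF of a
    chi-squared law with d degrees of freedom, as in the proof above.
    """
    sigma = np.sqrt(b / (a * (1.0 + n)))
    return S * sigma * np.sqrt(chi2.ppf(1.0 - delta2, df=d))
```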

1.7 Proof for Theorem 2

From Propositions 8 and 9, Boole's inequality (the union bound) and the fact that \(\mathbb {E}[r_{i,t}]=\beta ^{\top }\mu _{i}\), we get:

$$\begin{aligned}&P\left( |\mathbb {E}[r_{i,t}]-\hat{\beta }_{t-1}^{\top }\hat{\mu }_{i,t-1}|> \alpha _{1} \sqrt{\hat{\mu }_{i,t-1}^{\top }V_{t-1}^{-1}\hat{\mu }_{i,t-1}} + \alpha _{2}\sqrt{\dfrac{b_{i,t-1}}{a_{i,t-1}(1+n_{i,t-1})}} \right) \\&\quad = P\left( |\beta ^{\top }\hat{\mu }_{i,t-1} +\beta ^{\top }(\mu _{i}-\hat{\mu }_{i,t-1})-\hat{\beta }_{t-1}^{\top }\hat{\mu }_{i,t-1}| \right. \\&\qquad \left.>\alpha _{1} \sqrt{\hat{\mu }_{i,t-1}^{\top }V_{t-1}^{-1}\hat{\mu }_{i,t-1}} + \alpha _{2}\sqrt{\dfrac{b_{i,t-1}}{a_{i,t-1}(1+n_{i,t-1})}} \right) \\&\quad \le \mathbb {P}\left( |\beta ^{\top }\hat{\mu }_{i,t-1} - \hat{\beta }_{t-1}^{\top }\hat{\mu }_{i,t-1} |+|\beta ^{\top }(\mu _{i}-\hat{\mu }_{i,t-1}) | \right. \\&\qquad \left.> \alpha _{1} \sqrt{\hat{\mu }_{i,t-1}^{\top }V_{t-1}^{-1}\hat{\mu }_{i,t-1}}+\alpha _{2}\sqrt{\dfrac{b_{i,t-1}}{a_{i,t-1}(1+n_{i,t-1})}}\right) \\&\quad \le \mathbb {P}\left( \left\{ |\beta ^{\top }\hat{\mu }_{i,t-1} - \hat{\beta }_{t-1}^{\top }\hat{\mu }_{i,t-1} |> \alpha _{1} \sqrt{\hat{\mu }_{i,t-1}^{\top }V_{t-1}^{-1}\hat{\mu }_{i,t-1}} \right\} \text { } \right. \\&\qquad \left. \cup \text { } \left\{ |\beta ^{\top }(\mu _{i}-\hat{\mu }_{i,t-1}) | >\alpha _{2}\sqrt{\dfrac{b_{i,t-1}}{a_{i,t-1}(1+n_{i,t-1})}}\right\} \right) \\&\quad \le \delta _{1}+\delta _{2} \end{aligned}$$

By considering the complementary event, one obtains the announced result.
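In practice, Theorem 2 yields the optimistic index that can drive the selection at each step: the estimated reward plus the two confidence terms. A minimal self-contained sketch (hypothetical names; \(\alpha _{1}\) and \(\alpha _{2}\) instantiated with the quantiles suggested by the proofs of Propositions 8 and 9, an assumption on this excerpt):

```python
import numpy as np
from scipy.stats import chi2, norm

def optimistic_score(beta_hat, V, mu_hat, a, b, n, S, delta1, delta2):
    """Upper confidence index on E[r_{i,t}], following Theorem 2.

    With probability at least 1 - delta1 - delta2,
      E[r_{i,t}] <= beta_hat^T mu_hat
                    + alpha_1 * sqrt(mu_hat^T V^{-1} mu_hat)
                    + alpha_2 * sqrt(b / (a * (1 + n))).
    """
    d = mu_hat.shape[0]
    alpha1 = norm.ppf(1.0 - delta1 / 2.0)
    alpha2 = S * np.sqrt(chi2.ppf(1.0 - delta2, df=d))
    mean = beta_hat @ mu_hat
    width1 = alpha1 * np.sqrt(mu_hat @ np.linalg.solve(V, mu_hat))
    width2 = alpha2 * np.sqrt(b / (a * (1.0 + n)))
    return mean + width1 + width2
```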


Cite this article

Lamprier, S., Gisselbrecht, T. & Gallinari, P. Contextual bandits with hidden contexts: a focused data capture from social media streams. Data Min Knowl Disc 33, 1853–1893 (2019). https://doi.org/10.1007/s10618-019-00648-w

