ABSTRACT
Nowadays, Online Social Networks (OSN) are commonly used by groups of users to communicate: members of a family, colleagues, fans of a brand, political groups. There is an increasing demand for a precise identification of these groups, coming from brand monitoring, business intelligence and e-reputation management.
However, a gap can be observed between the communities detected by many data analytics algorithms on OSN and the actual groups existing in real life: the detected communities often lack meaning and internal semantic cohesion. Most of the existing literature on OSN either focuses on the community detection problem in graphs without considering the topic of the messages exchanged, or concentrates exclusively on the messages without taking into account the social links.
In this article, we support the hypothesis that communities extracted from OSN should be topically coherent. We therefore propose a model to represent the interactions between users on Twitter, the reference micro-blogging OSN, and metrics to evaluate the topical cohesion of the detected communities. As an evaluation, we measure the topical cohesion of the groups of users detected by a baseline community detection algorithm, using two measures inspired by the classification domain and one measure inspired by the NLP domain.
A detailed analysis is performed on a large tweet dataset, from which a user graph is built. The introduced measures are compared with statistics to give a clearer picture of the experiment, and they yield interesting insights into a social and textual corpus.
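To make the idea of measuring topical cohesion concrete, here is a minimal sketch in Python. The abstract does not name its measures, so the choice of purity (a measure borrowed from clustering and classification evaluation) is an assumption made purely for illustration, as are the function name `community_purity`, the toy communities, and the per-user topic labels.

```python
from collections import Counter


def community_purity(communities, user_topic):
    """Return, for each community, the share of labelled members whose
    topic equals the community's most frequent topic.

    communities: dict mapping a community id to a list of user ids
    user_topic:  dict mapping a user id to a single topic label
    Both structures are hypothetical; they are not taken from the paper.
    """
    purity = {}
    for name, members in communities.items():
        labels = [user_topic[u] for u in members if u in user_topic]
        if not labels:
            continue  # no labelled member: purity is undefined, skip it
        _, top_count = Counter(labels).most_common(1)[0]
        purity[name] = top_count / len(labels)
    return purity


# Hypothetical toy data: two detected communities and one topic per user.
communities = {"c1": ["ann", "bob", "cy"], "c2": ["dee", "eli"]}
user_topic = {
    "ann": "politics",
    "bob": "politics",
    "cy": "sport",
    "dee": "music",
    "eli": "music",
}

scores = community_purity(communities, user_topic)
for name, value in sorted(scores.items()):
    print(f"{name}: purity = {value:.2f}")
```

Under this assumption, a purity close to 1 suggests that a detected community is topically coherent, while a low value suggests that the graph structure alone grouped users who discuss unrelated topics.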