Measuring Controversy in Social Networks Through NLP

de Zarate, Juan Manuel Ortiz; Di Giovanni, Marco; Feuerstein, Esteban Zindel; Brambilla, Marco

doi:10.1007/978-3-030-59212-7_14

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12303)

Included in the following conference series:

International Symposium on String Processing and Information Retrieval

Abstract

Controversial topics on social media are often linked to hate speech, the propagation of fake news, and the spread of biased information or misinformation. Detecting controversy in online discussions is a challenging task, but it is essential to stop these unhealthy behaviours.

In this work, we develop a general pipeline to quantify controversy on social media through content analysis, and we test it extensively on Twitter.

Our approach can be outlined in four phases: an initial graph building phase, a community identification phase through graph partitioning, an embedding phase using language models, and a final controversy score computation phase. The result is an index that quantifies the intuitive notion of controversy.
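The four phases can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and not the method of the paper: connected components stand in for graph partitioning, bag-of-words counts stand in for language-model embeddings, and the score (one minus the cosine similarity of the community centroids) is a hypothetical placeholder for the actual controversy index. All function names are invented.

```python
from collections import Counter, defaultdict
import math


def build_graph(interactions):
    """Phase 1: undirected user graph from (user_a, user_b) interaction pairs."""
    graph = defaultdict(set)
    for a, b in interactions:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def two_largest_components(graph):
    """Phase 2 (naive stand-in for graph partitioning): the two largest
    connected components are treated as the two communities."""
    seen, components = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(component)
    if len(components) < 2:
        raise ValueError("need at least two components for this toy partition")
    components.sort(key=len, reverse=True)
    return components[0], components[1]


def embed(text):
    """Phase 3 (stand-in for a language model): bag-of-words counts."""
    return Counter(text.lower().split())


def centroid(vectors):
    """Average a list of sparse count vectors into one sparse vector."""
    total = Counter()
    for vector in vectors:
        total.update(vector)
    return {word: count / len(vectors) for word, count in total.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[w] * v.get(w, 0.0) for w in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def toy_controversy_score(interactions, texts):
    """Phase 4 (hypothetical score): 1 - cosine similarity of the centroids."""
    side_a, side_b = two_largest_components(build_graph(interactions))
    centroid_a = centroid([embed(texts[user]) for user in sorted(side_a)])
    centroid_b = centroid([embed(texts[user]) for user in sorted(side_b)])
    return 1.0 - cosine(centroid_a, centroid_b)
```

In this sketch, two communities that use disjoint vocabularies score 1, and two communities that use the same words in the same proportions score approximately 0.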

To test that our method is general and not domain-, language-, geography- or size-dependent, we collect, clean and analyze 30 Twitter datasets about different topics, half controversial and half not, spanning different domains and magnitudes, in six different languages from all over the world.

The results confirm that our pipeline correctly quantifies the notion of controversy, reaching a ROC AUC score of 0.996 over the score distributions of controversial and non-controversial discussions. It outperforms the state-of-the-art approaches in terms of both accuracy and computational speed.
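As a reminder of what a ROC AUC over two score distributions measures, the following minimal sketch computes it as the probability that a randomly chosen controversial discussion receives a higher score than a randomly chosen non-controversial one, counting ties as one half. The function name and the example scores are hypothetical and are not taken from the paper's results.

```python
def roc_auc(controversial_scores, non_controversial_scores):
    """ROC AUC via pairwise comparison (equivalent to the normalized
    Mann-Whitney U statistic); ties count as half a win."""
    if not controversial_scores or not non_controversial_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for positive in controversial_scores:
        for negative in non_controversial_scores:
            if positive > negative:
                wins += 1.0
            elif positive == negative:
                wins += 0.5
    return wins / (len(controversial_scores) * len(non_controversial_scores))


# Hypothetical scores, invented for illustration only.
example_auc = roc_auc([0.9, 0.3], [0.1, 0.5])
```

A value of 1 means every controversial discussion scores above every non-controversial one, while 0.5 corresponds to scores that do not separate the two groups at all.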

J. M. O. de Zarate and M. Di Giovanni contributed equally.

Notes

1. https://www.reddit.com/
2. Kullback–Leibler divergence is a measure of how a probability distribution is different from a reference probability distribution.
3. Code and datasets used in this work are available here: https://github.com/jmanuoz/Measuring-controversy-in-Social-Networks-through-NLP
4. https://github.com/jmanuoz/Measuring-controversy-in-Social-Networks-through-NLP
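The Kullback–Leibler divergence mentioned in note 2 can be made concrete for discrete distributions, where it is the sum of p_i * log(p_i / q_i) over all outcomes. The sketch below is a generic textbook implementation using natural logarithms; the function name is hypothetical, and nothing here is taken from the paper's code.

```python
import math


def kl_divergence(p, q):
    """D_KL(P || Q) in nats for two discrete distributions given as
    equal-length sequences of probabilities; q is the reference."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of outcomes")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # a term with p_i = 0 contributes nothing by convention
        if q_i == 0.0:
            return math.inf  # P puts mass where the reference Q has none
        total += p_i * math.log(p_i / q_i)
    return total
```

The measure is zero when the two distributions coincide and is not symmetric, so swapping the distribution and its reference generally changes the value.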


Author information

Authors and Affiliations

Universidad de Buenos Aires, C1053, Buenos Aires, Argentina
Juan Manuel Ortiz de Zarate & Esteban Zindel Feuerstein
Politecnico di Milano, Milan, 20133, Italy
Marco Di Giovanni & Marco Brambilla

Corresponding author

Correspondence to Juan Manuel Ortiz de Zarate.

Editor information

Editors and Affiliations

CISE Department, University of Florida, Gainesville, FL, USA
Christina Boucher
Department of Computer Science, University of Central Florida, Orlando, FL, USA
Sharma V. Thankachan

Appendix A Details on the discussions

Table 2. Dataset statistics. The top group represents controversial topics, while the bottom group represents non-controversial ones.

Cite this paper

de Zarate, J.M.O., Di Giovanni, M., Feuerstein, E.Z., Brambilla, M. (2020). Measuring Controversy in Social Networks Through NLP. In: Boucher, C., Thankachan, S.V. (eds) String Processing and Information Retrieval. SPIRE 2020. Lecture Notes in Computer Science, vol 12303. Springer, Cham. https://doi.org/10.1007/978-3-030-59212-7_14

DOI: https://doi.org/10.1007/978-3-030-59212-7_14
Published: 17 September 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59211-0
Online ISBN: 978-3-030-59212-7
eBook Packages: Computer Science, Computer Science (R0)
