Abstract
The cognitive constraints that humans exhibit in their social interactions have been extensively studied by anthropologists, who have highlighted their regularities across different types of social networks. We postulate that similar regularities can be found in other cognitive processes, such as those involving language production. To provide preliminary evidence for this claim, we analyse a dataset containing tweets of a heterogeneous group of Twitter users (regular users and professional writers). Leveraging a methodology similar to the one used to uncover the well-established social cognitive constraints, we find that a concentric layered structure (which we call the ego network of words, in analogy to the ego network of social relationships) captures very well how individuals organise the words they use. The size of the layers in this structure grows regularly when moving outwards (each layer is approximately 2–3 times the size of the previous one), and the two penultimate external layers consistently account for approximately 60% and 30% of the used words (the outermost layer contains 100% of the words), irrespective of the total number of layers of the user.
Notes
- 1.
- 2.
- 3. Functional words may also depend on the style of an author (which is why they are often used in stylometry). Still, whether their usage requires a significant cognitive effort is arguable, hence in this work we opted for their removal.
References
Aral, S., Van Alstyne, M.: The diversity-bandwidth trade-off. Am. J. Sociol. 117(1), 90–171 (2011)
Arnaboldi, V., Conti, M., La Gala, M., Passarella, A., Pezzoni, F.: Information diffusion in OSNs: the impact of nodes’ sociality. In: Proceedings of the 29th Annual ACM Symposium on Applied Computing, pp. 616–621. ACM (2014)
Boldrini, C., Toprak, M., Conti, M., Passarella, A.: Twitter and the press: an ego-centred analysis. In: Companion Proceedings of the The Web Conference 2018, pp. 1471–1478 (2018)
Broadbent, D.E.: Word-frequency effect and response bias. Psychol. Rev. 74(1), 1 (1967)
Brysbaert, M., Mandera, P., Keuleers, E.: The word frequency effect in word processing: an updated review. Curr. Direct. Psychol. Sci. 27(1), 45–50 (2018)
Brysbaert, M., Stevens, M., Mandera, P., Keuleers, E.: How many words do we know? Practical estimates of vocabulary size dependent on word definition, the degree of language input and the participant’s age. Front. Psychol. 7(Jul), 1116 (2016)
Clauset, A., Shalizi, C.R., Newman, M.E.: Power-law distributions in empirical data. SIAM Rev. 51(4), 661–703 (2009)
Davis, C.A., Varol, O., Ferrara, E., Flammini, A., Menczer, F.: BotOrNot: a system to evaluate social bots. In: Proceedings of the 25th International Conference Companion on World Wide Web, pp. 273–274 (2016)
Diaz, M.T., McCarthy, G.: A comparison of brain activity evoked by single content and function words: an fMRI investigation of implicit word processing. Brain Res. 1282, 38–49 (2009)
Dunbar, R.: The social brain hypothesis. Evol. Anthropol. 9(10), 178–190 (1998)
Dunbar, R.: Theory of Mind and the Evolution of Language. Approaches to the Evolution of Language (1998)
Dunbar, R.I., Arnaboldi, V., Conti, M., Passarella, A.: The structure of online social networks mirrors those in the offline world. Soc. Netw. 43, 39–47 (2015)
Friederici, A.D., Opitz, B., Von Cramon, D.Y.: Segregating semantic and syntactic aspects of processing in the human brain: an fMRI investigation of different word types. Cerebr. Cortex 10(7), 698–705 (2000)
Fukunaga, K., Hostetler, L.: The estimation of the gradient of a density function, with applications in pattern recognition. IEEE Trans. Inf. Theor. 21(1), 32–40 (1975)
Gonçalves, B., Perra, N., Vespignani, A.: Modeling users’ activity on Twitter networks: validation of Dunbar’s number. PLoS ONE 6(8), e22656 (2011)
Haerter, J.O., Jamtveit, B., Mathiesen, J.: Communication dynamics in finite capacity social networks. Phys. Rev. Lett. 109(16), 168701 (2012)
Hill, R.A., Dunbar, R.I.: Social network size in humans. Hum. Nat. 14(1), 53–72 (2003)
Jenks, G.F.: Optimal data classification for choropleth maps. Department of Geography, University of Kansas Occasional Paper (1977)
Levelt, W.J., Roelofs, A., Meyer, A.S.: A theory of lexical access in speech production. Behav. Brain Sci. 22(1), 1–38 (1999)
MacQueen, J., et al.: Some methods for classification and analysis of multivariate observations. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Oakland, CA, USA, vol. 1, pp. 281–297 (1967)
Miritello, G., et al.: Time as a limited resource: communication strategy in mobile phone networks. Soc. Netw. 35(1), 89–95 (2013)
Perfetti, C.A., Wlotko, E.W., Hart, L.A.: Word learning and individual differences in word learning reflected in event-related potentials. J. Exp. Psychol. Learn. Memory Cogn. 31(6), 1281 (2005)
Qu, Q., Zhang, Q., Damian, M.F.: Tracking the time course of lexical access in orthographic production: an event-related potential study of word frequency effects in written picture naming. Brain Lang. 159, 118–126 (2016)
Sutcliffe, A.G., Wang, D., Dunbar, R.I.: Modelling the role of trust in social relationships. ACM Trans. Internet Technol. (TOIT) 15(4), 16 (2015)
Varol, O., Davis, C.A., Menczer, F., Flammini, A.: Feature engineering for social bot detection. In: Feature Engineering for Machine Learning and Data Analytics, pp. 311–334. CRC Press (2018)
Zhou, W.X., Sornette, D., Hill, R.A., Dunbar, R.I.M.: Discrete hierarchical organization of social group sizes. Proc. Biol. Sci. Roy. Soc. 272(1561), 439–444 (2005)
Zipf, G.K.: Human Behavior and the Principle of Least Effort (1949)
Acknowledgements
This work was partially funded by the SoBigData++, HumaneAI-Net, MARVEL, and OK-INSAID projects. The SoBigData++ project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 871042. The HumaneAI-Net project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 952026. The MARVEL project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 957337. The OK-INSAID project has received funding from the Italian PON-MISE program under grant agreement ARS01 00917.
A Appendix
A.1 Identifying Active Twitter Users
To be relevant to our work, a Twitter account must be active, which we define as an account that has not been abandoned by its user and that tweets regularly. We consider an account abandoned, and discard it, if the time since its last tweet is significantly longer than the account's longest previous period of inactivity (we set this threshold at 6 months, as previously done in [3]). We also consider tweeting regularity, measured as the number of months in which the user has been inactive. An account is tagged as sporadic, and discarded, if these inactive months make up more than 50% of the observation period (defined as the time between the user's first tweet in our dataset and the download time). Finally, we discard accounts whose entire timeline is covered by the 3200 tweets that we are able to download, because their Twitter behaviour might not yet have stabilised (tweeting activity is known to need a few months after account creation to stabilise).
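The three filters described in this section can be illustrated with a minimal Python sketch. It is not the code used in this work: the function name classify_account, the label strings, and the parameter total_tweet_count (the account's overall number of tweets) are hypothetical, and two readings of the text are assumed, namely that "6 months" is approximated as 182 days added on top of the longest inactivity gap, and that inactive months are counted as calendar months without any tweet.

```python
from datetime import datetime, timedelta

# Assumption: the 6-month threshold is approximated as 182 days.
SIX_MONTHS = timedelta(days=182)


def month_index(ts):
    """Map a timestamp to a running calendar-month number."""
    return ts.year * 12 + ts.month - 1


def classify_account(tweet_times, download_time, total_tweet_count):
    """Return 'active', 'abandoned', 'sporadic' or 'not_stabilised'.

    tweet_times: datetimes of the downloaded tweets (at most 3200).
    total_tweet_count: hypothetical field with the account's overall
    number of tweets, used to tell whether the download covers the
    entire timeline.
    """
    times = sorted(tweet_times)
    if not times:
        raise ValueError("at least one tweet is required")

    # Entire timeline fits in the download: behaviour may not be stable yet.
    if total_tweet_count <= len(times):
        return "not_stabilised"

    # Abandoned: silence since the last tweet exceeds the longest earlier
    # gap by more than the threshold (one reading of the criterion).
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    longest_gap = max(gaps, default=timedelta(0))
    if download_time - times[-1] > longest_gap + SIX_MONTHS:
        return "abandoned"

    # Sporadic: more than 50% of the observed calendar months have no tweet.
    observed = month_index(download_time) - month_index(times[0]) + 1
    active_months = {month_index(t) for t in times}
    if observed - len(active_months) > 0.5 * observed:
        return "sporadic"

    return "active"
```

Only accounts labelled 'active' would be retained under this sketch. The checks are applied in sequence, so an account that fails several criteria is reported under the first one it fails.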
A.2 Additional Tables
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Ollivier, K., Boldrini, C., Passarella, A., Conti, M. (2020). Structural Invariants in Individuals Language Use: The “Ego Network” of Words. In: Aref, S., et al. Social Informatics. SocInfo 2020. Lecture Notes in Computer Science, vol 12467. Springer, Cham. https://doi.org/10.1007/978-3-030-60975-7_20
DOI: https://doi.org/10.1007/978-3-030-60975-7_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60974-0
Online ISBN: 978-3-030-60975-7
eBook Packages: Computer Science (R0)