Research on Online Digital Cultures - Community Extraction and Analysis by Markov and k-Means Clustering

  • Conference paper
Personal Analytics and Privacy. An Individual and Collective Perspective (PAP 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10708)

Abstract

We investigate approaches to personal data analytics that involve the participation of all actors in our shared digital culture. We analyse their communities by identifying and clustering social relations using mobile and social media data. The work is part of our effort to develop tools to create a “social data commons”, an open research environment that will share innovative tools and data sets with researchers interested in accessing the data that surrounds the production and circulation of digital culture and its actors. This experiment focuses on the groups of clustered relations that are formed within a user’s social data traces. Community extraction is a popular part of the analysis of social data. We have applied the technique of Markov Clustering to the Twitter networks of social actors. Qualitatively, we demonstrate that it is more effective than the Louvain method for finding social groups known to the subjects, while still being very simple to implement. We also demonstrate that traces of cell towers captured using our “MobileMiner” mobile application are sufficient to capture significant details about the subjects’ social relations by the simple application of k-means.
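
To illustrate why Markov Clustering can be called simple to implement, here is a minimal, dependency-free sketch of its usual expansion and inflation loop. It is not the authors' code: the function names, the inflation value of 2.0, the added self-loops, the convergence tolerance, and the toy graph are all assumptions chosen for illustration.

```python
def normalize_columns(m):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def expand(m):
    """Expansion step: square the matrix (two steps of the random walk)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, power):
    """Inflation step: raise entries to a power, then renormalize columns."""
    return normalize_columns([[x ** power for x in row] for row in m])


def markov_cluster(adjacency, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster an undirected graph given as a square 0/1 adjacency matrix.

    The parameter defaults are illustrative assumptions, not values
    taken from the paper.
    """
    n = len(adjacency)
    # Add self-loops, a common convention that stabilizes the iteration.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    m = normalize_columns(m)
    for _ in range(max_iter):
        new = inflate(expand(m), inflation)
        delta = max(abs(new[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = new
        if delta < tol:
            break
    # Each non-empty row belongs to an "attractor" node; the columns with
    # mass in that row are the members of its cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Toy graph (hypothetical): two triangles joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
example = [[0] * 6 for _ in range(6)]
for a, b in edges:
    example[a][b] = example[b][a] = 1

clusters = markov_cluster(example)
print(clusters)
```

On this toy graph the sketch should separate the two triangles, returning nodes 0 to 2 and nodes 3 to 5 as distinct clusters. A larger inflation value generally yields more, smaller clusters, so in practice it would need tuning against the network at hand.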

Author information

Corresponding author

Correspondence to Giles Greenway.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Greenway, G., Blanke, T., Cote, M., Pybus, J. (2017). Research on Online Digital Cultures - Community Extraction and Analysis by Markov and k-Means Clustering. In: Guidotti, R., Monreale, A., Pedreschi, D., Abiteboul, S. (eds) Personal Analytics and Privacy. An Individual and Collective Perspective. PAP 2017. Lecture Notes in Computer Science, vol 10708. Springer, Cham. https://doi.org/10.1007/978-3-319-71970-2_10

  • DOI: https://doi.org/10.1007/978-3-319-71970-2_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-71969-6

  • Online ISBN: 978-3-319-71970-2

  • eBook Packages: Computer Science, Computer Science (R0)