Abstract
Social networks such as Twitter or Facebook are part of the phenomenon called Big Data, a term used to describe very large and complex data sets. The connections between users of these networks are naturally represented as (directed) graphs. In this paper, we focus on two aspects of social network analysis. First, our goal is to find an efficient and high-level way to store and process a social network graph using reasonable computing resources (processor and memory). We believe that this is an important research direction, since it provides a more democratic method for dealing with large graphs. Next, we turn our attention to the study of social capitalists, a specific kind of user on Twitter. Roughly speaking, such users try to gain visibility by following other users regardless of their content. Using two similarity measures, called the overlap index and the ratio, we show that such users can be detected and classified very efficiently.
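The abstract names the two measures but does not define them. The following is a minimal sketch, assuming that the overlap index is the standard overlap coefficient |A ∩ B| / min(|A|, |B|) applied to a user's follower and followee sets, and that the ratio compares the sizes of those two sets. The function names, the direction of the ratio (followers over followees) and the sample data are illustrative assumptions, not taken from the paper.

```python
def overlap_index(a: set, b: set) -> float:
    """Overlap coefficient of two sets: |a & b| / min(|a|, |b|).

    Returns 0.0 when either set is empty (a convention chosen here).
    """
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def follow_ratio(followers: set, followees: set) -> float:
    """Number of followers divided by number of followees.

    The direction of this ratio is an assumption made for this sketch.
    """
    if not followees:
        return float("inf") if followers else 0.0
    return len(followers) / len(followees)


# Hypothetical user: most of the accounts they follow also follow them back.
followers = {"ann", "bob", "cat", "dan"}
followees = {"ann", "bob", "cat", "eve", "fay"}

overlap = overlap_index(followers, followees)  # 3 shared / min(4, 5)
ratio = follow_ratio(followers, followees)     # 4 followers / 5 followees
print(overlap, ratio)
```

Under these assumptions, an overlap index close to 1 means that almost every link is reciprocated, which is the kind of follow-back behaviour the abstract attributes to social capitalists. Any threshold used to flag a user would have to come from the paper itself.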
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Dugué, N., Perez, A. (2013). Detecting Social Capitalists on Twitter Using Similarity Measures. In: Ghoshal, G., Poncela-Casasnovas, J., Tolksdorf, R. (eds) Complex Networks IV. Studies in Computational Intelligence, vol 476. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36844-8_1
DOI: https://doi.org/10.1007/978-3-642-36844-8_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36843-1
Online ISBN: 978-3-642-36844-8
eBook Packages: Engineering, Engineering (R0)