Abstract
This paper examines privacy in the context of Online Social Networks (OSNs). In particular, it examines how predictable different types of personal information are from OSN data and compares this with users' perceptions about the disclosure of their information. To this end, a real-life dataset was compiled, consisting of the Facebook data (images, posts and likes) of 170 people along with their replies to a survey that addresses both their personal information and their perceptions of the sensitivity and predictability of different types of information. We evaluate several learning techniques for predicting user attributes from their OSN data. Our analysis shows that users' perceptions regarding the disclosure of specific types of information are often incorrect. For instance, their political beliefs and employment status appear to be more predictable than they tend to believe. Interestingly, information that users characterize as more sensitive also appears to be more easily predictable than they think, and vice versa (i.e. information characterized as relatively less sensitive is less easily predictable than users might have thought).
Acknowledgment
This work is supported by the USEMP FP7 project, partially funded by the EC under contract number 611596.
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Spyromitros-Xioufis, E., Petkos, G., Papadopoulos, S., Heyman, R., Kompatsiaris, Y. (2016). Perceived Versus Actual Predictability of Personal Information in Social Networks. In: Bagnoli, F., et al. Internet Science. INSCI 2016. Lecture Notes in Computer Science, vol 9934. Springer, Cham. https://doi.org/10.1007/978-3-319-45982-0_13
DOI: https://doi.org/10.1007/978-3-319-45982-0_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45981-3
Online ISBN: 978-3-319-45982-0
eBook Packages: Computer Science, Computer Science (R0)