Data Mining and Privacy of Social Network Sites’ Users: Implications of the Data Mining Problem

  • Original Paper
  • Science and Engineering Ethics

Abstract

This paper explores the potential of data mining as a technique that malicious data miners could use to threaten the privacy of social network site (SNS) users. It applies a data mining algorithm to a real dataset to provide empirical evidence of the ease with which characteristics of SNS users can be discovered and used in ways that could invade their privacy. One major contribution of this article is the application of the decision forest data mining algorithm SysFor to the context of SNS; rather than a single decision tree, SysFor builds a forest of trees, allowing more logic rules to be explored from a dataset. One logic rule that SysFor built in this study, for example, revealed that anyone having a profile picture showing just the face, or a picture showing a family, is less likely to be lonely. Another contribution of this article is the discussion of the implications of the data mining problem for governments, businesses, developers and the SNS users themselves.
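
To make the notion of a logic rule concrete, the following is a minimal sketch of how a rule of the kind mentioned above could be checked against a set of records. It is not SysFor and does not use the study's dataset: the records, the attribute names (`profile_picture`, `lonely`) and the helper `evaluate_rule` are all hypothetical and chosen only for illustration.

```python
# Hypothetical toy records; NOT data from the study.
RECORDS = [
    {"profile_picture": "face", "lonely": "no"},
    {"profile_picture": "family", "lonely": "no"},
    {"profile_picture": "face", "lonely": "no"},
    {"profile_picture": "object", "lonely": "yes"},
    {"profile_picture": "family", "lonely": "yes"},
    {"profile_picture": "object", "lonely": "yes"},
    {"profile_picture": "none", "lonely": "no"},
    {"profile_picture": "face", "lonely": "no"},
]


def evaluate_rule(records, antecedent, target, predicted):
    """Return (support, confidence) for an IF-THEN rule.

    antecedent maps an attribute name to the set of values it may take.
    support is the number of records matching the antecedent;
    confidence is the fraction of those records whose target attribute
    equals the predicted value.
    """
    covered = [
        r for r in records
        if all(r.get(attr) in allowed for attr, allowed in antecedent.items())
    ]
    if not covered:
        return 0, 0.0
    correct = sum(1 for r in covered if r[target] == predicted)
    return len(covered), correct / len(covered)


# IF profile_picture is "face" or "family" THEN lonely = "no"
rule = {"profile_picture": {"face", "family"}}
support, confidence = evaluate_rule(RECORDS, rule, "lonely", "no")
print(
    "IF profile_picture in {face, family} THEN lonely = no "
    f"(support={support}, confidence={confidence:.2f})"
)
```

A decision forest yields many such rules, one per leaf of each tree, which is why a forest can surface more rules from the same dataset than a single tree. The sketch only shows how one rule relates to the records it covers; it says nothing about how SysFor selects attributes or builds its trees.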

Notes

  1. For the purpose of this paper, a social network site is a web-based service that allows individuals to “(1) construct a public or semi-public profile within a bounded system, (2) articulate a list of other users with whom they share a connection, and (3) view and traverse their list of connections and those made by others within the system” (Boyd and Ellison 2007: 211).

  2. This is how logic rules are textually represented in data mining literature.

  3. Australian Government, Office of the Australian Information Commissioner. 2012. Business and Me. Retrieved March 9, 2012 from http://www.privacy.gov.au/individuals/business.

  4. Personal communication with the Victorian Privacy Commissioner, Ms Helen Versey, on 13 February 2012 during the sixth Australian Institute of Computer Ethics conference in Melbourne. The Victorian Privacy Commissioner gave the keynote address at this conference.

References

  • Alim, S., Abdulrahman, R., Neagu, D., & Ridley, M. (2011). Online social network profile data extraction for vulnerability analysis. International Journal of Internet Technology and Secured Transactions, 3, 194–209.

  • Al-Saggaf, Y. (2011). Saudi females on Facebook: An ethnographic study. International Journal of Emerging Technologies and Society, 9(1), 1–19.

  • Al-Saggaf, Y. (2012). The mining of data retrieved from the eHealth record system should be governed. Information Age, 2012, 46–47.

  • Al-Saggaf, Y., & Islam, Z. (2012). Privacy in social network sites (SNS): The threats from data mining. Ethical Space: The International Journal of Communication Ethics, 9(4), 32–40.

  • Al-Saggaf, Y., & Nielsen, S. (2014). Self-disclosure on Facebook among female users and its relationship to feelings of loneliness. Computers in Human Behavior, 36(2014), 460–468. http://dx.doi.org/10.1016/j.chb.2014.04.014.

  • BBC News. (2007). Facebook opens profiles to public. Retrieved 12 January, 2012, from http://news.bbc.co.uk/go/pr/fr/-/2/hi/technology/6980454.stm.

  • Birrer, F. A. J. (2005). Data mining to combat terrorism and the roots of privacy concerns. Ethics and Information Technology, 7, 211–220.

  • Bonneau, J., Anderson, J., & Danezis, G. (2009). Prying data out of a social network. International Conference on Advances in Social Network Analysis and Mining, 20–22, 249–254.

  • Boyd, D. M., & Ellison, N. B. (2007). Social network sites: Definition, history, and scholarship. Journal of Computer-Mediated Communication, 13, 210–230.

  • Brankovic, L., Islam, M. Z., & Giggins, H. (2007). Privacy-preserving data mining. In M. Petkovic, & W. Jonker (Eds.), Security, privacy and trust in modern data management. Springer, ISBN: 978-3-540-69860-9, Chapter 11, pp. 151–166.

  • Catanese, S. A., De Meo, P., Ferrara, E., Fiumara, G., & Provetti, A. (2011). Crawling Facebook for social network analysis purposes. In Proceedings of the international conference on web intelligence, mining and semantics, May 25–27.

  • Caudill, E. M., & Murphy, P. E. (2000). Consumer online privacy: Legal and ethical issues. Journal of Public Policy and Marketing, 19(1), 7–19.

  • Clifton, C., Kantarcioglu, M., Vaidya, J., & Zhu, M. Y. (2002). Tools for privacy preserving distributed data mining. ACM SIGKDD Explorations Newsletter, 4(2), 28–34.

  • Debatin, B., Lovejoy, J. P., Horn, A., & Hughes, B. N. (2009). Facebook and online privacy: Attitudes, behaviors, and unintended consequences. Journal of Computer-Mediated Communication, 15, 83–108.

  • Edwards, L., & Brown, I. (2009). Data control and social networking: Irreconcilable ideas? In A. Matwyshyn (Ed.), Harboring data: Information security, law and the corporation. Stanford: Stanford University Press.

  • Facebook. (2012). One Billion People on Facebook. http://newsroom.fb.com/News/One-Billion-People-on-Facebook-1c9.aspx. Accessed on October 13, 2012.

  • Felt, A., & Evans, D. (2008). Privacy protection for social networking platforms. Workshop on Web 2.0 Security and Privacy, May 22, pp. 1–8.

  • Gross, R., & Acquisti, A. (2005). Information revelation and privacy in online social networks. In Proceedings of the 2005 ACM workshop on privacy in the electronic society, pp. 71–80.

  • Harfoush, R. (2011). Has Facebook gone too far? The Mark News Online, November 11. Retrieved January 12, 2012, from http://ca.news.yahoo.com/know-facebook-050204096.html.

  • Hildebrandt, M. (2009). Who is profiling who? Invisible visibility. In S. Gutwirth, Y. Poullet, P. de Hert, C. de Terwangne, & S. Nouwt (Eds.), Reinventing data protection? (pp. 239–252). Berlin: Springer.

  • Islam, M. Z. (2008). Privacy preservation in data mining through noise addition. PhD thesis in Computer Science, School of Electrical Engineering and Computer Science, The University of Newcastle, Australia.

  • Islam, M. Z. (2012). EXPLORE: A novel decision tree classification algorithm. In L. M. MacKinnon (Ed.), Data security and security data. Berlin/Heidelberg: Springer. LNCS Vol. 6121, ISBN 978-3-642-25703-2, pp. 55–71.

  • Islam, M. Z., & Brankovic, L. (2011). Privacy preserving data mining: A noise addition framework using a novel clustering technique. Knowledge-Based Systems, 24(8), ISSN 0950-7051, (December 2011), 1214–1223.

  • Islam, M. Z., & Giggins, H. (2011). Knowledge discovery through SysFor: A systematically developed forest of multiple decision trees. In Proceedings of the ninth australasian data mining conference (AusDM 11), Ballarat, Australia. December 01–December 02, 2011. CRPIT, 121. P. Vamplew, A. Stranieri, K.-L. Ong, P. Christen, & P. J. Kennedy (Eds.), ACS, pp. 205–210.

  • Jagatic, T. N., Johnson, N. A., Jakobsson, M., & Menczer, F. (2007). Social phishing. Communications of the ACM, 50, 94–100.

  • Johnson, B. (2009). Danah boyd: ‘People looked at me like I was an alien’. guardian.co.uk, at http://www.guardian.co.uk/technology/2010/jan/11/facebook-privacy. Accessed May 30, 2012.

  • Johnson, B. (2010). Privacy no longer a social norm, says Facebook founder. guardian.co.uk, at http://www.guardian.co.uk/technology/2010/jan/11/facebook-privacy. Accessed May 14, 2012.

  • Khan, M. A., Islam, M. Z., & Hafeez, M. (2011). Irrigation water demand forecasting—A data pre-processing and data mining approach based on spatiotemporal data. In Proceedings of the ninth Australasian data mining conference (AusDM 11), Ballarat, Australia. December 01–December 02, 2011, CRPIT, 121. Vamplew, P., Stranieri, A., Ong, K.-L., Christen, P. and Kennedy, P. J. Eds., ACS, pp. 183–194.

  • Kirkpatrick, M. (2010). Facebook’s Zuckerberg says the age of privacy is over. ReadWriteWeb, at http://www.readwriteweb.com/archives/facebooks_zuckerberg_says_the_age_of_privacy_is_ov.php. Accessed May 14, 2012.

  • Kosala, R., & Blockeel, H. (2000). Web mining research: A survey. SIGKDD Explorations, 2, 1–15.

  • Krill, P. (2011). Big Data mining: Who owns your social network data? InfoWorld.com, March 9. Retrieved 19 December 2011 from http://www.infoworld.com/d/business-intelligence/big-data-mining-who-owns-your-social-network-data-746.

  • Laurent, W. (2011). The realities of social media data mining. Dashboard Insight, March 14. Retrieved 19 December 2011 from http://www.dashboardinsight.com/articles/new-concepts-in-business-intelligence/the-realities-of-social-media-data-mining.aspx.

  • Manjoo, F. (2007). Facebook finally lets users turn off privacy-invading ads. Salon.com, December 7. Retrieved 18 December 2011 from http://www.salon.com/2007/12/06/facebook_beacon_2/.

  • Moor, J. (1990). The ethics of privacy protection. Library Trends, 39, 69–82.

  • Moor, J. (1997). Towards a theory of privacy in the information age. Computers and Society, 27, 27–32.

  • Nakashima, E. (2007). Feeling betrayed, Facebook users force site to honor their privacy. The Washington Post, November 30. Retrieved 18 December 2011 from http://www.washingtonpost.com/wp-dyn/content/article/2007/11/29/AR2007112902503.html.

  • Nissenbaum, H. (1997). Toward an approach to privacy in public: Challenges of information technology. Ethics and Behavior, 7, 207–220.

  • Nissenbaum, H. (1998). Protecting privacy in an information age: The problem of privacy in public. Law and Philosophy, 17, 559–596.

  • Nissenbaum, H. (2004). Privacy as contextual integrity. Washington Law Review, 79, 119–158.

  • Nissenbaum, H. (2010). Privacy in context. Stanford, CA: Stanford University Press.

  • Nosko, A., Wood, E., & Molema, S. (2010). All about me: Disclosure in online social networking profiles: The case of Facebook. Computers in Human Behavior, 26(2010), 406–418.

  • Oboler, A., Welsh, K., & Cruz, L. (2012). The danger of big data: Social media as computational social science. First Monday, Volume 17, Number 7. Retrieved 24 December 2012 from http://www.firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/3993/3269.

  • PPIP Act. (1998). NSW privacy and personal information act. http://www.legislation.nsw.gov.au/maintop/view/inforce/act+133+1998+cd+0+N. Accessed 9 June 2014.

  • Quinlan, J. R. (1993). C4.5: Programs for machine learning. San Mateo, CA: Morgan Kaufmann Publishers.

  • Rachels, J. (1975). Why privacy is important. Philosophy & Public Affairs, 4, 323–333.

  • Rahman, M. A., & Islam, M. Z. (2011). Seed-detective: A novel clustering technique using high quality seed for K-means on categorical and numerical attributes. In Proceedings of the ninth Australasian data mining conference (AusDM 11), Ballarat, Australia. December 01–December 02, 2011, CRPIT, 121. Vamplew, P., Stranieri, A., Ong, K.-L., Christen, P. & Kennedy, P. J. Eds., ACS, pp. 211–220.

  • Rahman, M. G., Islam, M. Z., Bossomaier, T., & Gao, J. (2011). CAIRAD: A novel technique for incorrect records and attribute-values detection. In Proceedings of IEEE international joint conference on neural networks (IJCNN 12), Brisbane, Australia. June 10–June 15, 2012, pp. 1–10.

  • Rubinstein, I. S., Lee, R. D., & Schwartz, P. M. (2008). Data mining and internet profiling: Emerging regulatory and technological approaches. University of Chicago Law Review, 75, 261–286.

  • Sar, R. K., & Al-Saggaf, Y. (2014). Contextual Integrity’s decision heuristic and social network sites tracking. Ethics and Information Technology, 16(1), 15–26.

  • Sar, R. K., Al-Saggaf, Y., & Zia, T. (2012). You are what you type: Privacy in online social networks. In S. Leitch & M. Warren (Eds.) Proceedings of the Sixth AICE conference, Melbourne, Australia, 13 February 2012 (pp. 13–18). Deakin: School of Information Systems, Deakin University.

  • Tavani, H. T. (1999). Informational privacy, data mining, and the Internet. Ethics and Information Technology, 1, 137–145.

  • Tavani, H. T. (2011). Ethics and technology: Controversies, questions, and strategies for ethical computing (3rd ed.). Hoboken, NJ: John Wiley.

  • Thelwall, M., Wilkinson, D., & Uppal, S. (2010). Data mining emotion in social network communication: Gender differences in MySpace. Journal of the American Society for Information Science and Technology, 61, 190–199.

  • Ting, I. (2008). Web mining techniques for on-line social networks analysis. In International conference on service systems and service Management, June 30 2008–July 2, pp. 1–5.

  • Vaidya, J., & Clifton, C. (2004). Privacy-preserving outlier detection. In Proceedings of the 4th IEEE international conference on data mining (ICDM 2004), pp. 233–240.

  • Van den Hoven, J. (2008). Information technology, privacy, and the protection of personal data. In J. van den Hoven & J. Weckert (Eds.), Information technology and moral philosophy (pp. 301–321). Cambridge: Cambridge University Press.

  • Van Wel, L., & Royakkers, L. (2004). Ethical issues in web data mining. Ethics and Information Technology, 6, 129–140.

  • Young, A. L., & Quan-Haase, A. (2009). Information revelation and internet privacy concerns on social network sites: A case study of Facebook. Proceedings of the fourth international conference on Communities and technologies, 2009, 265–274.

Correspondence to Yeslam Al-Saggaf.

Cite this article

Al-Saggaf, Y., Islam, M.Z. Data Mining and Privacy of Social Network Sites’ Users: Implications of the Data Mining Problem. Sci Eng Ethics 21, 941–966 (2015). https://doi.org/10.1007/s11948-014-9564-6

  • DOI: https://doi.org/10.1007/s11948-014-9564-6
