Differentiated security levels for personal identifiable information in identity management system

https://doi.org/10.1016/j.eswa.2011.04.226

Abstract

With the rapid development of Internet services, identity management (IdM) has attracted wide attention as the credit agency between users and service providers. It makes Internet services easier for users to use, encourages service providers to enrich their services, and makes the Internet more secure. Personally identifiable information (PII) is the most important information asset with which an identity provider (IdP) can provide various services. Since PII is sensitive to users, the leakage, illegal selection, and illegal access of PII have become a serious problem. To improve the security of PII, this study develops a novel framework that uses data mining to forecast information asset value and find an appropriate security level for protecting user PII. The framework has two stages. In the first stage, the user’s information asset value is forecast from the PII database by a data mining tool (a decision tree). The security level for the user’s PII is then determined from the information asset value, on the assumption that the higher the information asset value, the higher the security requirement of the PII. In the second stage, the number of illegal accesses and attacks accumulates over time. Combined with the result of the first stage, this count can be used to reconstruct the decision tree and update the knowledge base. Thus the security level of PII can be adjusted in a timely manner and the protection of PII can be guaranteed even when security threats change. Furthermore, an empirical case on a user dataset was studied to demonstrate the protection decisions derived from the framework for various PII. Simulation results show that the framework with data mining can protect PII effectively. Our work can benefit the development of e-business services.

Highlights

► We develop a novel framework for protecting user personally identifiable information (PII).
► The user’s information asset value is forecast from the PII database by a data mining tool (a decision tree).
► The security level for user PII is determined by the information asset value.
► The decision tree and knowledge base are updated using the accumulated illegal accesses and attacks.
► An empirical case on a user dataset was studied to demonstrate the protection decisions derived from the framework for various PII.

Introduction

With the development of information processing and electronic commerce, more and more people use the Internet to communicate with each other and to shop. Identity management (IdM) (Thompson & Thompson, 2007) is the most important platform for various application services on the Internet. It enables all entities to interact with each other successfully on the Internet. IdM is concerned with identities that can identify individuals uniquely within a given environment, and it is designed to store, protect and manage users’ personally identifiable information (PII). Recently, IdM has attracted wide attention as an important means of establishing credit between users and service providers (SPs). It can make Internet services easier for users to use, help service providers reduce investment in services, and make the Internet more secure. IdM is composed of three types of fundamental entities: the user, the identity provider (IdP) and the SP. Among them, the IdP is the core entity that performs IdM functions. The SP is responsible for providing application services. Normally, when a user requests a service from an SP, the SP sends a query to the IdP to ask for the user’s identity resource. The SP provides the service to the user if the user’s identity resource satisfies the SP’s requirement.

PII refers to a natural person’s special attributes that can directly or indirectly identify who they are. It could include the user’s name, social identity (ID), preferences, address and other information. Since PII involves user privacy and sensitive information, the protection of PII is one of the key issues for an IdP. An IdP needs sufficient hardware and software to ensure the security of PII. The leakage, illegal selection, and illegal access of PII have become a serious problem (Mont, 2004). Many international organizations and individuals, such as IBM and Microsoft, have devoted considerable resources to the research and development (R&D) of PII protection. Many frameworks and security technologies have been proposed to protect user PII. For example, the Security Assertion Markup Language (SAML) (Cantor, Kemp, Philpott, & Maler, 2005) and WS-Security (WS-Federation, 2006, WS-Trust Specification, 2007) protocols were developed to provide secure communication between IdPs and SPs. With digital signature and encryption technologies (Acquisti, 2008), mechanisms are defined to create assertions as security tokens that can be used to protect user identity information. However, not all applications need the same strength of security, since stronger security consumes additional IT resources. In cloud computing scenarios in particular, differentiated security levels for user PII are important for the platform to provide cost-effective service to all users. However, there are no effective methods for applying differentiated security to the protection of PII. This study aims to develop a data mining framework for IdM systems that automatically predicts and sets security levels for PII using useful patterns and rules explored from PII data. When a user registers with an IdP that applies the proposed framework, the user’s PII can be classified and protected according to its security level. A case study on a PII dataset is presented to demonstrate the validity of this approach. Simulation results show that the framework with data mining can protect PII effectively and performs better than an IdM system without data mining.

The rest of the paper is organized as follows. Section 2 introduces differentiated security for PII. Section 3 proposes the framework using a data mining tool and discusses the corresponding process and components in detail. Data preparation is described in Section 4. The construction of the decision tree is presented in Section 5. Section 6 analyzes the performance of the framework using data mining and compares it with an IdM system without data mining. Section 7 describes the protection of user PII according to differentiated security policies. In Section 8, considering the characteristics of attackers, the decision tree for differentiated security is reconstructed and the security level of PII is set based on the number of attacks. Section 9 concludes the paper.

Section snippets

Differentiated security for protecting personal PII

Security requirements for PII that is stored and transmitted over networks vary dramatically among users. In some networking applications, such as e-commerce, secure communication is important. But other applications, such as accessing open information on the Internet, require only a low level of security. Compared with unsecured information, secure information means additional investment from the SP’s point of view. Moreover, there is no absolute security of information. The

Main factors on security level

The security level of user PII is determined by both the information asset value and the attack frequency on the PII. Since it is impossible to measure attack frequency immediately after user registration, it is important to evaluate the user’s information asset at the initial stage. Over time, measurements of attack frequency on the PII accumulate and are used to optimize the security level dynamically. The security level is continuously updated from the combination of the initial security

Data preparation

Generally, the security level for protecting PII is determined by user attributes that relate to user privacy, such as user address and user contact. From the security perspective, the security requirements of attributes differ according to user preferences and privacy requests. The security level for protecting PII needs to be determined by the most important attribute. The “adult” dataset from UCI (http://www.ics.uci.edu/~mlearn), which includes 32561 instances, is selected as user register information

Overview

A decision tree is constructed from the user dataset and tested on test data. User data comprises various attributes. In this experiment, the datasets are appropriate for showing the effectiveness of the framework proposed in this paper. During construction of the decision tree, 10-fold cross-validation is used as the test mode. The result shows that since user information asset correlates with multiple attributes, the proposed framework with data mining can get more
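
The 10-fold cross-validation procedure mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the experiment from the paper: a one-level decision stump stands in for the full decision tree, and the function names, the attribute name "education" and the toy records are hypothetical.

```python
from collections import Counter, defaultdict


def k_fold_indices(n, k):
    """Split the indices 0..n-1 into k folds whose sizes differ by at most one."""
    return [list(range(i, n, k)) for i in range(k)]


def train_stump(rows, labels, attribute):
    """Train a one-level tree: predict the majority label for each attribute value.

    Assumption: this stump is a simplified stand-in for a full decision tree.
    """
    counts_by_value = defaultdict(Counter)
    for row, label in zip(rows, labels):
        counts_by_value[row[attribute]][label] += 1
    # Fall back to the overall majority label for values unseen in training.
    fallback = Counter(labels).most_common(1)[0][0]
    table = {value: counts.most_common(1)[0][0]
             for value, counts in counts_by_value.items()}
    return lambda row: table.get(row[attribute], fallback)


def cross_validate(rows, labels, attribute, k=10):
    """Return the accuracy over all held-out predictions of k-fold cross-validation."""
    correct = 0
    for test_idx in k_fold_indices(len(rows), k):
        held_out = set(test_idx)
        train_rows = [r for i, r in enumerate(rows) if i not in held_out]
        train_labels = [l for i, l in enumerate(labels) if i not in held_out]
        predict = train_stump(train_rows, train_labels, attribute)
        correct += sum(predict(rows[i]) == labels[i] for i in test_idx)
    return correct / len(rows)


# Hypothetical toy records: the security level is fully determined by "education".
toy_rows = [{"education": "high"}] * 10 + [{"education": "low"}] * 10
toy_labels = [2] * 10 + [1] * 10
toy_accuracy = cross_validate(toy_rows, toy_labels, "education", k=10)
```

Each record is held out exactly once, so the reported accuracy is computed only on predictions for data the model did not see during training.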

Performance evaluation

In the evaluation phase, the receiver operating characteristic (ROC) curve (Tom, 2006) is used to visualize and analyze the classifier’s performance. An ROC graph is a technique for visualizing, organizing and selecting classifiers based on their performance. In an ROC graph, if the ROC curve is the straight line y = x, the corresponding classifier performs no better than random guessing and cannot be used for any classification problem.
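
To make the ROC description concrete, the following sketch computes ROC points and the area under the curve (AUC) from classifier scores. It is an illustration only: the function names and the sample scores are hypothetical and do not come from the paper’s experiment.

```python
def roc_points(labels, scores):
    """Return ROC points as (false positive rate, true positive rate) pairs.

    labels: 1 for the positive class, 0 for the negative class.
    scores: higher score means the classifier is more confident of class 1.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        # Instances with tied scores cross the threshold together.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / negatives, tp / positives))
        i = j
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical scores from a classifier that ranks most positives first.
informative = auc(roc_points([1, 1, 0, 1, 0, 0], [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]))
# A classifier that gives every instance the same score lies on the line y = x.
uninformative = auc(roc_points([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5]))
```

A curve on the diagonal y = x yields an AUC of 0.5, while a curve that bends toward the top-left corner yields an AUC closer to 1.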

Using the ROC curve plot function of WEKA, the evaluation result of IdM system with and without

Application

Results from the decision tree algorithm are stored in the knowledge base. The IdP can get the security level of an individual user’s PII from the knowledge base. When a user fills in the necessary identity information and registers with the IdP, the user’s security level is looked up in the knowledge base. In this paper, security levels are divided into two levels, 1 and 2; however, the number of security levels can differ across applications. Fig. 9 describes a part of the results in the knowledge base.
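
The lookup at registration time might be sketched as below. The rules shown are hypothetical placeholders and are not the rules of Fig. 9; the attribute names, the default level, and the reading of level 2 as the stronger protection are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Profile = Dict[str, object]


@dataclass
class Rule:
    description: str
    condition: Callable[[Profile], bool]
    security_level: int


# Hypothetical rules standing in for those exported from the decision tree.
KNOWLEDGE_BASE: List[Rule] = [
    Rule("capital_gain > 7000",
         lambda p: p["capital_gain"] > 7000, 2),
    Rule("married and education_years >= 13",
         lambda p: p["marital_status"] == "married" and p["education_years"] >= 13, 2),
]

# Assumption: users matching no rule receive level 1.
DEFAULT_LEVEL = 1


def lookup_security_level(profile, rules=KNOWLEDGE_BASE, default=DEFAULT_LEVEL):
    """Return the level of the first matching rule, or the default level."""
    for rule in rules:
        if rule.condition(profile):
            return rule.security_level
    return default


# Hypothetical registration records.
new_user = {"capital_gain": 10000, "marital_status": "single", "education_years": 10}
other_user = {"capital_gain": 0, "marital_status": "single", "education_years": 9}
level_new = lookup_security_level(new_user)
level_other = lookup_security_level(other_user)
```

Keeping the rules as data rather than hard-coded branches means the knowledge base can be replaced when the decision tree is rebuilt, without changing the lookup code.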

As seen from the rules, it is easy

Reconstruct decision tree

In the first stage, the IdP has no intrusion detection records about users, so the user’s information asset is used as the class label for the decision tree. Over time, the number of illegal accesses and attacks on PII is accumulated by the intrusion detector. In the second stage, the number of attacks is used to reconstruct the decision tree and update the knowledge base. After the decision tree is reconstructed, the attack model of user PII is mined easily by using both the number of attacks and the information asset value
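
One way the second-stage class labels could be derived before the tree is rebuilt is sketched below. The combination rule is an assumption, since the text here does not specify it: a user is raised to level 2 (assumed to be the stronger level) once the accumulated attack count reaches a hypothetical threshold, and otherwise keeps the first-stage level. All names are illustrative.

```python
def updated_level(initial_level, attack_count, attack_threshold=3):
    """Combine the first-stage level with the accumulated attack count.

    Assumption: level 2 is the stronger level and the threshold is arbitrary.
    """
    return 2 if attack_count >= attack_threshold else initial_level


def relabel_for_reconstruction(initial_levels, attack_counts, attack_threshold=3):
    """Return the class labels to use when the decision tree is rebuilt."""
    return {
        user: updated_level(level, attack_counts.get(user, 0), attack_threshold)
        for user, level in initial_levels.items()
    }


# Hypothetical first-stage levels and intrusion detector counts.
first_stage = {"alice": 1, "bob": 1, "carol": 2}
observed_attacks = {"alice": 5, "bob": 1}
second_stage = relabel_for_reconstruction(first_stage, observed_attacks)
```

The relabelled records would then serve as the training set for the reconstructed tree, whose rules replace the earlier entries in the knowledge base.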

Conclusions

This study applies a data mining framework with a knowledge base to an IdM system to obtain appropriate security levels for user PII. Using the framework, the IdM system can classify PII into appropriate security levels accurately. According to the information asset value and the assigned security level, the IdM system can use suitable security mechanisms, such as encryption, to control access to PII and avoid its leakage. At the same time, as time passes, the rules of the decision tree can be reconstructed

Acknowledgements

The work was supported by China-Finland Cooperation Project on the Development and Demonstration of Intelligent Design Platform Driven by Living Lab Methodology (2010DFA12780).

References (27)

  • Y.L. Chen et al. Mining fuzzy association rules from questionnaire data. Knowledge-based Systems (2009)
  • E.W.T. Ngai et al. Application of data mining techniques in customer relationship management: A literature review and classification. Expert Systems with Applications (2009)
  • Acquisti, A., (2008). Identity Management, Privacy, and Price Discrimination. IEEE Security & Privacy, Vol. 6 (2), pp....
  • W. Boehm Barry. Software risk management: Principles and practices. IEEE Software (1991)
  • Cantor, S., Kemp, J., Philpott, R., & Maler, E. (2005). Security Assertion Markup Language v2.0. OASIS Security...
  • Chen, J., Wang, X., & He, L. (2008). An Architecture for Differentiated Security Service. In international symposium on...
  • H. Chih-Hung. Data mining to improve industrial standards and enhance production and marketing: An empirical study in apparel industry. Expert Systems with Applications (2009)
  • C. Chin-Jui et al. A Study on the application of data mining to disadvantaged social classes in Taiwan’s population census. Expert Systems with Applications (2009)
  • R. Cunha et al. Knowledge reuse in data mining projects and its practical applications. Enterprise Information Systems (2009)
  • http://www.ics.uci.edu/m˜...
  • J. Huang et al. Using AUC and accuracy in evaluating learning algorithms. IEEE Transactions on Knowledge and Data Engineering (2005)
  • O.L. Mangasarian et al. Nonlinear knowledge-based classification. IEEE Transactions on Neural Networks (2008)
  • H. Marit et al. Privacy and identity management. IEEE Security & Privacy (2008)