Abstract
Balancing data publication against privacy protection has been an important issue in information security for several years. When microdata is released to users, attributes that clearly identify individuals are usually removed. Nevertheless, released data can still be linked with public or easily accessible databases to obtain confidential information. To safeguard privacy, numerous techniques, such as generalization, suppression, and microaggregation, have been proposed to modify the data before release. In this paper, we propose attribute-oriented granulation as a data protection mechanism that integrates generalization and microaggregation into a uniform framework. We address the computational problem of searching for the most specific granulation that satisfies confidentiality requirements. We present a breadth-first search algorithm with basic pruning strategies and investigate its properties, which can be used to improve the algorithm's efficiency. We also define quantitative measures of data quality and security, and apply evolutionary computation techniques to find the optimal granulation for privacy protection.
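To make the search problem in the abstract concrete, the following is a minimal sketch rather than the authors' algorithm. It assumes that a granulation is a vector of generalization levels, one per attribute, and it uses k-anonymity as a stand-in for the confidentiality requirement, which the abstract does not spell out. The records, the generalization ladders, and all function names are hypothetical. Microaggregation is not modeled here; only generalization is shown.

```python
from collections import Counter, deque

# Hypothetical microdata: (age, zip_code) quasi-identifiers only.
RECORDS = [(34, "10012"), (36, "10013"), (35, "10012"),
           (52, "10027"), (57, "10025"), (55, "10027")]

def gen_age(age, level):
    """Assumed ladder: 0 = exact age, 1 = decade range, 2 = suppressed."""
    if level == 0:
        return str(age)
    if level == 1:
        low = age // 10 * 10
        return f"{low}-{low + 9}"
    return "*"

def gen_zip(zip_code, level):
    """Assumed ladder: level n masks the last n digits."""
    keep = len(zip_code) - level
    return zip_code[:keep] + "*" * level

GENERALIZERS = (gen_age, gen_zip)
MAX_LEVELS = (2, 2)  # coarsest level allowed per attribute

def granulate(records, levels):
    """Apply one generalization level per attribute to every record."""
    return [tuple(g(v, l) for g, v, l in zip(GENERALIZERS, rec, levels))
            for rec in records]

def is_k_anonymous(records, levels, k):
    """Stand-in requirement: every granule holds at least k records."""
    counts = Counter(granulate(records, levels))
    return min(counts.values()) >= k

def most_specific_granulations(records, k):
    """Breadth-first search from the finest level vector upward."""
    start = (0,) * len(MAX_LEVELS)
    queue, seen, found = deque([start]), {start}, []
    while queue:
        levels = queue.popleft()
        # Pruning: a vector coarser than an accepted one cannot be most specific.
        if any(all(a >= b for a, b in zip(levels, f)) for f in found):
            continue
        if is_k_anonymous(records, levels, k):
            found.append(levels)  # accepted, so its generalizations are not expanded
            continue
        for i, top in enumerate(MAX_LEVELS):
            if levels[i] < top:
                nxt = levels[:i] + (levels[i] + 1,) + levels[i + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return found

if __name__ == "__main__":
    print(most_specific_granulations(RECORDS, k=3))
```

The pruning step relies on monotonicity: with these ladders, coarsening any attribute only merges granules, so once a level vector meets the requirement every coarser vector does too and need not be visited.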
A preliminary version of this paper was published in ?. This work was partially supported by the Taiwan Information Security Center (TWISC) and NSC (Taiwan). NSC Grants: 95-2221-E-001-019 (D.W. Wang), 95-2221-E-001-029-MY3 (C.J. Liau), and 95-2221-E-001-004 (T-s. Hsu).
References
Wang, D.W., Liau, C.J., Hsu, T.-s.: Attribute-oriented granulation for privacy protection. In: Proceedings of the 2nd IEEE International Conference on Granular Computing, pp. 726–731. IEEE Computer Society Press, Los Alamitos (2006)
Willenborg, L., de Waal, T.: Statistical Disclosure Control in Practice. Springer, Heidelberg (1996)
Chiang, Y.C., et al.: Preserving confidentiality when sharing medical database with the Cellsecu system. International Journal of Medical Informatics 71, 17–23 (2003)
Hsu, T.-s., Liau, C.J., Wang, D.W.: A logical model for privacy protection. In: Davida, G.I., Frankel, Y. (eds.) ISC 2001. LNCS, vol. 2200, pp. 110–124. Springer, Heidelberg (2001)
Sweeney, L.: Achieving k-anonymity privacy protection using generalization and suppression. International Journal on Uncertainty, Fuzziness and Knowledge-based Systems 10(5), 571–588 (2002)
Wang, D.W., Liau, C.J., Hsu, T.-s.: Medical privacy protection based on granular computing. Artificial Intelligence in Medicine 32(2), 137–149 (2004)
Wang, D.W., Liau, C.J., Hsu, T.-s.: An epistemic framework for privacy protection in database linking. Data and Knowledge Engineering 32(2), 137–149 (2006)
Domingo-Ferrer, J., Torra, V.: Ordinal, continuous and heterogeneous k-anonymity through microaggregation. Data Mining and Knowledge Discovery 11(2), 195–212 (2005)
Pawlak, Z.: Rough Sets–Theoretical Aspects of Reasoning about Data. Kluwer Academic Publishers, Dordrecht (1991)
Dalenius, T.: Finding a needle in a haystack - or identifying anonymous census records. Journal of Official Statistics 2(3), 329–336 (1986)
Samarati, P.: Protecting respondents’ identities in microdata release. IEEE Transactions on Knowledge and Data Engineering 13(6), 1010–1027 (2001)
Sweeney, L.: k-anonymity: a model for protecting privacy. International Journal on Uncertainty, Fuzziness and Knowledge-based Systems 10(5), 557–570 (2002)
Denning, D.E.R.: Cryptography and Data Security. Addison-Wesley, Reading (1982)
Grzymala-Busse, J.: Algebraic properties of knowledge representation systems. In: Proceedings of the ACM SIGART International Symposium on Methodologies for Intelligent Systems, pp. 432–440. ACM Press, New York (1986)
Lin, T.Y.: Mining associations by linear inequalities. In: Proceedings of the 4th International Conference on Data Mining, pp. 154–161. IEEE Computer Society Press, Los Alamitos (2004)
LeFevre, K., DeWitt, D.J., Ramakrishnan, R.: Incognito: efficient full-domain k-anonymity. In: Proceedings of the 24th ACM SIGMOD International Conference on Management of Data, pp. 49–60. ACM Press, New York (2005)
Iyengar, V.S.: Transforming data to satisfy privacy constraints. In: Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 279–288. ACM Press, New York (2002)
Pawlak, Z.: Rough sets and fuzzy sets. Fuzzy Sets and Systems 17, 119–123 (1985)
Pena-Reyes, C.A., Sipper, M.: Evolutionary computation in medicine: An overview. Artificial Intelligence in Medicine 19, 1–23 (2000)
Chiang, Y.T., et al.: How much privacy? - a system to safe guard personal privacy while releasing database. In: Alpigini, J.J., et al. (eds.) RSCTC 2002. LNCS (LNAI), vol. 2475, pp. 226–233. Springer, Heidelberg (2002)
Wang, D.W., et al.: Value versus damage of information release: A data privacy perspective. International Journal of Approximate Reasoning 43(2), 179–201 (2006)
Orłowska, E.: Logic for reasoning about knowledge. Zeitschrift f. Math. Logik und Grundlagen der Math. 35, 559–572 (1989)
Orłowska, E.: Kripke semantics for knowledge representation logics. Studia Logica XLIX, 255–272 (1990)
Glasgow, J., MacEwen, G., Panangaden, P.: A logic for reasoning about security. ACM Transactions on Computer Systems 10(3), 226–264 (1992)
Bieber, P., Cuppens, F.: A definition of secure dependencies using the logic of security. In: Proc. of the 4th IEEE Computer Security Foundations Workshop, pp. 2–11. IEEE Computer Society Press, Los Alamitos (1991)
Cuppens, F.: A logical formalization of secrecy. In: Proc. of the 6th IEEE Computer Security Foundations Workshop, pp. 53–62. IEEE Computer Society Press, Los Alamitos (1993)
Dawson, S., et al.: Maximizing sharing of protected information. Journal of Computer and System Sciences 64, 496–541 (2002)
Machanavajjhala, A.K.V., et al.: l-diversity: privacy beyond k-anonymity. In: Proceedings of the 22nd International Conference on Data Engineering (2006)
Bethlehem, J.G., Keller, W.J., Pannekoek, J.: Disclosure control of microdata. Journal of the American Statistical Association 85(409), 38–45 (1990)
Brodsky, A., Farkas, C., Jajodia, S.: Secure databases: Constraints, inference channels, and monitoring disclosures. IEEE Transactions on Knowledge and Data Engineering 12(6), 900–919 (2000)
Morgenstern, M.: Controlling logical inference in multilevel database systems. In: Proc. of the IEEE Symposium on Security and Privacy, pp. 245–255. IEEE Computer Society Press, Los Alamitos (1988)
Bonatti, P., et al.: An access control model for data archives. In: Proceedings of the 16th International Conference on Information Security: Trusted Information: The New Decade Challenge (2001)
Agrawal, D., Aggarwal, C.C.: On the design and quantification of privacy preserving data mining algorithms. In: Proceedings of the 12th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pp. 247–255. ACM Press, New York (2001)
Clifton, C., Kantarcıoǧlu, M., Vaidya, J.: Privacy-preserving data mining. In: Chu, W.W., Lin, T.Y. (eds.) Foundations and Advances in Data Mining, pp. 313–344. Springer, Heidelberg (2005)
Saygin, Y., Verykios, V.S., Clifton, C.: Using unknowns to prevent the discovery of association rules. SIGMOD Record 30(4), 45–54 (2001)
Srikant, R.: Privacy preserving data mining: challenges and opportunities. In: Chen, M.-S., Yu, P.S., Liu, B. (eds.) PAKDD 2002. LNCS (LNAI), vol. 2336, p. 13. Springer, Heidelberg (2002)
Muralidhar, K., Sarathy, R.: Security of random data perturbation methods. ACM Transactions on Database Systems 24(4), 487–493 (1999)
Biskup, J., Bonatti, P.A.: Confidentiality policies and their enforcement for controlled query evaluation. In: Gollmann, D., Karjoth, G., Waidner, M. (eds.) ESORICS 2002. LNCS, vol. 2502, pp. 39–55. Springer, Heidelberg (2002)
Bonatti, P.A., Kraus, S., Subrahmanian, V.S.: Foundations of secure deductive databases. IEEE Transactions on Knowledge and Data Engineering 7(3), 406–422 (1995)
Damiani, E., et al.: Balancing confidentiality and efficiency in untrusted relational DBMSs. In: Proceedings of the 10th ACM Conference on Computer and Communication Security, pp. 93–102. ACM Press, New York (2003)
Domingo-Ferrer, J.: Advances in inference control in statistical databases: An overview. In: Domingo-Ferrer, J. (ed.) Inference Control in Statistical Databases. LNCS, vol. 2316, pp. 1–7. Springer, Heidelberg (2002)
Hsu, T.-s., et al.: Quantifying privacy leakage through answering database queries. In: Chan, A.H., Gligor, V.D. (eds.) ISC 2002. LNCS, vol. 2433, pp. 162–175. Springer, Heidelberg (2002)
Truta, T.M., Fotouhi, F., Barth-Jones, D.: Privacy and confidentiality management for the microaggregation disclosure control method: disclosure risk and information loss measures. In: Proceedings of the ACM Workshop on Privacy in the Electronic Society, pp. 21–30. ACM Press, New York (2003)
Wang, D.W., et al.: On the damage and compensation of privacy leakage. In: Proceedings of the 18th Annual IFIP WG 11.3 Working Conference on Data and Applications Security, pp. 311–324. Kluwer Academic Publishers, Dordrecht (2004)
Shannon, C.E.: A mathematical theory of communication. The Bell System Technical Journal 27(3-4), 379–423 (1948)
Li, M., Vitanyi, P.: An Introduction to Kolmogorov Complexity and Its Applications. Springer, Heidelberg (1993)
Klir, G.J., Wierman, M.J.: Uncertainty-Based Information: Elements of Generalized Information Theory. Physica, Heidelberg (1998)
Cholvy, L., Cuppens, F.: Analysing consistency of security policies. In: Proc. of the IEEE Symposium on Security and Privacy, pp. 103–112. IEEE Computer Society Press, Los Alamitos (1997)
Cuppens, F., Demolombe, R.: A deontic logic for reasoning about confidentiality. In: Brown, M.A., Carmo, J. (eds.) Deontic logic, agency, and normative systems: ΔEON’96, Third International Workshop on Deontic Logic in Computer Science, pp. 66–79 (1996)
Cuppens, F., Demolombe, R.: A modal logical framework for security policies. In: Raś, Z.W., Skowron, A. (eds.) ISMIS 1997. LNCS, vol. 1325, pp. 579–589. Springer, Heidelberg (1997)
Syverson, P.F., Stubblebine, S.G.: Group principals and the formalization of anonymity. In: Wing, J.M., Woodcock, J.C.P., Davies, J. (eds.) FM 1999. LNCS, vol. 1708, pp. 814–833. Springer, Heidelberg (1999)
Gray III, J.W., Syverson, P.F.: A logical approach to multilevel security of probabilistic systems. Distributed Computing 11(2), 73–90 (1998)
Syverson, P.F., Gray III, J.W.: The epistemic representation of information flow security in probabilistic systems. In: Proc. of the 8th IEEE Computer Security Foundations Workshop, pp. 152–166. IEEE Computer Society Press, Los Alamitos (1995)
Copyright information
© 2007 Springer Berlin Heidelberg
About this chapter
Cite this chapter
Wang, DW., Liau, CJ., Hsu, Ts. (2007). Granulation as a Privacy Protection Mechanism. In: Peters, J.F., Skowron, A., Marek, V.W., Orłowska, E., Słowiński, R., Ziarko, W. (eds) Transactions on Rough Sets VII. Lecture Notes in Computer Science, vol 4400. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71663-1_16
DOI: https://doi.org/10.1007/978-3-540-71663-1_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71662-4
Online ISBN: 978-3-540-71663-1
eBook Packages: Computer Science (R0)