Abstract
As more person-specific data, such as health information, becomes available, increasing attention is being paid to confidentiality and privacy protection. One proposed model of privacy protection is k-Anonymity, under which a dataset is k-anonymous if each record is identical to at least (k-1) other records in the dataset. Our goal is to minimize information loss while transforming a collection of records to satisfy the k-Anonymity model. The drawback of current greedy anonymization algorithms is that they can get stuck at poor local optimums. In this paper, we propose an Ordered Greed Framework for k-Anonymity. Using our framework, designers can avoid the poor-local-optimum problem by adding stochastic elements to their greedy algorithms. Our preliminary experimental results indicate improvements in both runtime and solution quality. We also discover a surprising result concerning at least two widely accepted greedy optimization algorithms in the literature: for anonymization algorithms that process datasets in column-wise order, a random column ordering can lead to significantly higher-quality solutions than orderings determined by known greedy heuristics.
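To make the k-Anonymity definition and the column-ordering idea concrete, here is a minimal Python sketch. It is not the Ordered Greed Framework from the paper. It only checks whether a table is k-anonymous and then tries several random column orderings for a deliberately crude column-wise anonymizer. The function names, the use of whole-column suppression with "*", and the choice to measure information loss as the number of suppressed columns are all assumptions made for illustration.

```python
import random
from collections import Counter


def is_k_anonymous(records, k):
    """Return True if every record is identical to at least k-1 others."""
    counts = Counter(tuple(r) for r in records)
    return all(c >= k for c in counts.values())


def suppress_columns(records, k, order):
    """Suppress whole columns, in the given order, until the table is
    k-anonymous (hypothetical, simplified column-wise anonymizer)."""
    table = [list(r) for r in records]
    suppressed = []
    for col in order:
        if is_k_anonymous(table, k):
            break
        for row in table:
            row[col] = "*"  # assumed suppression marker
        suppressed.append(col)
    return table, suppressed


def best_random_order(records, k, trials=20, seed=0):
    """Try several random column orderings and keep the result that
    suppresses the fewest columns (assumed information-loss measure)."""
    rng = random.Random(seed)
    cols = list(range(len(records[0])))
    best = None
    for _ in range(trials):
        order = cols[:]
        rng.shuffle(order)
        table, suppressed = suppress_columns(records, k, order)
        if best is None or len(suppressed) < len(best[1]):
            best = (table, suppressed)
    return best
```

For a toy table such as [("30", "F", "A1"), ("30", "F", "A2"), ("40", "M", "B1"), ("40", "M", "B2")] with k = 2, processing the columns in the fixed order 0, 1, 2 suppresses all three columns, whereas an ordering that starts with column 2 suppresses only that one. Repeated random orderings can find such a solution without any ordering heuristic, which is the kind of effect the abstract describes.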
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chaytor, R. (2008). Allowing Privacy Protection Algorithms to Jump Out of Local Optimums: An Ordered Greed Framework. In: Bonchi, F., Ferrari, E., Malin, B., Saygin, Y. (eds) Privacy, Security, and Trust in KDD. PInKDD 2007. Lecture Notes in Computer Science, vol 4890. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78478-4_3
DOI: https://doi.org/10.1007/978-3-540-78478-4_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78477-7
Online ISBN: 978-3-540-78478-4
eBook Packages: Computer Science, Computer Science (R0)