Abstract
The problem of publishing personal data without giving up privacy is becoming increasingly important. Several formalizations have recently been proposed in this context, notably k-anonymity [17,18] and l-diversity [12]. These approaches require that the rows of a table be clustered into sets satisfying some constraint, in order to prevent the identification of the individuals the rows belong to. In this paper we focus on the l-diversity problem, where the attributes are divided into sensitive attributes and quasi-identifier attributes. The goal is to partition the set of rows so that, for each set C of the partition, the number of rows having a specific value in the sensitive attribute is at most \(\frac{1}{l}|C|\).
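The constraint on each set C can be checked directly. The following is a minimal sketch, not the authors' method: it assumes rows are tuples whose last entry is the single sensitive attribute, that a partition is given as lists of row indices, and that empty sets are not allowed. The function names and the `sensitive_index` parameter are hypothetical, introduced only for illustration.

```python
from collections import Counter


def is_l_diverse_cluster(sensitive_values, l):
    """Check that no sensitive value occurs in more than |C| / l rows of one set C."""
    size = len(sensitive_values)
    if size == 0:
        # Assumption: an empty set is not a valid member of a partition.
        return False
    counts = Counter(sensitive_values)
    # Compare count * l <= |C| with integers to avoid floating-point division.
    return all(count * l <= size for count in counts.values())


def is_l_diverse_partition(rows, clusters, l, sensitive_index=-1):
    """Check that `clusters` partitions the row indices and every set is l-diverse."""
    covered = sorted(i for cluster in clusters for i in cluster)
    if covered != list(range(len(rows))):
        # Some row is missing or appears in more than one set.
        return False
    return all(
        is_l_diverse_cluster([rows[i][sensitive_index] for i in cluster], l)
        for cluster in clusters
    )


# Illustrative table: one quasi-identifier column, one sensitive column.
rows = [("a", "flu"), ("a", "cold"), ("b", "flu"), ("b", "cold")]
good = is_l_diverse_partition(rows, [[0, 1], [2, 3]], l=2)
bad = is_l_diverse_partition(rows, [[0, 2], [1, 3]], l=2)
```

In this toy table, grouping rows 0 and 1 gives each sensitive value one occurrence out of two rows, which meets the bound for l = 2, whereas grouping rows 0 and 2 puts the same sensitive value in both rows and violates it. The sketch only verifies a given partition; it does not search for one, which is the hard part the paper addresses.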
We investigate the approximation and parameterized complexity of l-diversity. Concerning the approximation complexity, we prove the following results: (i) the problem is not approximable within a factor c ln l, for some constant c > 0, even if the input table consists of two columns; (ii) the problem is APX-hard, even if l = 4 and the input table contains exactly 3 columns; (iii) the problem admits an approximation algorithm of factor m (where m + 1 is the number of columns in the input table), when the sensitive attribute ranges over an alphabet of constant size. Concerning the parameterized complexity, we prove the following results: (i) the problem is W[1]-hard even if parameterized by the size of the solution, l, and the size of the alphabet; (ii) the problem admits a fixed-parameter algorithm when both the maximum number of different values in a column and the number of columns are parameters.
References
Aggarwal, G., Feder, T., Kenthapadi, K., Motwani, R., Panigrahy, R., Thomas, D., Zhu, A.: Anonymizing tables. In: Eiter, T., Libkin, L. (eds.) ICDT 2005. LNCS, vol. 3363, pp. 246–258. Springer, Heidelberg (2005)
Alon, N., Moshkovitz, D., Safra, S.: Algorithmic Construction of Sets for k-restrictions. ACM Trans. Algorithms 2(2), 153–177 (2006)
Alimonti, P., Kann, V.: Some APX-Completeness Results for Cubic Graphs. Theoretical Computer Science 237(1-2), 123–134 (2000)
Blocki, J., Williams, R.: Resolving the Complexity of Some Data Privacy Problems. In: Abramsky, S., Gavoille, C., Kirchner, C., Meyer auf der Heide, F., Spirakis, P.G. (eds.) ICALP 2010. LNCS, vol. 6199, pp. 393–404. Springer, Heidelberg (2010)
Bonizzoni, P., Della Vedova, G., Dondi, R., Pirola, Y.: Parameterized Complexity of k-Anonymity: Hardness and Tractability. In: Iliopoulos, C.S., Smyth, W.F. (eds.) IWOCA 2010. LNCS, vol. 6460, pp. 242–255. Springer, Heidelberg (2011)
Bonizzoni, P., Della Vedova, G., Dondi, R.: Anonymizing Binary and Small Tables is Hard to Approximate. J. Comb. Optim. 22(1), 97–119 (2011)
Downey, R.G., Fellows, M.R.: Fixed-Parameter Tractability and Completeness II: On Completeness for W[1]. Theoretical Computer Science 141, 109–131 (1995)
Evans, P.A., Wareham, T., Chaytor, R.: Fixed-parameter Tractability of Anonymizing Data by Suppressing Entries. J. Comb. Optim. 18(4), 362–375 (2009)
Gionis, A., Tassa, T.: k-Anonymization with Minimal Loss of Information. IEEE Trans. Knowl. Data Eng. 21(2), 206–219 (2009)
Lenstra, H.: Integer Programming with a Fixed Number of Variables. Mathematics of Operations Research 8(4), 538–548 (1983)
Li, J., Yi, K., Zhang, Q.: Clustering with diversity. In: Abramsky, S., Gavoille, C., Kirchner, C., Meyer auf der Heide, F., Spirakis, P.G. (eds.) ICALP 2010. LNCS, vol. 6198, pp. 188–200. Springer, Heidelberg (2010)
Machanavajjhala, A., Gehrke, J., Kifer, D., Venkitasubramaniam, M.: l-Diversity: Privacy Beyond k-Anonymity. In: Liu, L., Reuter, A., Whang, K., Zhang, J. (eds.) 22nd International Conference on Data Engineering, p. 24. IEEE Computer Society, New York (2006)
Meyerson, A., Williams, R.: On the Complexity of Optimal K-Anonymity. In: Deutsch, A. (ed.) 23rd ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pp. 223–228. ACM, New York (2004)
Niedermeier, R.: Invitation to Fixed-Parameter Algorithms. Oxford University Press, Oxford (2006)
Park, H., Shim, K.: Approximate Algorithms for k-Anonymity. In: Chan, C.Y., Ooi, B.C., Zhou, A. (eds.) ACM SIGMOD International Conference on Management of Data, pp. 67–78. ACM Press, New York (2007)
Raz, R., Safra, S.: A Sub-Constant Error-probability Low-degree Test, and a Sub-Constant Error-Probability PCP Characterization of NP. In: Twenty-ninth Annual ACM Symposium on Theory of Computing (STOC 1997), pp. 475–484 (1997)
Samarati, P.: Protecting Respondents’ Identities in Microdata Release. IEEE Trans. Knowl. Data Eng. 13(6), 1010–1027 (2001)
Sweeney, L.: k-Anonymity: a Model for Protecting Privacy. International Journal on Uncertainty, Fuzziness and Knowledge-based Systems 10(5), 557–570 (2002)
Vazirani, V.: Approximation Algorithms. Springer, Heidelberg (2001)
Xiao, X., Yi, K., Tao, Y.: The Hardness and Approximation Algorithms for l-diversity. In: Manolescu, I., Spaccapietra, S., Teubner, J., Kitsuregawa, M., Léger, A., Naumann, F., Ailamaki, A., Özcan, F. (eds.) 13th International Conference on Extending Database Technology (EDBT 2010), pp. 135–146. ACM Press, New York (2010)
Copyright information
© 2011 Springer-Verlag GmbH Berlin Heidelberg
About this paper
Cite this paper
Dondi, R., Mauri, G., Zoppis, I. (2011). On the Complexity of the l-diversity Problem. In: Murlak, F., Sankowski, P. (eds) Mathematical Foundations of Computer Science 2011. MFCS 2011. Lecture Notes in Computer Science, vol 6907. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22993-0_26
DOI: https://doi.org/10.1007/978-3-642-22993-0_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22992-3
Online ISBN: 978-3-642-22993-0
eBook Packages: Computer Science, Computer Science (R0)