Abstract
We investigate situations in which releasing frequent sequential patterns can compromise individuals' privacy. We propose two concrete objectives for privacy protection: k-anonymity and α-dissociation. The first addresses the inference of patterns with very low support, say in [1, k); such inferred patterns can serve as quasi-identifiers in linking attacks. We show that, for all but one definition of support, support values for patterns with two or more negative items (items which do not occur in a pattern) cannot be reliably inferred from frequent sequential patterns alone. For the remaining definition, we formulate privacy inference channels. The second objective, α-dissociation, addresses the risk that sensitive attribute values can be inferred with high certainty. We show that, to remove privacy threats with respect to both objectives, we only need to examine pairs of sequential patterns whose lengths differ by 1. We then present a Privacy Inference Channels Sanitisation (PICS) algorithm. Experiments illustrate that it can reduce the privacy disclosure risk carried by frequent sequential patterns with a small computational overhead.
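To make the idea of examining pattern pairs with a length difference of 1 more concrete, the following is a minimal sketch and not the PICS algorithm itself. It assumes that support counts the sequences containing a pattern as a subsequence, so that for a pattern and a one-item extension of it the difference in support is the number of sequences matching the shorter pattern but not the longer one. The function name `find_channels`, the set `sensitive_items`, and the ratio test used for α-dissociation are hypothetical illustrations; the abstract does not give the paper's exact definitions.

```python
def find_channels(supports, k, alpha, sensitive_items):
    """Scan released patterns for pairs whose lengths differ by 1.

    supports: dict mapping a pattern (tuple of items) to its support count.
    Returns a list of (kind, shorter, longer, value) tuples.
    Both threshold tests are assumptions made for illustration only.
    """
    channels = []
    for longer, sup_long in supports.items():
        seen = set()  # avoid reporting the same shorter pattern twice
        for i in range(len(longer)):
            # Drop one item to obtain a candidate pattern one item shorter.
            shorter = longer[:i] + longer[i + 1:]
            if shorter in seen or shorter not in supports:
                continue
            seen.add(shorter)
            sup_short = supports[shorter]

            # Sequences that match `shorter` but not `longer`.
            gap = sup_short - sup_long
            if 0 < gap < k:
                channels.append(("k-anonymity", shorter, longer, gap))

            # Assumed certainty measure: fraction of `shorter` matches
            # that also contain the added (sensitive) item.
            certainty = sup_long / sup_short
            if longer[i] in sensitive_items and certainty > alpha:
                channels.append(("alpha-dissociation", shorter, longer, certainty))
    return channels


# Toy input with made-up support counts; "x" plays the sensitive item.
supports = {
    ("a",): 10,
    ("a", "b"): 8,
    ("a", "b", "x"): 7,
    ("b",): 20,
}
channels = find_channels(supports, k=3, alpha=0.8, sensitive_items={"x"})
for channel in channels:
    print(channel)
```

In this toy input, the pair (a) and (a, b) implies that two sequences contain a but not a followed by b, which falls below k = 3. A sanitisation step would then have to alter or withhold some released supports to close such channels; how PICS does this is not described in the abstract.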
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Jin, H., Chen, J., He, H., O’Keefe, C.M. (2007). Privacy-Preserving Sequential Pattern Release. In: Zhou, ZH., Li, H., Yang, Q. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2007. Lecture Notes in Computer Science, vol 4426. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71701-0_57
DOI: https://doi.org/10.1007/978-3-540-71701-0_57
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71700-3
Online ISBN: 978-3-540-71701-0
eBook Packages: Computer Science, Computer Science (R0)