Abstract
In this paper we address the problem of preserving both mining accuracy and privacy when publishing sensitive time-series data. For example, people with heart disease do not want to disclose their electrocardiogram time-series, but they may still allow accurate patterns to be mined from them. Based on this observation, we introduce the related assumptions and requirements. We show that only randomization methods satisfy all the assumptions, but even those methods do not satisfy the requirements. We therefore discuss randomization-based solutions that satisfy all the assumptions and requirements. For this purpose, we use the noise averaging effect of piecewise aggregate approximation (PAA), which may alleviate the problem of destroyed distance orders in randomly perturbed time-series. Based on the noise averaging effect, we first propose two naive solutions that use random data perturbation to publish time-series while using the PAA distance to compute distances. There is, however, a tradeoff between these two solutions with respect to uncertainty and distance orders. We thus propose two more advanced solutions that take advantage of both naive solutions. Experimental results show that our advanced solutions are superior to the naive solutions.
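The noise averaging effect mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' solution: the function names, the zero-mean Gaussian noise model, and the parameter values below are assumptions chosen for illustration. The PAA distance follows the standard lower-bounding form of Keogh et al., that is, the Euclidean distance between segment means scaled by the square root of the segment width.

```python
import math
import random


def paa(series, segments):
    """Piecewise aggregate approximation: the mean of each equal-length segment."""
    if len(series) % segments != 0:
        raise ValueError("series length must be a multiple of segments")
    width = len(series) // segments
    return [sum(series[i * width:(i + 1) * width]) / width
            for i in range(segments)]


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def paa_distance(a, b, segments):
    """PAA distance: sqrt(n / segments) times the distance between PAA vectors."""
    scale = math.sqrt(len(a) / segments)
    return scale * euclidean(paa(a, segments), paa(b, segments))


def perturb(series, sigma, rng):
    """Add independent zero-mean Gaussian noise to each value (assumed noise model)."""
    return [x + rng.gauss(0.0, sigma) for x in series]


def rms(values):
    """Root mean square, used here to measure the noise magnitude."""
    return math.sqrt(sum(v * v for v in values) / len(values))


if __name__ == "__main__":
    rng = random.Random(7)  # fixed seed so that runs are repeatable
    n, segments, sigma = 1024, 64, 1.0

    # Hypothetical original series: a sine wave standing in for sensitive data.
    original = [math.sin(2 * math.pi * t / 128) for t in range(n)]
    published = perturb(original, sigma, rng)

    # Noise before and after PAA: averaging over each segment shrinks it.
    noise = [p - o for p, o in zip(published, original)]
    print("raw noise RMS:", rms(noise))
    print("PAA noise RMS:", rms(paa(noise, segments)))
```

With independent noise of standard deviation `sigma` and segments of width `w`, each segment mean carries noise of standard deviation roughly `sigma / sqrt(w)`, which is why distances computed on PAA vectors of perturbed series tend to be less distorted than distances computed on the raw perturbed values.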
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Moon, YS., Kim, HS., Kim, SP., Bertino, E. (2010). Publishing Time-Series Data under Preservation of Privacy and Distance Orders. In: Bringas, P.G., Hameurlain, A., Quirchmayr, G. (eds) Database and Expert Systems Applications. DEXA 2010. Lecture Notes in Computer Science, vol 6262. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15251-1_2
DOI: https://doi.org/10.1007/978-3-642-15251-1_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15250-4
Online ISBN: 978-3-642-15251-1
eBook Packages: Computer Science (R0)