A New Privacy-Preserving Data Mining Method Using Non-negative Matrix Factorization and Singular Value Decomposition

Wireless Personal Communications

Abstract

Data analysis and mining become more powerful as data sizes grow rapidly, and publishing data for researchers is becoming more valuable. This process raises an important problem: privacy protection. In recent decades, many methods for protecting privacy in data publishing have been studied. One important class of them is based on matrix decompositions. These methods use matrix decompositions to find information that is non-critical for the analysis task and remove it from the data to protect privacy. This paper improves on this class of methods and gives a new privacy-protection algorithm based on non-negative matrix factorization and singular value decomposition. Our basic idea is that using several kinds of decomposition analyzes the data from different directions, and therefore more comprehensively, so it may find more non-critical information and improve algorithm performance. The experiments confirmed this idea: the new method obtains better results than traditional methods that use only one kind of decomposition. Our method gives a stronger guarantee of privacy protection while maintaining data quality.
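As a rough illustration of decomposition-based data distortion, the following Python sketch builds a low-rank approximation of a non-negative data matrix with non-negative matrix factorization and then truncates it with singular value decomposition. The abstract does not say how the paper combines the two decompositions, so the simple cascade used here, the function names (`svd_truncate`, `nmf_approx`, `distort`, `relative_change`), the ranks, and the random test matrix are all assumptions made for illustration, not the authors' algorithm.

```python
import numpy as np


def svd_truncate(A, k):
    """Rank-k approximation of A: keep only the k largest singular values."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]


def nmf_approx(A, k, n_iter=200, seed=0, eps=1e-9):
    """Rank-k non-negative approximation W @ H of a non-negative matrix A,
    computed with Lee-Seung multiplicative updates (Frobenius objective)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(n_iter):
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W @ H


def distort(A, k_nmf, k_svd):
    """Hypothetical cascade (an assumption, not the paper's procedure):
    NMF approximation first, then SVD truncation of the result."""
    return svd_truncate(nmf_approx(A, k_nmf), k_svd)


def relative_change(A, A_dist):
    """Frobenius-norm distance between original and distorted data,
    relative to the norm of the original."""
    return np.linalg.norm(A - A_dist) / np.linalg.norm(A)


# Synthetic non-negative data: 20 records, 6 attributes (made up for the demo).
rng = np.random.default_rng(42)
A = rng.random((20, 6))
A_dist = distort(A, k_nmf=4, k_svd=3)
print("relative change:", relative_change(A, A_dist))
```

In a sketch like this, the discarded components play the role of the non-critical information, and the relative change gives one crude measure of how far the published values move from the originals. Whether a given choice of ranks keeps the data useful would have to be checked on the actual mining task.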

Acknowledgements

This work is supported by the Natural Science Basic Research Plan in Shaanxi Province of China (Program No. 2016JQ6078), the basic research fund of Chang’an University (0009–2014G6114024), and the National Natural Science Foundation of China (61601059). The breast cancer database (the WBC data set in this paper) was obtained from Dr. William H. Wolberg of the University of Wisconsin Hospitals, Madison.

Author information

Corresponding author

Correspondence to Guang Li.

About this article

Cite this article

Li, G., Xue, R. A New Privacy-Preserving Data Mining Method Using Non-negative Matrix Factorization and Singular Value Decomposition. Wireless Pers Commun 102, 1799–1808 (2018). https://doi.org/10.1007/s11277-017-5237-5

  • DOI: https://doi.org/10.1007/s11277-017-5237-5
