A Visual Tool for Interactively Privacy Analysis and Preservation on Order-Dynamic Tabular Data

  • Conference paper
Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom 2022)

Abstract

Releasing individual-level data, usually in tabular form, carries an obligation to prevent privacy leakage. By rendering privacy risks visually, visualization techniques have made the data sanitization process far more user-friendly. Yet we point out, for the first time, that the attribute order (i.e., schema) of tabular data inherently determines both the risk situation and the output utility, a factor that previous efforts have ignored. To close this gap, this work proposes the design and pipeline of a visual tool, TPA (Tabular Privacy Assistant), for nuanced privacy analysis and preservation on order-dynamic tabular data. By adapting a data cube structure as its flexible backbone, TPA supports real-time risk analysis in response to attribute order adjustments. Novel visual designs, namely the data abstract, the risk tree, and integrated privacy enhancement, are developed to help users explore data correlations and gain privacy awareness. We demonstrate TPA’s effectiveness with a case study on the prototype and qualitatively discuss its pros and cons with domain experts to guide future improvement.
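
To make the role of attribute order concrete, the following minimal sketch precomputes group-by counts for every attribute subset (a small data cube) and then reads off a risk figure for each prefix of a chosen attribute order without rescanning the rows. It is an illustration only, not TPA's implementation: the toy records, the attribute names, the functions `build_cube` and `prefix_risk`, and the use of the smallest group size and the number of unique records as risk measures are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy records; attribute names and values are illustrative only.
ROWS = [
    {"age": "30-39", "sex": "F", "zip": "100"},
    {"age": "30-39", "sex": "F", "zip": "101"},
    {"age": "30-39", "sex": "M", "zip": "100"},
    {"age": "40-49", "sex": "F", "zip": "100"},
    {"age": "40-49", "sex": "F", "zip": "100"},
]
ATTRS = ("age", "sex", "zip")


def build_cube(rows, attrs):
    """Precompute group-by counts for every subset of attributes."""
    cube = {}
    for size in range(len(attrs) + 1):
        for subset in combinations(attrs, size):
            # Key by the set of attributes so lookups ignore ordering.
            cube[frozenset(subset)] = Counter(
                tuple(row[a] for a in subset) for row in rows
            )
    return cube


def prefix_risk(cube, order):
    """For each prefix of the attribute order, report
    (prefix, smallest group size, number of records that are unique)."""
    report = []
    for depth in range(1, len(order) + 1):
        prefix = order[:depth]
        counts = cube[frozenset(prefix)]
        smallest = min(counts.values())
        uniques = sum(1 for c in counts.values() if c == 1)
        report.append((prefix, smallest, uniques))
    return report


cube = build_cube(ROWS, ATTRS)
# Reordering attributes only changes which cube cells are looked up.
for order in [("age", "sex", "zip"), ("zip", "sex", "age")]:
    for prefix, smallest, uniques in prefix_risk(cube, order):
        print(prefix, "smallest group:", smallest, "unique records:", uniques)
```

With these toy rows, placing `age` first keeps every record in a group of at least two at the first level, whereas placing `zip` first already isolates one record. The full cube grows exponentially with the number of attributes, so a practical tool would presumably materialize only part of it.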

Notes

  1. We use data holder and user interchangeably.

Acknowledgment

This work is supported by the National Natural Science Foundation of China (62172155, 62072465).

Corresponding author

Correspondence to Fang Liu.

Copyright information

© 2022 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Liang, F., Liu, F., Zhou, T. (2022). A Visual Tool for Interactively Privacy Analysis and Preservation on Order-Dynamic Tabular Data. In: Gao, H., Wang, X., Wei, W., Dagiuklas, T. (eds) Collaborative Computing: Networking, Applications and Worksharing. CollaborateCom 2022. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 461. Springer, Cham. https://doi.org/10.1007/978-3-031-24386-8_2

  • DOI: https://doi.org/10.1007/978-3-031-24386-8_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-24385-1

  • Online ISBN: 978-3-031-24386-8

  • eBook Packages: Computer Science (R0)
