A Visual Tool for Interactively Privacy Analysis and Preservation on Order-Dynamic Tabular Data

Liang, Fengzhou; Liu, Fang; Zhou, Tongqing

doi:10.1007/978-3-031-24386-8_2

Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST, volume 461)

Included in the following conference series:

International Conference on Collaborative Computing: Networking, Applications and Worksharing

Abstract

Releasing individual-level data, usually in tabular form, carries an obligation to prevent privacy leakage. By rendering privacy risks visually, visualization techniques have greatly facilitated user-friendly data sanitization. Yet we point out, for the first time, that the attribute order (i.e., schema) of tabular data inherently determines both the risk situation and the output utility, a factor ignored in previous efforts. To fill this gap, this work proposes the design and pipeline of a visual tool, TPA (Tabular Privacy Assistant), for nuanced privacy analysis and preservation on order-dynamic tabular data. By adapting the data cube structure as its flexible backbone, TPA supports real-time risk analysis in response to attribute order adjustments. Novel visual designs, namely the data abstract, the risk tree, and integrated privacy enhancement, are developed to help users explore data correlations and gain privacy awareness. We demonstrate TPA’s effectiveness with a case study on the prototype and qualitatively discuss its pros and cons with domain experts to guide future improvement.
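
The idea that attribute order changes the visible risk, and that a data cube makes reordering cheap, can be illustrated with a minimal sketch. This is not TPA's implementation: the toy table, the function names `build_cube` and `risk_along_order`, and the risk measure (the number of records that are unique within their group at each prefix of the attribute order) are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy table; attribute names and values are illustrative only.
ROWS = [
    {"age": "30-39", "sex": "F", "zip": "510000"},
    {"age": "30-39", "sex": "F", "zip": "410000"},
    {"age": "30-39", "sex": "M", "zip": "510000"},
    {"age": "40-49", "sex": "M", "zip": "510000"},
    {"age": "40-49", "sex": "M", "zip": "510000"},
    {"age": "40-49", "sex": "F", "zip": "410000"},
]


def build_cube(rows, attributes):
    """Precompute group-by counts for every subset of attributes (a tiny data cube)."""
    cube = {}
    for size in range(len(attributes) + 1):
        for subset in combinations(sorted(attributes), size):
            cube[subset] = Counter(tuple(row[a] for a in subset) for row in rows)
    return cube


def risk_along_order(cube, order):
    """For each prefix of the attribute order, count records that are unique in their group.

    Assumed risk measure: a record is 'at risk' when its group has size 1.
    Only cube lookups are needed, so changing the order never rescans the rows.
    """
    profile = []
    for depth in range(1, len(order) + 1):
        prefix = tuple(order[:depth])
        counts = cube[tuple(sorted(prefix))]
        unique_records = sum(1 for c in counts.values() if c == 1)
        profile.append((prefix, unique_records))
    return profile


if __name__ == "__main__":
    cube = build_cube(ROWS, ["age", "sex", "zip"])
    for order in (("age", "sex", "zip"), ("sex", "zip", "age")):
        print(order, [risk for _, risk in risk_along_order(cube, order)])
```

On this toy table the two orders end at the same risk once all three attributes are revealed, but they differ at the intermediate level, which is the kind of order dependence the abstract describes. The cube is built once, and each reordering is answered from precomputed counts.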

Notes

1. We use data holder and user interchangeably.

Acknowledgment

This work is supported by the National Natural Science Foundation of China (62172155, 62072465).

Author information

Authors and Affiliations

Sun Yat-Sen University, Guangzhou, 510000, China
Fengzhou Liang
Hunan University, Changsha, 410000, China
Fang Liu
National University of Defense Technology, Changsha, 410000, China
Tongqing Zhou

Corresponding author

Correspondence to Fang Liu.

Editor information

Editors and Affiliations

Shanghai University, Shanghai, China
Honghao Gao
Xi’an Jiaotong-Liverpool University, Suzhou, China
Xinheng Wang
Zhejiang University City College, Hangzhou, China
Wei Wei
London South Bank University, London, UK
Tasos Dagiuklas

About this paper

Cite this paper

Liang, F., Liu, F., Zhou, T. (2022). A Visual Tool for Interactively Privacy Analysis and Preservation on Order-Dynamic Tabular Data. In: Gao, H., Wang, X., Wei, W., Dagiuklas, T. (eds) Collaborative Computing: Networking, Applications and Worksharing. CollaborateCom 2022. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 461. Springer, Cham. https://doi.org/10.1007/978-3-031-24386-8_2

DOI: https://doi.org/10.1007/978-3-031-24386-8_2
Published: 25 January 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-24385-1
Online ISBN: 978-3-031-24386-8
eBook Packages: Computer Science; Computer Science (R0)
