DOI: 10.1145/3625403.3625405
research-article

A Data Analysis Privacy Regulation Compliance Scheme for Lakehouse

Published: 17 November 2023

ABSTRACT

To meet the diverse data storage and analysis needs in the Internet of Things era, businesses embrace the lakehouse approach, a hybrid deployment of data lakes and data warehouses on a single platform. Data consumers leverage data mining techniques through open APIs to explore data’s untapped potential. However, concerns arise regarding compliant data access and utilization. While privacy regulations like the General Data Protection Regulation (GDPR) offer conceptual guidance, their technical implementations remain vague. This paper proposes a privacy regulation compliance framework specific to lakehouse data analysis. By introducing a compliance verification layer between the analysis and processing layers, the scheme enables regulatory adherence. The utilization of Trusted Execution Environments (TEEs) guarantees verification of analysis requests, with blockchain serving as a storage medium for results. To mitigate unauthorized data analysis, we introduce a reputation-based punishment mechanism. Experimental results demonstrate the scheme’s feasibility.
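The flow the abstract describes (verify each analysis request, record the result, penalize non-compliant consumers) can be illustrated with a toy Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `AnalysisRequest` and `ComplianceLayer` names, the purpose-to-columns policy format, the reputation numbers, and the in-memory hash-chained list that stands in for a blockchain. The TEE is not modeled at all; in the proposed scheme the verification step would run inside one.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class AnalysisRequest:
    consumer: str   # hypothetical consumer identifier
    purpose: str    # declared purpose of the analysis
    columns: tuple  # columns the analysis wants to read


@dataclass
class ComplianceLayer:
    # Hypothetical policy format: purpose -> set of columns it may touch.
    policy: dict
    penalty: int = 20     # assumed reputation cost per violation
    threshold: int = 50   # assumed minimum reputation required to be served
    reputation: dict = field(default_factory=dict)
    ledger: list = field(default_factory=list)  # stand-in for a blockchain

    def _entry_hash(self, prev: str, record: dict) -> str:
        body = json.dumps(record, sort_keys=True)
        return hashlib.sha256((prev + body).encode()).hexdigest()

    def _append(self, record: dict) -> None:
        # Chain each record to the previous one so that later tampering
        # with a stored verification result is detectable.
        prev = self.ledger[-1]["hash"] if self.ledger else "0" * 64
        self.ledger.append(
            {"prev": prev, "record": record, "hash": self._entry_hash(prev, record)}
        )

    def verify(self, req: AnalysisRequest) -> bool:
        score = self.reputation.setdefault(req.consumer, 100)
        allowed = self.policy.get(req.purpose, set())
        within_policy = set(req.columns) <= allowed
        if not within_policy:
            # Reputation-based punishment for a non-compliant request.
            self.reputation[req.consumer] = max(0, score - self.penalty)
        ok = within_policy and score >= self.threshold
        self._append({
            "consumer": req.consumer,
            "purpose": req.purpose,
            "columns": sorted(req.columns),
            "compliant": ok,
        })
        return ok

    def ledger_intact(self) -> bool:
        # Recompute the hash chain from the start and compare.
        prev = "0" * 64
        for entry in self.ledger:
            if entry["prev"] != prev:
                return False
            if self._entry_hash(prev, entry["record"]) != entry["hash"]:
                return False
            prev = entry["hash"]
        return True
```

In this sketch, a request whose columns fall outside the policy for its declared purpose is rejected, logged, and costs the consumer reputation; once a consumer's score drops below the threshold, further requests are refused even if they are within policy.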

    • Published in

      ADMIT '23: Proceedings of the 2023 2nd International Conference on Algorithms, Data Mining, and Information Technology
      September 2023
      227 pages
      ISBN:9798400707629
      DOI:10.1145/3625403

      Copyright © 2023 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • research-article
      • Research
      • Refereed limited