Abstract
Collecting and storing security (and audit) logs is a crucial process for enterprises around the globe. Log analysis helps identify potential security breaches and, in some cases, is required by law for compliance. However, enterprises often delegate these responsibilities to a third-party cloud service provider, where the logs are collected, processed for anomaly detection, and stored in a cold data warehouse for archiving. Prevalent schemes perform log anomaly detection on plain (unencrypted) data, yet these logs often reveal sensitive information about an organization or its customers. Hence, it is in everyone's best interest to keep them encrypted at all times. This paper proposes “SigML”, which uses Fully Homomorphic Encryption (FHE) with the Cheon-Kim-Kim-Song (CKKS) scheme for supervised log anomaly detection on encrypted data. We formulate a binary classification problem and propose a novel “Aggregate” configuration using the Sigmoid function for resource-constrained devices (wireless sensors or IoT) that reduces communication and computation requirements by a factor of n, where n is the number of ciphertexts received by the clients. We further approximate the Sigmoid activation function (\(\sigma(x)\)) with first-, third-, and fifth-order polynomials in the encrypted domain and evaluate the supervised models on the NSL-KDD and HDFS datasets in terms of performance metrics and computation time.
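As a rough plaintext illustration of why polynomial approximation matters here: CKKS evaluates additions and multiplications, so \(\sigma(x)\) has to be replaced by a polynomial. The sketch below uses the Maclaurin (Taylor at 0) coefficients of the Sigmoid, \(\sigma(x) \approx \tfrac{1}{2} + \tfrac{x}{4} - \tfrac{x^3}{48} + \tfrac{x^5}{480}\), truncated at orders 1, 3, and 5. These coefficients are an assumption made for illustration. The coefficients used in the paper are not given in this abstract and may come from a different fitting method. The names `TAYLOR_COEFFS`, `poly_sigmoid`, and `max_abs_error` are hypothetical, and no encryption is performed.

```python
import math

# Maclaurin coefficients of the sigmoid, lowest degree first.
# Assumption: illustrative only, not the coefficients used in the paper.
TAYLOR_COEFFS = {
    1: [0.5, 0.25],
    3: [0.5, 0.25, 0.0, -1.0 / 48.0],
    5: [0.5, 0.25, 0.0, -1.0 / 48.0, 0.0, 1.0 / 480.0],
}


def sigmoid(x: float) -> float:
    """Exact sigmoid, used only as the plaintext reference."""
    return 1.0 / (1.0 + math.exp(-x))


def poly_sigmoid(x: float, order: int) -> float:
    """Evaluate the truncated series with Horner's rule.

    Only additions and multiplications are used, which are the
    operations an arithmetic FHE scheme such as CKKS supports.
    """
    result = 0.0
    for coeff in reversed(TAYLOR_COEFFS[order]):
        result = result * x + coeff
    return result


def max_abs_error(order: int, lo: float = -2.0, hi: float = 2.0,
                  steps: int = 401) -> float:
    """Largest absolute gap to the exact sigmoid on a uniform grid."""
    worst = 0.0
    for i in range(steps):
        x = lo + (hi - lo) * i / (steps - 1)
        worst = max(worst, abs(poly_sigmoid(x, order) - sigmoid(x)))
    return worst


if __name__ == "__main__":
    for order in (1, 3, 5):
        print(f"order {order}: max |error| on [-2, 2] = "
              f"{max_abs_error(order):.4f}")
```

On a small interval around zero the error shrinks as the order grows, while each extra degree costs additional ciphertext multiplications in the encrypted setting. A Taylor expansion degrades quickly away from zero, so inputs would need to be scaled into the interval where the chosen polynomial is accurate.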
Notes
1. We consider that plaintexts belong to the unencrypted domain and ciphertexts to the encrypted domain.
2. Sigmoid is used in LR and SVM during classification, so we decided to make it homomorphic.
3. CKKS is more suited for arithmetic on real numbers, where we can have approximate but close results, while BFV is more suited for arithmetic on integers.
4. Cloud-based models are susceptible to training data inference attacks, e.g., attribute inference attacks, membership inference attacks, and model inversion attacks.
5. We omit the details of textual log data parsing for brevity.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Trivedi, D., Boudguiga, A., Triandopoulos, N. (2023). SigML: Supervised Log Anomaly with Fully Homomorphic Encryption. In: Dolev, S., Gudes, E., Paillier, P. (eds) Cyber Security, Cryptology, and Machine Learning. CSCML 2023. Lecture Notes in Computer Science, vol 13914. Springer, Cham. https://doi.org/10.1007/978-3-031-34671-2_26
DOI: https://doi.org/10.1007/978-3-031-34671-2_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-34670-5
Online ISBN: 978-3-031-34671-2
eBook Packages: Computer Science (R0)