Abstract
This paper analyzes methods for generating synthetic data to populate honeypot systems. To select the types of data to generate, the objects most relevant as attack targets in the context of honeypot systems are identified. Existing generation methods are investigated, and methods for assessing the quality of generated data in the context of honeypot systems are also examined. As a result, a prototype of an automated system for generating synthetic data for honeypot systems is developed, and the efficiency of its operation is evaluated.
Funding
This work was performed as part of the State assignment for basic research (topic code 0784-2020-0026).
Ethics declarations
The authors declare that they have no conflicts of interest.
Additional information
Translated by E. Oborin
About this article
Cite this article
Danilov, V.D., Ovasapyan, T.D., Ivanov, D.V. et al. Generation of Synthetic Data for Honeypot Systems Using Deep Learning Methods. Aut. Control Comp. Sci. 56, 916–926 (2022). https://doi.org/10.3103/S014641162208003X