Software Identification by Standard Machine Learning Tools

Automatic Control and Computer Sciences

Abstract

This article considers tools for controlling the software installed on the personal computers of automated system users. The shortcomings of these solutions are substantiated, and an approach to identifying executable files with a machine learning algorithm is developed and presented. The algorithm is gradient boosting over decision trees, as implemented in libraries such as XGBoost, LightGBM, and CatBoost. Programs are identified experimentally with XGBoost and LightGBM, and the results are compared with those of earlier studies by other authors. The findings show that the developed method makes it possible to identify violations of the adopted security policy during information processing in automated systems.
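
To make the idea of gradient boosting over decision trees concrete, the following is a minimal sketch in pure Python and not the authors' method. It assumes, purely for illustration, that each executable is described by a coarse byte-value histogram and that two hypothetical programs differ in their byte distributions; the feature choice, the synthetic files, the squared-error loss, and all function names are assumptions that do not come from the article. Depth-1 trees (stumps) are fitted repeatedly to the residuals of the current ensemble, which is the core step that XGBoost, LightGBM, and CatBoost implement in optimized form.

```python
import random

def byte_histogram(data, bins=16):
    """Normalized frequency of byte values grouped into `bins` equal ranges."""
    counts = [0] * bins
    for b in data:
        counts[b * bins // 256] += 1
    total = len(data) or 1
    return [c / total for c in counts]

def fit_stump(X, residuals):
    """Depth-1 regression tree: best (feature, threshold) by squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or err < best[0]:
                best = (err, j, t, lm, rm)
    if best is None:  # every feature is constant: fall back to the mean
        mean = sum(residuals) / len(residuals)
        return (0, float("inf"), mean, mean)
    return best[1:]

def stump_predict(stump, row):
    j, t, left_value, right_value = stump
    return left_value if row[j] <= t else right_value

def fit_boosting(X, y, n_rounds=20, learning_rate=0.3):
    """Gradient boosting with squared-error loss on 0/1 labels."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # For squared error the negative gradient is the plain residual.
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, row)
                 for p, row in zip(preds, X)]
    return base, learning_rate, stumps

def predict(model, row):
    base, learning_rate, stumps = model
    score = base + learning_rate * sum(stump_predict(s, row) for s in stumps)
    return 1 if score >= 0.5 else 0

def synthetic_file(rng, program, size=4096):
    """Hypothetical stand-in for an executable: program 1 uses only low bytes."""
    upper = 128 if program == 1 else 256
    return bytes(rng.randrange(upper) for _ in range(size))

def make_set(rng, n_per_program):
    X, y = [], []
    for program in (0, 1):
        for _ in range(n_per_program):
            X.append(byte_histogram(synthetic_file(rng, program)))
            y.append(program)
    return X, y

rng = random.Random(0)
X_train, y_train = make_set(rng, 20)
X_test, y_test = make_set(rng, 10)
model = fit_boosting(X_train, y_train)
accuracy = sum(predict(model, row) == label
               for row, label in zip(X_test, y_test)) / len(y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

The sketch is binary and uses synthetic data only to stay self-contained; identifying one program among many would require a multiclass objective, which the libraries named in the abstract provide.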

Funding

This study was conducted under basic research program 7, “New Developments in Cutting-Edge Areas of Power Engineering, Mechanics, and Robotics” of the Russian Academy of Sciences. This program is one of the programs of basic research in focal areas defined by the Presidium of the Russian Academy of Sciences.

Author information

Corresponding author

Correspondence to M. E. Sukhoparov.

Ethics declarations

The authors declare that they have no conflicts of interest.

Additional information

Translated by S. Kuznetsov

About this article

Cite this article

Sukhoparov, M.E., Salakhutdinova, K.I. & Lebedev, I.S. Software Identification by Standard Machine Learning Tools. Aut. Control Comp. Sci. 55, 1175–1179 (2021). https://doi.org/10.3103/S0146411621080459

  • DOI: https://doi.org/10.3103/S0146411621080459