
Byte Frequency Based Indicators for Crypto-Ransomware Detection from Empirical Analysis

  • Regular Paper
Journal of Computer Science and Technology

Abstract

File entropy is one of the major indicators of crypto-ransomware because encryption by ransomware increases the randomness of file contents. However, entropy-based ransomware detection has limitations. For example, ransomware-encrypted files are easily confused with normal files that have inherently high entropy, so misclassification is likely. In addition, the cost of evaluating entropy over an entire file makes entropy-based detection impractical for large files. In this paper, we propose two byte-frequency-based indicators for ransomware detection, termed EntropySA and DistSA, both of which exploit the characteristics of certain file subareas termed “sample areas” (SAs). For an encrypted file, both the sample area and the whole file exhibit high randomness. For a plain file, however, the sample area embeds informative structures such as a file header and thus exhibits relatively low randomness even when the entire file exhibits high randomness. EntropySA and DistSA use byte frequency and a variation of byte frequency, respectively, derived from sample areas. Experiments with realistic ransomware samples show that both indicators incur less overhead than other entropy-based detection methods. To evaluate the effectiveness and feasibility of our indicators, we also employ three expensive but elaborate classification models (neural network, support vector machine, and threshold-based approaches). With these models, our indicators yielded an average F1-measure of 0.994 and an average detection rate of 99.46% for file encryption attacks by realistic ransomware samples.
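The intuition behind a sample-area indicator can be illustrated with a minimal sketch. It computes Shannon entropy from byte frequencies over a leading subarea of a file and compares a plain-like file (structured header followed by high-entropy content) with an encrypted-like file. This is not the authors' EntropySA implementation: the abstract does not specify the location or size of a sample area, so the function names `byte_entropy` and `sample_area_entropy`, the choice of the file's first bytes as the sample area, and the 512-byte default are all assumptions made for illustration. DistSA is not sketched because the abstract does not define it.

```python
import math
import os
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte-frequency distribution, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)  # frequency of each byte value present in data
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def sample_area_entropy(data: bytes, sa_size: int = 512) -> float:
    """Entropy over a hypothetical sample area: the first sa_size bytes (assumed, not from the paper)."""
    return byte_entropy(data[:sa_size])


# Plain-like file: a structured, low-randomness header followed by a high-entropy body.
header = b"%PDF-1.7\n" + b"\x00" * 503       # 512 bytes of header-like content
body = os.urandom(64 * 1024)                 # stands in for compressed content
plain_like = header + body

# Encrypted-like file: uniformly random bytes of the same length.
encrypted_like = os.urandom(len(plain_like))

for name, blob in (("plain-like", plain_like), ("encrypted-like", encrypted_like)):
    print(f"{name}: whole-file={byte_entropy(blob):.3f}  "
          f"sample-area={sample_area_entropy(blob):.3f}")
```

In this toy setup both files have whole-file entropy close to 8 bits per byte, while only the plain-like file has a low-entropy sample area. Reading only the sample area also avoids scanning the entire file, which is the cost concern raised for large files.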




Corresponding author

Correspondence to Joon-Young Paik.


Cite this article

Kim, G.Y., Paik, JY., Kim, Y. et al. Byte Frequency Based Indicators for Crypto-Ransomware Detection from Empirical Analysis. J. Comput. Sci. Technol. 37, 423–442 (2022). https://doi.org/10.1007/s11390-021-0263-x


  • DOI: https://doi.org/10.1007/s11390-021-0263-x
