Elsevier

Applied Soft Computing

Volume 96, November 2020, 106630
Cryptocurrency malware hunting: A deep Recurrent Neural Network approach

https://doi.org/10.1016/j.asoc.2020.106630

Highlights

  • Criminals have found that cryptocurrency can prove to be a highly profitable venture.

  • We propose a deep Recurrent Neural Network (RNN) learning model for hunting cryptocurrency malware threats.

  • Our proposed model utilizes the RNN to analyze Windows applications' opcodes as a case study.

  • The trained model is evaluated with five different Long Short-Term Memory configurations using the 10-fold cross-validation (CV) technique.

Abstract

In recent years, cryptocurrency trading has increased dramatically, and this trend has attracted cyber-threat actors who exploit existing vulnerabilities to infect their targets. These malicious actors use cryptocurrency malware to perform complex computational tasks on infected devices. Since cryptocurrency malware performs what is otherwise a legitimate process, detecting this type of threat with manual or heuristic methods is challenging. In this paper, we propose a novel deep Recurrent Neural Network (RNN) learning model for hunting cryptocurrency malware threats. Specifically, our proposed model utilizes the RNN to analyze Windows applications’ operation codes (opcodes) as a case study. We collect a real-world dataset that comprises 500 cryptocurrency malware and 200 benign-ware samples. The proposed model is trained with five different Long Short-Term Memory (LSTM) structures and evaluated with the 10-fold cross-validation (CV) technique. The results show that a 3-layer configuration achieves 98% detection accuracy, the highest rate among the configurations tested. We also applied traditional machine learning (ML) classifiers to compare deep learners (LSTM) against traditional models in dealing with cryptocurrency malware.

Introduction

Blockchain technology was introduced to the real world through the Bitcoin cryptocurrency [1], [2]. As the backbone of Bitcoin, this technology offers several key features in network communication, such as decentralization, transparency, security, and trust in a peer-to-peer manner. Blockchain has a vital role in empowering cryptocurrencies. Bitcoin, Monero, and Ethereum have proven to be practically useful in domains beyond payments due to their secure-by-design nature [3], [4], [5], [6], [7], [8]. The ever-rising popularity of cryptocurrencies is generating considerable interest among developers and security researchers [9], [10], [11].

Mining is a vital process that is responsible for verifying transactions in all cryptocurrencies that run on a blockchain [2], [12]. This process requires blockchain network nodes (known as miners) to solve a complex mathematical problem in order to generate new blocks and preserve the integrity of the transactions. Specifically, miners must solve a hash problem to create a valid block. In return, miners receive an amount of the mined currency as a reward, so this process can generate income for a legitimate miner in the network as well as for malicious actors [9], [12].
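The hash problem mentioned above can be illustrated with a toy proof-of-work search. This is a minimal sketch, not the scheme of any particular cryptocurrency: the function names `mine_block` and `is_valid_block`, the "leading zero hex digits" difficulty rule, and the string-concatenated block format are all assumptions made for illustration (Bitcoin, for example, hashes a binary block header and compares it against a numeric target).

```python
import hashlib


def block_digest(block_data: str, nonce: int) -> str:
    """SHA-256 hex digest of the block contents combined with a nonce."""
    return hashlib.sha256(f"{block_data}{nonce}".encode("utf-8")).hexdigest()


def mine_block(block_data: str, difficulty: int = 3, max_nonce: int = 10_000_000):
    """Brute-force a nonce whose digest starts with `difficulty` zero hex digits.

    Returns (nonce, digest), or None if no nonce below max_nonce works.
    The difficulty rule here is a simplification chosen for illustration.
    """
    target_prefix = "0" * difficulty
    for nonce in range(max_nonce):
        digest = block_digest(block_data, nonce)
        if digest.startswith(target_prefix):
            return nonce, digest
    return None


def is_valid_block(block_data: str, nonce: int, difficulty: int = 3) -> bool:
    """Verification is a single hash, far cheaper than the search."""
    return block_digest(block_data, nonce).startswith("0" * difficulty)


if __name__ == "__main__":
    found = mine_block("tx: alice -> bob, 1 coin", difficulty=3)
    if found is not None:
        nonce, digest = found
        print(nonce, digest)
```

The asymmetry between the costly search and the cheap check is what makes stolen CPU time valuable to an attacker: each extra zero digit multiplies the expected number of hash attempts by 16 in this toy setting.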

There is growing concern about the security of users of distributed networks, particularly in applications involving blockchain technology. Recent incidents show that the volume of cryptocurrency malware has increased drastically [13], [14]. The rising value of diverse cryptocurrencies in the digital world made malicious crypto-miners prominent as recently as 2017 [15]. The most common infection method is to install mining software on an unsuspecting victim’s machine without the victim’s consent. In 2017, Coinhive took advantage of victims’ computational power through a few lines of JavaScript code placed into web pages in order to mine cryptocurrency [15].

Cryptocurrency malware abuses victims’ machines (laptops, computers, smartphones, tablets) to mine cryptocurrency without their consent [13], [16]. Malicious actors use cryptocurrency malware to steal computational power and resources from their victims’ devices in order to compute complex equations [17], [18], [19], [20]. In this way, the malware operators can compete against other miners’ cryptocurrency computations without the costly overhead [16], [21]. Victims may not be aware that they are under attack. Virtually all cryptocurrency malware is designed to remain hidden from users, but that does not mean side effects are absent. Some common side effects include:

  • reducing the speed of other processes

  • increasing electricity bills

  • shortening the lifetime of the device

Depending on how sophisticated the attack design is, cryptocurrency mining imposes an exceptionally high processor load that has considerable side effects [16], [21], [22].

In recent years, Machine Learning (ML) based malware threat detection solutions have obtained promising results in everyday malware hunting tasks [23]. Deep learning (DL) methods have also been applied to complex malware threat detection tasks [21], [22]. Prior research has proposed a wide range of models for detecting malware based on dynamic and static analysis [24], [25], [26]. In this paper, we propose a model that uses a Recurrent Neural Network (RNN) to detect cryptocurrency malware based on a given application’s operation codes (opcodes). The main goal of an RNN is to exploit sequential information; RNNs are called recurrent because they perform the same computation for every element of a sequence, so the output depends on previous calculations. As a result, an RNN can efficiently predict the elements of a sequence (or series) of inputs. Indeed, RNNs have an internal state (memory) that holds information about what has been calculated so far, and they can process variable-length sequences of inputs. This flexibility makes an RNN a viable solution for cryptocurrency malware hunting, because it can efficiently learn opcode sequences of variable length during the training step.
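The idea of an internal state carried across a variable-length opcode sequence can be sketched as follows. This is a plain Elman-style recurrence with random, untrained weights, not the LSTM configurations evaluated in the paper; the class name `TinyRNN`, the helper `build_vocab`, the hidden size, and the example opcode mnemonics are all hypothetical and chosen only for illustration.

```python
import math
import random


def build_vocab(sequences):
    """Assign each distinct opcode mnemonic an integer ID in order of first use."""
    vocab = {}
    for seq in sequences:
        for opcode in seq:
            vocab.setdefault(opcode, len(vocab))
    return vocab


class TinyRNN:
    """Elman-style recurrence: h_t = tanh(W_xh[x_t] + h_{t-1} W_hh + b_h)."""

    def __init__(self, vocab_size, hidden_size, seed=0):
        rng = random.Random(seed)

        def matrix(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.hidden_size = hidden_size
        # Row lookup in W_xh is equivalent to multiplying by a one-hot input.
        self.W_xh = matrix(vocab_size, hidden_size)
        self.W_hh = matrix(hidden_size, hidden_size)
        self.b_h = [0.0] * hidden_size

    def step(self, h_prev, token_id):
        """Apply the same computation to one sequence element."""
        h_new = []
        for j in range(self.hidden_size):
            total = self.W_xh[token_id][j] + self.b_h[j]
            total += sum(h_prev[i] * self.W_hh[i][j] for i in range(self.hidden_size))
            h_new.append(math.tanh(total))
        return h_new

    def encode(self, token_ids):
        """Fold a sequence of any length into a fixed-size hidden state."""
        h = [0.0] * self.hidden_size
        for token_id in token_ids:
            h = self.step(h, token_id)
        return h


if __name__ == "__main__":
    samples = [["push", "mov", "call"], ["mov", "xor", "add", "jmp", "ret"]]
    vocab = build_vocab(samples)
    rnn = TinyRNN(vocab_size=len(vocab), hidden_size=4)
    for sample in samples:
        print(rnn.encode([vocab[op] for op in sample]))
```

Both samples, although of different lengths, end up as a hidden state of the same size, and swapping the order of two opcodes changes that state. A classifier layer on top of the final state, plus training, would be needed before such a model could separate malware from benign samples.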

Our approach does not require any modification of the opcodes. This paper targets MS Windows cryptocurrency malware threats, since a considerable number of users trade cryptocurrencies on the MS Windows OS platform [25]. We evaluate our model by comparing its performance against conventional machine learning classifiers such as SVM, K-Nearest Neighbor, Naïve Bayes, Decision Tree, and Random Forest, as well as Ada-Boost, an ensemble learning technique [27]. The main contributions of this paper are as follows:

  • a three-layer Deep Recurrent Neural Network (RNN) model to detect cryptocurrency malware threats on the MS Windows platform.

  • a dataset that consists of 500 real-world cryptocurrency malware applications and over 200 legitimate cryptocurrency applications.

  • a comparative analysis among traditional ML algorithms and the proposed model to show the effectiveness of RNN on detecting cryptocurrency malware threats.

The rest of the paper is structured as follows. Section 2 reviews related work. In Section 3, we define our proposed methodology for cryptocurrency malware hunting. The experimental results and comparisons are presented in Section 4. Finally, in Section 5, we present our concluding remarks and discuss future work on cryptocurrency malware threats.

Related work

In this section, we review the machine learning-based malware threat hunting work most relevant to ours. In recent years, a large number of researchers have focused on malware threat hunting based on ML algorithms. ML algorithms have been applied to different malware hunting challenges to detect patterns and distinguish malware from benign applications.

Joshua Saxe et al. [28] proposed a deep neural network-based malware

Cryptocurrency malware hunting methodology

In this section, we present the cryptocurrency malware hunting methodology, which consists of four stages, as illustrated in Fig. 1. In the first stage, we collected cryptocurrency malware and benign samples. All the collected samples belong to the MS Windows OS platform. Next, we executed the collected samples in a simulated environment, and unpacked and decompiled these files to extract their opcodes. In the next step, we created a feature vector based on each sample’s opcodes. Finally, we used the

Experimental results

In this section, we present the experimental results of our proposed model for cryptocurrency malware classification and hunting. Experiments were conducted on the collected datasets of both malware and benign samples. We built five LSTM models with different configurations, which are presented in Table 2.
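The abstract states that the models are evaluated with 10-fold cross-validation. A minimal sketch of how such folds could be built for a 500 malware / 200 benign dataset is given here. Whether the original experiments used stratified folds is not stated in the text, so the stratification, the function name `stratified_k_fold`, and the 1/0 label encoding are assumptions for illustration.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k folds.

    Samples of each class are shuffled and dealt round-robin into the folds,
    so every fold keeps roughly the overall class proportions.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % k].append(index)

    for i in range(k):
        test_indices = sorted(folds[i])
        train_indices = sorted(
            index for j, fold in enumerate(folds) if j != i for index in fold
        )
        yield train_indices, test_indices


if __name__ == "__main__":
    # Label encoding is an assumption: 1 = malware, 0 = benign.
    labels = [1] * 500 + [0] * 200
    for train_idx, test_idx in stratified_k_fold(labels, k=10):
        malware_in_test = sum(labels[i] for i in test_idx)
        print(len(train_idx), len(test_idx), malware_in_test)
```

Each model configuration would be trained on the training indices and scored on the held-out fold, with the reported accuracy taken as the average over the ten folds.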

We defined dataset A as $A = \{L_1, L_2, L_3, \ldots, L_n\}$, where each sample in the dataset is denoted L. Each sample in turn contains a considerable number of opcodes, $L = \{m_1, m_2, m_3, \ldots, m_n\}$. We also

Conclusion

Nowadays, cryptocurrency usage in different applications has led to an increase in cryptocurrency malware threats for users of this technology. To overcome this challenge, we proposed a deep model that applies a Recurrent Neural Network architecture to detect cryptocurrency malware applications based on their opcode sequences. We evaluated our proposed model on the analysis of cryptocurrency applications’ opcodes and obtained a detection accuracy of 98.25% against this malware family. In

CRediT authorship contribution statement

Abbas Yazdinejad: Conceptualization, Data curation. Hamed HaddadPajouh: Formal analysis. Ali Dehghantanha: Methodology, Project administration. Reza M. Parizi: Funding acquisition, Investigation. Gautam Srivastava: Writing - original draft. Mu-Yen Chen: Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References (47)

  • Yazdinejad A. et al. Blockchain-enabled authentication handover with efficient privacy protection in SDN-based 5G networks. IEEE Trans. Netw. Sci. Eng. (2019)

  • Sovbetov Y. Factors influencing cryptocurrency prices: Evidence from bitcoin, ethereum, dash, litcoin, and monero. J. Econ. Financ. Anal. (2018)

  • Wood G. Ethereum: A secure decentralised generalised transaction ledger. Ethereum Proj. Yellow Pap. (2014)

  • A. Yazdinejad, R.M. Parizi, G. Srivastava, A. Dehghantanha, K.R. Choo, Energy efficient decentralized authentication in...

  • Draghicescu D. et al. Crypto-mining application fingerprinting method

  • Yazdinejad A. et al. An energy-efficient SDN controller architecture for IoT networks with blockchain-based security. IEEE Trans. Serv. Comput. (2020)

  • Yazdinejad A. et al. Decentralized authentication of distributed patients in hospital networks using blockchain. IEEE J. Biomed. Health Inf. (2020)

  • Eyal I. et al. Majority is not enough: Bitcoin mining is vulnerable. Commun. ACM (2018)

  • Rüth J. et al. Digging into browser-based crypto mining

  • Zimba A. et al. Crypto mining attacks in information systems: an emerging threat to cyber security. J. Comput. Inf. Syst. (2018)

  • Ghosh U. et al. Towards secure software-defined networking integrated cyber-physical systems: Attacks and countermeasures

  • Madhan E. et al. An improved communications in cyber physical system architecture, protocols and applications

  • Rathee G. et al. A blockchain framework for securing connected and autonomous vehicles. Sensors (2019)