An Identification Method of Untrusted Interactive Behavior in ERP System Based on Markov Chain

Xu, Mengyao; Yi, Qian; Yi, Shuping; Xiong, Shiquan

doi:10.1007/978-3-030-22351-9_14

Conference paper
First Online: 12 June 2019

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11594)

Abstract

Enterprise Resource Planning (ERP) software systems are widely used in enterprises as advanced management systems. In recent years, the information security of ERP software systems has gradually attracted attention. To address it, we first need to pay attention to untrusted interactive behavior in the ERP software system. Enterprise network users generate a large amount of interactive behavior while using an ERP system, and untrusted interactive behavior can cause huge damage to the enterprise if it is not identified. This paper therefore proposes a method based on the Markov chain to identify untrusted interactive behavior of users in an ERP system. Firstly, a series of network user behavior characteristics is constructed from the log records of the ERP system. Then, a hidden Markov model is used to model the behavior of trusted users based on these characteristics. Next, the forward algorithm is used to calculate the probabilities of the observation sequences of trusted and untrusted users under the hidden Markov model of trusted users. Finally, untrusted users are identified by comparing the observation sequence probability sets of trusted and untrusted users. The recognition rate of the model for trusted users is 92.64%, and the false positive rate for untrusted users is 0.76%. This result indicates that the model is effective for identifying untrusted interactive behavior.

1 Introduction

With the popularization and development of enterprise information management, ERP systems are widely used in enterprises as advanced management systems. Enterprise users face many information security issues while enjoying the convenience of ERP systems. The US Department of Homeland Security (DHS) issued a security alert stating that nation-state hackers and criminals are increasingly attacking ERP systems, and that evidence had been found of the Dridex Trojan attacking a bank's ERP system, which brought huge losses to the bank.

One of the methods that can be used to protect information security in ERP systems is intrusion detection. The anomaly detection approach [1] is a key element of intrusion detection: it evaluates the behavior of a user or system and treats intrusive or irregular activities as deviations from normal patterns. The core of the method is how to decide whether the current behavior is abnormal (intrusive or irregular).

Currently, traditional studies on identifying the abnormal behavior of network users fall into two categories. One uses network traffic as the characteristic [2,3,4,5,6]. Jain et al. [2] identify abnormal behavior by identifying abnormal network traffic. The other uses packets as the characteristic [7,8,9,10,11]. Lee et al. [7] identify abnormal behavior by monitoring whether packets are abnormal. However, the characteristics they choose depend heavily on the computer, and the abnormal behavior identified through these characteristics does not necessarily correspond to the abnormal behavior of users in real life. Therefore, the credibility of the abnormal behavior identified by these methods is open to question.

Trust is typically interpreted as a subjective belief in the reliability, honesty and security of an entity on which we depend for our welfare [12]; such entities include software, hardware, data, people and organizations. Numerous researchers have conceptualized trust as a behavior, which has been validated in work collaboration and social communications [13]. On the one hand, behavior-based trust models are widely used in e-commerce sites to help consumers assess the quality of their products. Cao et al. [14] studied the trusted third party (TTP) in Australian business and examined the factors influencing consumers' trust behavior from the perspective of consumers' trust in online shopping. Kaur et al. [15] proposed a model to discern the impact of trust factors in the Indian e-commerce marketplace on customers' intention to purchase from an e-store. On the other hand, behavior-based trust models are widely used in software systems to ensure their information security. There are two main models in this field. One evaluates the factors affecting trust by using continuous or discrete real numbers (trust values) [16,17,18,19]. Hosseini et al. [16] proposed a way to measure the user's behavioral credibility by scoring the user's behavior, and also proposed that the score of a repeated malicious behavior should be lower than that of the first malicious behavior. Determining whether behavior is credible by calculating a trust value has a drawback: some untrusted interactive behaviors are preset when the evaluation system is established, so untrusted interactive behaviors that are not preset cannot be detected. The other models trusted users by continually optimizing the characteristic framework that describes trusted behaviors [20,21,22]. Yan et al. [20] built a behavioral characteristic framework of trusted users based on computer trust and the interaction intention between human and computer. However, these articles are purely theoretical studies and have not been explored further with actual data.

In this paper, we study how to identify untrusted interactive behavior in ERP software systems based on human factors. Trusted interaction is defined as a predictable and controllable process of information transformation through a computer network in which people and computers work together effectively. Trusted interactive behavior, in turn, refers to behavior that is consistent with an individual's behavioral habits as reflected in network operations. Hence, we establish a behavioral model of a trusted user by selecting characteristics that reflect individual behavioral habits, and then identify untrusted interactive behavior on this basis.

The remainder of this paper is organized as follows. The background is described in Sect. 2. The method for identifying untrusted interactive behavior is described in Sect. 3, and an example is given in Sect. 4. Finally, concluding remarks are given in Sect. 5.

2 Background

Hidden Markov Model (HMM) has a wide range of applications in the field of pattern recognition.

2.1 The Concept of Hidden Markov Model (HMM)

HMM is a probabilistic model of time series. It describes a process in which a hidden Markov chain randomly generates an unobservable sequence of states, and each state then generates an observation, yielding the observation sequence.

HMM is a doubly stochastic process. One process is a Markov chain, which describes the transitions between states. The other describes the statistical relationship between each state and the observations it produces [23].

HMM has two basic assumptions:

(1) The state of the hidden Markov chain at any time t depends only on the state at the previous moment, regardless of the states and observations at other times, and is independent of the time t.
(2) The observation at any time depends only on the state of the Markov chain at that moment, independent of other observations and states.

2.2 The Parameters of HMM

An HMM is characterized by the following:

(1) Q is the set of all possible states, and V is the set of all possible observations;
(2) I is a state sequence of length T, and O is the corresponding observation sequence;
(3) N is the number of possible states, and M is the number of possible observations;
(4) A is the state transition probability distribution:

$$ A = \left\{ a_{ij} \right\}, \quad a_{ij} = P\left[ q_{t+1} = j \mid q_{t} = i \right], \quad 1 \le i, j \le N \tag{1} $$

(5) B is the observation symbol probability distribution in state j:

$$ B = \left\{ b_{j}(v_{k}) \right\}, \quad b_{j}(v_{k}) = P\left[ o_{t} = v_{k} \mid q_{t} = j \right], \quad 1 \le j \le N, \; 1 \le k \le M \tag{2} $$

(6) π is the initial state distribution:

$$ \pi = \left\{ \pi_{i} \right\}, \quad \pi_{i} = P\left[ q_{1} = i \right], \quad 1 \le i \le N \tag{3} $$

For convenience, we usually use a compact notation λ = (A, B, π) to indicate the complete parameter set of an HMM.
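
To make the parameter set concrete, the following minimal sketch stores λ = (A, B, π) and checks the shapes and row sums implied by Eqs. (1) to (3). The class name `HMMParams` and the toy numbers are illustrative assumptions and are not taken from the experiment.

```python
from dataclasses import dataclass
from math import isclose
from typing import List


@dataclass(frozen=True)
class HMMParams:
    """Hypothetical container for lambda = (A, B, pi) of a discrete HMM."""
    A: List[List[float]]   # A[i][j] = P(q_{t+1} = j | q_t = i), N x N
    B: List[List[float]]   # B[j][k] = P(o_t = v_k | q_t = j),  N x M
    pi: List[float]        # pi[i]   = P(q_1 = i),              length N

    def __post_init__(self) -> None:
        n = len(self.pi)
        if len(self.A) != n or any(len(row) != n for row in self.A):
            raise ValueError("A must be an N x N matrix")
        if len(self.B) != n or len({len(row) for row in self.B}) != 1:
            raise ValueError("B must be an N x M matrix")
        # Every row of A and B, and pi itself, must be a probability distribution.
        for name, rows in (("A", self.A), ("B", self.B), ("pi", [self.pi])):
            for row in rows:
                if any(p < 0 for p in row) or not isclose(sum(row), 1.0, abs_tol=1e-9):
                    raise ValueError(f"each row of {name} must sum to 1")

    @property
    def n_states(self) -> int:    # N in the text
        return len(self.pi)

    @property
    def n_symbols(self) -> int:   # M in the text
        return len(self.B[0])


# Toy model with N = 2 states and M = 3 observation symbols (made-up values).
toy = HMMParams(
    A=[[0.7, 0.3], [0.4, 0.6]],
    B=[[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]],
    pi=[0.6, 0.4],
)
```

Validating the rows at construction time catches a malformed matrix before any probability is computed from it.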

2.3 Three Algorithms of HMM

Three algorithms are commonly used with an HMM:

(1) Forward or backward algorithm. Given the model λ = (A, B, π) and the observation sequence O = (o₁, o₂, …, o_T), calculate the probability P(O | λ) that the sequence O occurs under the model λ.
(2) Baum-Welch algorithm. Given the observation sequence O = (o₁, o₂, …, o_T), estimate the parameters of the model λ = (A, B, π) so that the observation sequence probability P(O | λ) is maximized under this model.
(3) Viterbi algorithm. Given the model and the observation sequence, find the most likely corresponding state sequence.
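
Since the forward algorithm is the one used later in the method, a short sketch of it may help. The function name `forward_probability`, the integer encoding of observations and the three-state toy parameters are assumptions made for illustration only.

```python
from typing import List, Sequence


def forward_probability(
    A: Sequence[Sequence[float]],
    B: Sequence[Sequence[float]],
    pi: Sequence[float],
    obs: Sequence[int],
) -> float:
    """Return P(O | lambda) for a discrete HMM using the forward recursion.

    obs holds observation indices k, so B[j][obs[t]] plays the role of b_j(o_t).
    """
    if not obs:
        raise ValueError("the observation sequence must not be empty")
    n = len(pi)
    # Initialisation: alpha_1(i) = pi_i * b_i(o_1)
    alpha: List[float] = [pi[i] * B[i][obs[0]] for i in range(n)]
    # Recursion: alpha_{t+1}(j) = (sum_i alpha_t(i) * a_ij) * b_j(o_{t+1})
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
            for j in range(n)
        ]
    # Termination: P(O | lambda) = sum_i alpha_T(i)
    return sum(alpha)


# Toy three-state, two-symbol model (illustrative values only).
A = [[0.5, 0.2, 0.3], [0.3, 0.5, 0.2], [0.2, 0.3, 0.5]]
B = [[0.5, 0.5], [0.4, 0.6], [0.7, 0.3]]
pi = [0.2, 0.4, 0.4]
p = forward_probability(A, B, pi, [0, 1, 0])   # approximately 0.130218
```

The recursion costs on the order of N²T operations, whereas summing over every possible state path would grow exponentially with T.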

3 Method

In our research, we collect the characteristic data of each trusted interactive behavior of the user and use the hidden Markov model to establish each user's network behavior pattern. Then, each user's online behavior is matched against that user's network behavior pattern. Behavior that fails to match is considered untrusted interactive behavior.

3.1 Data Collection and Preparation

In this paper, the data come from the back-end log of a publishing company. This log contains the operation records left by all users of the company when they use the ERP software system. The characteristics of the log records are as follows (Table 1):

Table 1. The characteristics of the log record

When preprocessing the data, we first extract the complete operation record of the required user by operator name. Secondly, as defined above, trusted interactive behavior is behavior that is consistent with an individual's behavioral habits as reflected in network operations. A single operation cannot adequately describe the user's operating habits, whereas a series of operations usually can. Therefore, ten consecutive operations of the user are treated as one unit, and the next unit is obtained by moving down one operation from the previous unit. Finally, we need to determine which characteristics are selected to describe the behavior patterns of trusted users.
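
The unit construction described above can be sketched as follows. The helper name `sliding_units` is hypothetical, and the integers stand in for log records, whose actual fields are not reproduced here.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sliding_units(
    operations: Sequence[T], unit_size: int = 10, step: int = 1
) -> List[Sequence[T]]:
    """Group consecutive operations into overlapping units.

    Each unit holds `unit_size` operations and the next unit starts
    `step` operations later (ten and one in the text).
    """
    if unit_size <= 0 or step <= 0:
        raise ValueError("unit_size and step must be positive")
    last_start = len(operations) - unit_size
    return [operations[i:i + unit_size] for i in range(0, last_start + 1, step)]


# Twelve placeholder operations yield three overlapping units of ten.
ops = list(range(12))
units = sliding_units(ops)
```

With a step of one, a log of n operations produces n − 9 units, so consecutive units share nine operations.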

3.2 The Selection of Characteristics

Based on the user’s behavioral habits, six characteristics are chosen to describe the behavior patterns of trusted users (Table 2).

Table 2. Selected characteristics

The number of IPs shows whether the user tends to use the same IP for a long time while working or prefers to change it frequently. The enter button & function represents the order of operations. The time accumulation for each operation reflects the speed of the user's operations. The operating time period reflects the user's work schedule. The time difference between consecutive operations reflects the user's attitude towards work (for example, delayed or timely processing). The combination of operation types represents the character of the individual.

An example illustrates the meaning of the characteristics. Suppose one unit of a user's operations is (1, 23, 3, 4, 1, 10). This means that the user uses only one IP to perform this group of operations; the operation sequence number is 23; a total of 3 s is spent performing this set of operations; the accumulated time difference between consecutive operations is 6 min to 8 min; the operating time period is 9:00–9:30; and the operation type combination number is 10 (5 business operations, 2 function operations, 3 business operations).

The relevant original record table displayed in Chinese is shown below (Fig. 1).

3.3 The Model Parameters of Trusted Users

Untrusted interactive behavior is diverse, and we cannot characterize it fully. We therefore model the behavior of the user while the system is running normally, which means that every behavior of the user is trusted. The hidden Markov model built for the behavior of trusted users contains only two states: the trusted state, represented by 0, and the untrusted state, represented by 1. The number of observations is determined by the types of unit operation described in the previous section. Because the model is built while the system is running normally, the state transition matrix is A = $ \begin{bmatrix} 1 & 0 \\ 1 & 0 \end{bmatrix} $. That is, the transition probability from the trusted state to the trusted state and from the untrusted state to the trusted state is 1: regardless of the current state, the next state is the trusted state with probability 1. The observation probability matrix B is the probability distribution of the unit operations of the trusted user. The initial state probability vector is π = (1, 0). On this basis, the hidden Markov model of trusted user behavior is established.
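
It may help to write out what these parameters imply for the forward algorithm. The forward variable α_t(i) = P(o₁, …, o_t, q_t = i | λ) is standard notation introduced here for illustration and does not appear elsewhere in the text; states are indexed 0 (trusted) and 1 (untrusted) as in this section. With π = (1, 0) and both rows of A equal to (1, 0):

$$ \alpha_{1}(0) = \pi_{0}\, b_{0}(o_{1}) = b_{0}(o_{1}), \qquad \alpha_{1}(1) = \pi_{1}\, b_{1}(o_{1}) = 0 $$

$$ \alpha_{t+1}(0) = \left[ \alpha_{t}(0)\, a_{00} + \alpha_{t}(1)\, a_{10} \right] b_{0}(o_{t+1}) = \alpha_{t}(0)\, b_{0}(o_{t+1}), \qquad \alpha_{t+1}(1) = \left[ \alpha_{t}(0)\, a_{01} + \alpha_{t}(1)\, a_{11} \right] b_{1}(o_{t+1}) = 0 $$

$$ P(O \mid \lambda) = \alpha_{T}(0) + \alpha_{T}(1) = \prod_{t=1}^{T} b_{0}(o_{t}) $$

Under these parameters the chain never leaves the trusted state, so the probability of an observation sequence reduces to the product of the trusted-state observation probabilities, and the untrusted-state row of B does not affect the result.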

3.4 The Behavior Recognition of Untrusted Users

We set a fixed-size sliding window over the observation sequence; the window slides forward by one operation at a time. Next, the forward algorithm is used to calculate the observation sequence probability sets of trusted and untrusted user behavior under the hidden Markov model of trusted user behavior. Once we have the observation sequence probability set of trusted user behavior, we choose a relatively small value in that set as the decision threshold. An observation sequence whose probability exceeds the threshold is judged to be a behavior sequence of the trusted user; otherwise, it is judged to be a behavior sequence of an untrusted user.
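
The scoring and thresholding steps can be sketched as below. Several choices here are assumptions rather than details given in the text: the transition matrix of Sect. 3.3 keeps the chain in the trusted state, so a window's probability is taken to be the product of trusted-state observation probabilities; unseen units receive a small floor probability; natural logarithms are used; and the threshold is picked as a low quantile of the trusted scores. All function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def estimate_emissions(units: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Relative frequency of each unit operation in the trusted user's data."""
    counts = Counter(units)
    total = sum(counts.values())
    return {unit: n / total for unit, n in counts.items()}


def window_log_probs(
    units: Sequence[Hashable],
    emissions: Dict[Hashable, float],
    window: int = 3,
    floor: float = 1e-6,   # assumed probability for units never seen in training
) -> List[float]:
    """Log-probability of every run of `window` units, sliding by one unit."""
    scores = []
    for start in range(len(units) - window + 1):
        chunk = units[start:start + window]
        scores.append(sum(math.log(emissions.get(u, floor)) for u in chunk))
    return scores


def pick_threshold(trusted_scores: Sequence[float], quantile: float = 0.05) -> float:
    """Take a relatively small value of the trusted score set as threshold."""
    if not trusted_scores:
        raise ValueError("at least one trusted score is required")
    ordered = sorted(trusted_scores)
    return ordered[int(quantile * (len(ordered) - 1))]


def is_trusted(score: float, threshold: float) -> bool:
    """A window is judged trusted only if its score exceeds the threshold."""
    return score > threshold


# Placeholder unit labels; real units would be the six-characteristic tuples.
trusted_units = ["a", "a", "b", "a", "b", "a"]
emissions = estimate_emissions(trusted_units)
trusted_scores = window_log_probs(trusted_units, emissions)
threshold = pick_threshold(trusted_scores, quantile=0.0)
unknown_scores = window_log_probs(["z", "z", "z"], emissions)
```

Raising the quantile makes the detector stricter: fewer untrusted windows pass as trusted, at the cost of rejecting more trusted ones.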

4 Procedure

Two network users of the ERP system participated in the experiment. Both are from a publishing company in Chongqing, China. User A is defined as the trusted user, and user B is defined as the untrusted user.

4.1 Training Phase

The sequences of user A are used as the training set, so user A's hidden Markov model serves as the trusted user's hidden Markov model, whose structure is given in Sect. 3. The window size for user A's observation sequences is three. Using user A's 20,000 observation sequences as training data, we obtain the observation sequence probability set of user A. Because the probabilities of the observation sequences are very small, we work with their logarithms. The amount of data is too large to display in full, so Fig. 2 shows the observation sequence probabilities for only 1000 data points. From these results, the probability threshold of the observation sequence of user A is set to –6.389.

4.2 Test Phase

User A's remaining 5000 observation sequences are used as test data 1 to test the recognition rate of the model. The observation sequence probability set of user A's test data is shown in Fig. 3.

The test results show that the recognition rate of the model for trusted user behavior is 92.64%, and the false positive rate is 7.36%. That is, of the 4989 pieces of behavior data of the trusted user, 4622 are judged to be behavior data of a trusted user and 367 are judged to be behavior data of an untrusted user.

User B's 5000 observation sequences are used as test data 2 to test the false positive rate of the model. The observation sequence probability set of user B's test data is shown in Fig. 4.

The test results show that the recognition rate of the model for untrusted user behavior is 99.24%, and the false positive rate is 0.76%. That is, of the 4989 pieces of behavior data of the untrusted user, 4951 are judged to be behavior data of an untrusted user and 38 are judged to be behavior data of a trusted user.
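
The reported percentages follow directly from the counts above. The snippet below is only an arithmetic check, and the helper name `percent` is illustrative.

```python
def percent(count: int, total: int) -> float:
    """Share of `count` in `total`, as a percentage rounded to two decimals."""
    return round(100 * count / total, 2)


total = 4989                                     # windows per test set, as reported

trusted_recognised = percent(4622, total)        # user A judged trusted
trusted_rejected = percent(367, total)           # user A judged untrusted
untrusted_recognised = percent(4951, total)      # user B judged untrusted
untrusted_accepted = percent(38, total)          # user B judged trusted

print(trusted_recognised, trusted_rejected)      # 92.64 7.36
print(untrusted_recognised, untrusted_accepted)  # 99.24 0.76
```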

Identifying an untrusted user's behavior as trusted is more harmful than identifying a trusted user's behavior as untrusted. Therefore, when selecting the observation sequence probability threshold, we choose a relatively large threshold to ensure a lower false positive rate.

5 Conclusion

In this paper, we propose a method for identifying untrusted interactive behavior in human-computer interaction based on human behavioral habits. Firstly, we analyzed the current information security issues of ERP software systems and reviewed the current methods for solving them. Secondly, we argued that, to solve the information security problem of an ERP software system, untrusted interactive behavior in the system must first be identified, and we defined trusted interaction and trusted interactive behavior. Thirdly, we introduced the related concepts, parameters and algorithms of the hidden Markov model and used the model to describe the behavior of trusted users. Fourthly, we used the forward algorithm to calculate the observation sequence probability set of trusted user behavior and determined the probability threshold of the observation sequence. Finally, the recognition rate and false positive rate of the model were tested with two test sets.

In the experimental results, the recognition rate of our model for trusted user behavior is 92.64% and the false positive rate for untrusted user behavior is 0.76%. This shows that the model is effective for identifying untrusted interactive behavior. Moreover, our research provides a new way to identify untrusted interactive behavior, and the behavior we define as untrusted user behavior is closer to abnormal user behavior in real life.

Much work remains for the future. Firstly, the characteristic framework can be improved, for example by adding some computer-related characteristics or characteristics that capture the influence of the environment on interactive behavior in human-computer interaction. Secondly, only the simplest hidden Markov model is used here to model the behavior of trusted users; in future research, higher-order hidden Markov models can be considered. Finally, the influence of other factors on the experiment was not considered when selecting the experimental subjects, for example the influence of the subjects' occupation on their operating habits.

References

1. Carter, E.: Intrusion Detection Systems. Cisco Press, Indianapolis (2002)
2. Jain, R., Abouzakhar, N.: Hidden Markov model based anomaly intrusion detection. In: International Conference for Internet Technology & Secured Transactions. IEEE (2012)
3. Lee, D.C., et al.: Fast traffic anomalies detection using SNMP MIB correlation analysis. In: International Conference on Advanced Communication Technology. IEEE (2009)
4. Huang, S.Y., Huang, Y.N.: Network traffic anomaly detection based on growing hierarchical SOM. In: IEEE/IFIP International Conference on Dependable Systems & Networks. IEEE Computer Society (2013)
5. Yan, G.: Network anomaly traffic detection method based on support vector machine. In: International Conference on Smart City & Systems Engineering. IEEE (2017)
6. Yu, Q., Gu, X.: Network traffic anomaly detection based on dynamic programming. In: International Conference on Computing Intelligence & Information System. IEEE Computer Society (2017)
7. Lee, S., Shin, S.-H., Roh, B.-h.: Abnormal behavior-based detection of Shodan and Censys-like scanning. In: Ninth International Conference on Ubiquitous and Future Networks (ICUFN). IEEE (2017)
8. Garg, A., Maheshwari, P.: PHAD: packet header anomaly detection. In: International Conference on Intelligent Systems & Control. IEEE (2016)
9. Wang, K., Kim, H.S.: PCAD: cloud performance anomaly detection with data packet counts. In: IEEE International Conference on Cloud Computing Technology & Science. IEEE Computer Society (2017)
10. Uyyala, S., Naik, D.: Anomaly based intrusion detection of packet dropping attacks in mobile ad-hoc networks. In: International Conference on Control. IEEE (2014)
11. Caulkins, B.D., Lee, J., Wang, M.: Packet- vs. session-based modeling for intrusion detection systems. In: International Conference on Information Technology: Coding & Computing. IEEE (2005)
12. Jøsang, A.: Identity management and trusted interaction in Internet and mobile computing. IET Inf. Secur. 8(2), 67–79 (2014)
13. Anderson, J.C., Narus, J.A.: A model of distributor firm and manufacturer firm working partnerships. J. Mark. 54(1), 42–58 (1990)
14. Cao, C., Yan, J., Li, M.: The effects of consumer perceived different service of trusted third party on trust intention: an empirical study in Australia. In: IEEE 14th International Conference on e-Business Engineering (ICEBE). IEEE (2017)
15. Kaur, B., Madan, S.: A fuzzy expert system to evaluate customer's trust in B2C E-commerce websites. In: International Conference on Computing for Sustainable Global Development. IEEE (2014)
16. Hosseini, S.B., Shojaee, A., Agheli, N.: A new method for evaluating cloud computing user behavior trust. In: Information & Knowledge Technology. IEEE (2015)
17. Yang, X., Liu, L., Zou, R.: A statistical user-behavior trust evaluation algorithm based on cloud model. In: International Conference on Computer Sciences & Convergence Information Technology. IEEE (2012)
18. Ma, J., Zhang, Y.: Research on trusted evaluation method of user behavior based on AHP algorithm. In: International Conference on Information Technology in Medicine & Education. IEEE (2015)
19. Jiang, W., Guo, S., Chen, W.: A trust evaluation model and algorithm based on network behavior detection. In: IEEE International Conference on Broadband Network & Multimedia Technology. IEEE (2011)
20. Yan, Z., Kantola, R., Zhang, P.: Theoretical issues in the study of trust in human-computer interaction. In: IEEE International Conference on Trust. IEEE (2012)
21. Liu, W., Ci, L., Liu, L.: Research on behavior trust based on Bayesian inference in trusted computing networks. In: IEEE International Conference on Smart City/SocialCom/SustainCom. IEEE (2016)
22. Yan, Z., Kantola, R., Zhang, P.: A research model for human-computer trust interaction. In: IEEE International Conference on Trust. IEEE Computer Society (2011)
23. Jiang, X.: A facial expression recognition model based on HMM. In: International Conference on Electronic & Mechanical Engineering & Information Technology. IEEE (2011)

Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grant No. 71671020.

Author information

Authors and Affiliations

Department of Industrial Engineering, Chongqing University, Chongqing, 400044, China
Mengyao Xu, Shuping Yi & Shiquan Xiong
Department of Mechanical Design and Manufacturing, Chongqing University, Chongqing, 400044, China
Qian Yi

Corresponding author

Correspondence to Qian Yi.

Editor information

Editors and Affiliations

San Jose State University, San Jose, CA, USA
Abbas Moallem

Cite this paper

Xu, M., Yi, Q., Yi, S., Xiong, S. (2019). An Identification Method of Untrusted Interactive Behavior in ERP System Based on Markov Chain. In: Moallem, A. (eds) HCI for Cybersecurity, Privacy and Trust. HCII 2019. Lecture Notes in Computer Science, vol 11594. Springer, Cham. https://doi.org/10.1007/978-3-030-22351-9_14

DOI: https://doi.org/10.1007/978-3-030-22351-9_14
Published: 12 June 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-22350-2
Online ISBN: 978-3-030-22351-9
eBook Packages: Computer Science, Computer Science (R0)
