
1 Introduction

Big data is the foundation of machine learning, and many financial companies, including most banks, value the power of data. In 2015, the global card fraud rate was 7.76 BP (the amount of fraud per 10,000 dollars of transactions). The UK anti-fraud agency CIFAS reported that it and its affiliates had prevented a total loss of about 1 billion pounds by using 325,000 records and related algorithms. In China, many banks and financial enterprises lose billions of Yuan per year to financial fraud. These are only a few examples from financial fraud prediction, but they suggest that enormous economic losses could be prevented if forecasts of financial behaviors were deployed in more institutions.

As the figures above suggest, fraud detection is an important application of predicting financial behaviors [1]. Fraud events are detected from real, huge datasets that record different information about clients and their transactions. Researchers usually sort these behaviors [2] into different categories and then solve the corresponding problems. Common tasks in financial fraud detection include outlier detection [3, 4], clustering, and regression. With the increase in fraudulent behavior, more approaches have been used to identify fraud events. Data mining [5] is a necessary technology in fraud detection; in the early stage, data mining with PCA [6], SVM [7] and other methods made progress in this field. Meanwhile, statistical models such as Naive Bayes [8,9,10], belief stage [11] and logistic models [12] appeared in real applications. Since 2000, with the rapid development of computer technology and the explosive growth of data volume, unsupervised algorithms [13, 14] and neural networks [15, 16], including fuzzy neural networks [17], have attracted attention.

Especially in recent years, deep learning [18, 19] has been widely used for forecasting and has produced many research achievements in academia. At the same time, industry and academia have also combined [20] statistical models and deep learning to achieve better results.

To give a clearer illustration, we compare some of the main methods in Fig. 1.

Fig. 1. Algorithms summary

In this study, we are motivated by the fact that citizens' deposits in banks are usually used by financial institutions for centralized investment, so it is necessary for banks to predict these potential customers. We design an efficient algorithm for deposit prediction based on an existing real dataset (Fig. 2).

Fig. 2. Integrated network

Our contributions are as follows:

(1) Compared with other studies, the algorithm in this paper takes different data attributes into account and adopts a different encoding method for each of them.

(2) We constrain the GAN to make it more robust, especially when the distributions of positive and negative samples are approximately the same.

2 Related Work

Compared with earlier data volumes, today's datasets are huge, so people cannot label data as before, and the useful labeled positive samples make up a very small part of the whole dataset. The network therefore has to learn much useless information instead of useful features. Meanwhile, the study of Generative Adversarial Nets (GAN) [21] provides a new way to exploit the power of data; following its applications in image processing [22], we can further use it in other fields.

Data attributes strongly influence the subsequent procedure: they determine which classifier to use and indicate how features should be learned. This pushes us to think carefully about how to process the original data before passing it to the classifiers without distorting those attributes. Data encoding is a standard way to represent raw data, and one-hot encoding [23] has proved to be an efficient way to preserve data attributes in machine learning. However, relying on only one encoding method is unreliable, which led us to draw inspiration from some popular algorithms [24] in deep learning. Through observation, we find that the data can always be divided into two categories: data with no correlation between values, which we call objective data, and index data, whose numerical attributes must be considered.

Hybrid Encoding: Financial datasets contain not only objective data, such as occupations and ages, but also index data of the financial industry. Normalizing all of these data into the same format hurts the final classification, so we propose a hybrid encoding method to avoid this problem. One-hot encoding is used for the objective data. In early research, this method was often used in encodings related to FPGAs [23, 25]; because of its convenient representation and small number of bytes, logic circuit programming also prefers one-hot encoding. Recently, it has also been used in machine learning [26]. This encoding maintains data independence for the representation of discrete (objective) data. The index data are normalized into integers, because continuous (index) data are often correlated with each other. The advantage of hybrid encoding is that it takes full account of the different attributes of the data, rather than forcing them all into one representation.

Constrained GAN: GANs have been successfully applied to images [27]. An NVIDIA group [28] used a GAN to generate high-quality human face pictures that are very similar to real faces. This technology has developed quite well and is constantly applied to new applications; fake pictures made by GANs are hard for human eyes to distinguish today. GANs essentially use the KL divergence for optimization. However, we find that the distributions of positive and negative samples in our dataset are very similar, so we introduce the Mahalanobis distance (MD) [29] as a constraint in the GAN. With the help of MD, the newly generated positive samples are closer to the original data.

Integrated Classifiers: The success of AdaBoost [30] shows that a combination of weak classifiers can become a strong classifier. In fact, the idea of model fusion comes largely from industry, which often combines the different advantages of multiple models to integrate a better-performing model [31]; this often yields better experimental results. In practical applications, however, it is not so simple to apply: one still needs to select the better-performing classifiers and combine them according to voting rules, and the important thing in this process is to drop the classifiers with the worst results.

3 Designed Method

3.1 Hybrid Encoding

STEP 1: Our work first distinguishes whether the data are discrete (objective) or continuous (index). Discrete data are encoded with one-hot style binary codes. Two-bit binary numbers can represent four different combinations; since no feature in our dataset has more than nine sub-categories, 4-bit binary digits meet the requirement. Every main feature is encoded with four bits, even if it contains only a few categories. We process the data this way because it prevents differences in the number of binary digits between categories from affecting the classification result. Experiments show that this uniform number of digits ultimately improves accuracy by 1%-2% and does not cause a dimension disaster.

STEP 2: The continuous data are unified into numerical values within the range 0 to 10: the maximum occurring value is assigned 10, and the minimum is assigned 0.

STEP 3: After the data have been formed according to the above encoding methods, the one-hot codes are placed in the high positions and the continuous values in the low positions to form a single data sequence, as shown in Table 1. A minimal encoding sketch is given after the table.

Table 1. The format of hybrid encoding
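As a concrete illustration of STEP 1 to STEP 3, the following Python sketch encodes one record. The feature names, category lists and value ranges are invented for illustration and are not taken from the dataset used in this paper.

```python
# Minimal sketch of the hybrid encoding of Sect. 3.1.
# Feature names, category lists and value ranges are illustrative assumptions.

CATEGORIES = {                          # objective (discrete) features
    "occupation": ["admin", "technician", "services", "management"],
    "education":  ["primary", "secondary", "tertiary"],
}
INDEX_RANGES = {                        # index (continuous) features: (min, max)
    "balance": (0.0, 100000.0),
    "euribor": (0.5, 5.0),
}

def encode_discrete(feature, value):
    """STEP 1: encode a category index as a fixed-width 4-bit binary string."""
    idx = CATEGORIES[feature].index(value)
    return format(idx, "04b")           # 4 bits cover up to 16 sub-categories

def encode_index(feature, value):
    """STEP 2: scale a continuous value onto the integers 0..10."""
    lo, hi = INDEX_RANGES[feature]
    return round(10 * (value - lo) / (hi - lo))

def hybrid_encode(record):
    """STEP 3: binary codes in the high positions, scaled integers in the low positions."""
    high = [encode_discrete(f, record[f]) for f in CATEGORIES]
    low = [str(encode_index(f, record[f])) for f in INDEX_RANGES]
    return high + low

if __name__ == "__main__":
    sample = {"occupation": "services", "education": "tertiary",
              "balance": 2500.0, "euribor": 1.3}
    print(hybrid_encode(sample))        # e.g. ['0010', '0010', '0', '2']
```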

3.2 Constrained GAN

GANs are often used in image classification and video tracking, but in the financial-data scenario we can still turn the data into corresponding pixel values. Even if each individual value is a meaningless pixel, the resulting pictures are statistically significant. In this case, the positive samples are converted into corresponding pixel values [22] to form pictures, which are fed into the new network for training to produce new pictures; the generated pictures are then converted back into data. This process completes the enrichment of the positive samples; a conversion sketch is given below.
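The following sketch shows one way to map encoded records onto the \( 20 \times 13 \) grayscale tiles described in Sect. 4.3 and back again. The 0-255 value range and the tile size follow the paper, while the array shapes and dtype are illustrative assumptions.

```python
import numpy as np

# Sketch of converting encoded records to 20x13 "pictures" for the GAN and back.
# Tile size and 0..255 value range follow Sects. 3.2 and 4.3; the rest is assumed.

N_USERS_PER_TILE, N_FEATURES = 20, 13

def records_to_tiles(data):
    """data: (num_records, 13) array of encoded values already scaled into [0, 255]."""
    usable = (len(data) // N_USERS_PER_TILE) * N_USERS_PER_TILE
    tiles = data[:usable].reshape(-1, N_USERS_PER_TILE, N_FEATURES)
    return tiles.astype(np.uint8)              # one tile = one 20x13 picture

def tiles_to_records(tiles):
    """Inverse transform: flatten generated tiles back into data rows."""
    return tiles.reshape(-1, N_FEATURES).astype(np.float32)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fake_rows = rng.integers(0, 256, size=(100, N_FEATURES))
    tiles = records_to_tiles(fake_rows)        # -> (5, 20, 13) pictures for the GAN
    rows = tiles_to_records(tiles)             # -> (100, 13) rows after generation
    print(tiles.shape, rows.shape)
```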

The objective of optimization can be written as:

$$\begin{aligned} \min _G \max _D V(G, D)=\mathbb {E}_{x \sim P_{data}} [\log D(x)]+\mathbb {E}_{z \sim P_{z}} [\log (1-D(Z))] \quad \end{aligned}$$
(1)

To optimize the discriminator, we train it continuously so that it assigns the largest possible probability to the real data. We can rewrite Eq. (1) as:

$$\begin{aligned} V(G, D)=P_{data}(x)logD(x)+P_z(Z)log(1-D(Z)) \quad \end{aligned}$$
(2)

Then, the optimal discriminator \(D^\prime \) is:

$$\begin{aligned} D^\prime =\frac{P_{data}(x)}{P_{data}(x)+P_z(Z)} \quad \end{aligned}$$
(3)

Substituting (3) back into (2):

$$\begin{aligned} \begin{aligned} V(G,D^\prime )=\,&\mathbb {E}_{x \sim P_{data}}\left[ \log {\frac{P_{data}(x)}{P_{data}(x)+P_z(Z)}}\right] \\&+\mathbb {E}_{z \sim P_{z}}\left[ \log {\frac{P_{z}(Z)}{P_{data}(x)+P_z(Z)}}\right] \quad \end{aligned} \end{aligned}$$
(4)

The Kullback-Leibler (K-L) divergence [32] is used to measure the similarity between two probability distributions; it can be defined as:

$$\begin{aligned} D_{KL}(P \Vert Q)=\sum _{i=1}^N P(x_i)log\frac{P(x_i)}{Q(x_i)} \quad \end{aligned}$$
(5)

With the K-L divergence, we can rewrite (4) further as:

$$\begin{aligned} V(G,D^\prime )= & {} -2log2+KL(P_{data}(x) \vert \vert A)+KL(P_z(Z) \vert \vert A) \quad \end{aligned}$$
(6)
$$\begin{aligned} A= & {} \frac{P_{data}(x)+P_z(Z)}{2} \quad \end{aligned}$$
(7)

Now, our next goal is to minimize \( P_z(Z)\log (1-D(Z))\). According to the method of gradient descent, using the optimal discriminator \(D^*\) we obtain the following update, where \( {P_z}^\prime \) is the solution for G:

$$\begin{aligned} {P_z}^\prime \leftarrow (P_z - \eta {\partial V(G, D^*))} \quad \end{aligned}$$
(8)

The Mahalanobis distance (MD) is defined as:

$$\begin{aligned} D_{MD}^2= & {} (x-m)^T C^{-1} (x-m) \quad \end{aligned}$$
(9)
$$\begin{aligned} C= & {} (x-m)(x-m)^T \quad \end{aligned}$$
(10)

Here x represents the whole set of pattern vectors and m one specific vector within it. In this paper, \(P_{data}(x)\) replaces x and \(P_z(Z)\) replaces m. The constrained GAN objective is:

$$\begin{aligned} f_{goal}=V(G^* ,D^*)+\lambda (P_{data}(x)-P_z(Z))^T C^{-1} (P_{data}(x)-P_z(Z)) \quad \end{aligned}$$
(11)

Formula (11) is introduced into the GAN as the new objective function, from which newly generated data can be obtained. A minimal loss sketch is given below.
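Below is a minimal PyTorch-style sketch of a generator loss in the spirit of Eq. (11), combining the adversarial term of Eq. (1) with a Mahalanobis penalty between real and generated batches. The use of batch means, the covariance regularizer and the value of \( \lambda \) are illustrative assumptions rather than the exact training recipe.

```python
import torch

# Sketch of a constrained generator loss in the spirit of Eq. (11).
# Batch-mean statistics, eps-regularized inversion and lambda_md are assumptions.

def mahalanobis_penalty(real, fake, eps=1e-3):
    """Squared Mahalanobis distance between batch means of real and generated data."""
    diff = real.mean(dim=0) - fake.mean(dim=0)                 # shape (d,)
    centered = real - real.mean(dim=0, keepdim=True)            # shape (n, d)
    cov = centered.T @ centered / (real.shape[0] - 1)           # sample covariance C
    cov = cov + eps * torch.eye(real.shape[1], device=real.device)
    return diff @ torch.linalg.solve(cov, diff)                 # (x-m)^T C^{-1} (x-m)

def generator_loss(discriminator, real, fake, lambda_md=0.1):
    """Adversarial term of Eq. (1) plus the Mahalanobis constraint, weighted by lambda."""
    adv = torch.log(1.0 - discriminator(fake) + 1e-8).mean()    # minimized by the generator
    return adv + lambda_md * mahalanobis_penalty(real, fake)
```

In practice, `real` and `fake` would be the flattened \( 20 \times 13 \) tiles (or the encoded rows) within each training batch.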

3.3 Integrated Classifiers

In the experiments, we found that the positive and negative sample distributions of some attributes are very similar; this is why we choose different methods and integrate them into a stronger classifier. According to a recent paper [33], if the distributions of the two classes are similar, it is hard to find a way to separate them cleanly. A practical approach is to integrate different classifiers, so that different features can be exploited effectively by different classifiers.

Firstly, we tested some common classification algorithms, for example the Decision Tree and Random Forest models. Owing to their simplicity and their flexibility in handling multiple data attribute types, decision trees [34, 35] are widely used in classification problems. A Random Forest [36] is a combination of multiple tree predictors in which each tree depends on an independently sampled random dataset and all trees in the forest follow the same distribution.

Then, in order to better understand how weak classifiers are integrated into strong classifiers, we further tested the performance of boosting methods such as AdaBoost, XGBoost and Gradient Boosting. AdaBoost is the original scheme for integrating weak classifiers into a strong classifier. Unlike AdaBoost, Gradient Boosting [37] chooses the direction of gradient descent in each iteration to ensure that the final result is the best, so it can achieve high accuracy. XGBoost [38] implements machine learning algorithms under the Gradient Boosting framework; it is an optimized distributed gradient boosting library used by data scientists to achieve state-of-the-art results on many machine learning challenges. Compared with Gradient Boosting, XGBoost first adds a regularization term; furthermore, its loss function is approximated with a second-order Taylor expansion, whereas Gradient Boosting uses a first-order expansion, so the loss approximation is more accurate.
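For reference, the second-order objective used by XGBoost can be summarized as follows (our paraphrase of the standard formulation in [38]; the notation is not taken from this paper):

$$\begin{aligned} \mathcal {L}^{(t)} \approx \sum _{i=1}^{n}\left[ l(y_i,\hat{y}_i^{(t-1)}) + g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i)\right] + \Omega (f_t) \end{aligned}$$

where \(g_i\) and \(h_i\) are the first- and second-order gradients of the loss l with respect to the previous prediction \(\hat{y}_i^{(t-1)}\), \(f_t\) is the tree added at step t, and \(\Omega \) is the regularization term; Gradient Boosting keeps only the first-order \(g_i\) term.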

Last, due to the outstanding performance of the Multi-Layer Perceptron (MLP) [39] and Naive Bayes in classification tasks, these two algorithms were also included in our candidate list. As the initial model of neural networks, the MLP has been used in a variety of applications, and the performance of some statistical models can be improved by neural networks. Naive Bayes [40] is a set of supervised learning algorithms based on Bayes' theorem under the assumption that each pair of features is independent. Although the assumption is simple, the Naive Bayes classifier works well on many real classification problems, such as document classification and spam filtering, and it needs only a small amount of training data to estimate the necessary parameters. Linear Regression (LR) [41, 42] and Stochastic Gradient Descent (SGD) [43] were also added to this framework; however, their performance was not satisfactory.

Table 2. Recall of different classifiers in predicting positive samples. The ratio in Table 2 represents the ratio of positive samples to negative samples.

In practice, we found an interesting phenomenon: cascading more classifiers does not always yield a better result. Because some classifiers have a negative impact on the results, we had to discard those with poor results, which is why we finally chose 4 of the 9 candidate classifiers.

After comprehensive measurement, we chose four final classifiers: Decision Tree, Gradient Boosting (GB), XGBoost and Naive Bayes. The voting rule is defined as:

$$\begin{aligned} \mathbb {P}(x)=\omega _1 * P_{Decisiontree}+\omega _2 * P_{Naive Bayes}+\omega _3 * P_{GB}+\omega _4 * P_{XGBoost} \end{aligned}$$
(12)

In Eq. (12), \( \omega \) denotes a weight learned in the final training, restricted to [0, 1], and P represents the probability predicted by the corresponding classifier.

One question needs explanation here: GB and XGBoost are very similar, so why keep both methods? In our experiments, the overall results after deleting GB were not better than those obtained by keeping it, so we chose to retain both. A sketch of the voting rule is given below.
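The following sketch implements the soft-voting rule of Eq. (12), assuming the four classifiers are already trained and expose scikit-learn-style `predict_proba`; the weight values and the 0.5 threshold are placeholders, not the ones learned in our experiments.

```python
import numpy as np

# Sketch of the weighted soft-voting rule in Eq. (12).
# Weights and the decision threshold are placeholders, not learned values.

def ensemble_proba(classifiers, weights, X):
    """Weighted sum of each classifier's positive-class probability."""
    probs = [clf.predict_proba(X)[:, 1] for clf in classifiers]   # P_i(x)
    return sum(w * p for w, p in zip(weights, probs))

def ensemble_predict(classifiers, weights, X, threshold=0.5):
    """Label a customer as positive when the normalized weighted score passes the threshold."""
    score = ensemble_proba(classifiers, weights, X) / sum(weights)
    return (score >= threshold).astype(int)
```

Here `classifiers` could be, for instance, a fitted `DecisionTreeClassifier`, `GaussianNB`, `GradientBoostingClassifier` and `XGBClassifier`, and `weights` four values in [0, 1] obtained during the final training.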

3.4 Algorithm Framework

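To show how the components of Sects. 3.1-3.3 fit together, the following end-to-end sketch runs the framework on invented data. The duplication-based enrichment is only a stand-in for the constrained GAN, XGBoost is omitted to keep the sketch dependency-free, and the equal voting weights are placeholders.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
X = rng.integers(0, 11, size=(1000, 13)).astype(float)    # stand-in for hybrid-encoded rows
y = (rng.random(1000) < 0.1).astype(int)                   # ~10% positive labels

# (1) Enrich positive samples: simple duplication here, standing in for the
#     constrained GAN of Sect. 3.2.
pos = X[y == 1]
X_aug = np.vstack([X, pos, pos])
y_aug = np.concatenate([y, np.ones(2 * len(pos), dtype=int)])

# (2) Train the selected classifiers on the enriched data.
models = [DecisionTreeClassifier(random_state=0), GaussianNB(),
          GradientBoostingClassifier(random_state=0)]
for m in models:
    m.fit(X_aug, y_aug)

# (3) Soft-vote with equal placeholder weights, cf. Eq. (12).
proba = np.mean([m.predict_proba(X)[:, 1] for m in models], axis=0)
print("predicted positives:", int((proba >= 0.5).sum()))
```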

4 Experiments

4.1 Data Analysis

This dataset contains some basic information about customers, including ages, occupations, education, and some macroeconomic indicators of the current period. There are 20 feature components in total and about 40,000 customer records, of which more than 4,000 are marked as positive samples. The statistical results show that the distributions of positive and negative samples are very similar for many feature components (Fig. 3); this observation naturally leads us to introduce a constrained GAN. We also found that some features have no correlation with the result, and these were discarded as noise in our subsequent processing.

Finally, we chose items 1, 2, 3, 4, 6, 7, 8, 9, 10, 12, 14, 15 and 16 from the original dataset as the final feature components.

Fig. 3. Part of the data distributions

4.2 The Result of Different Encoding Methods

In machine learning, recall and precision are contradictory indicators. In this problem, we focus on how many of the customers that institutions want to find are picked out correctly by our algorithm, so we use recall to evaluate the models.

$$\begin{aligned} Recall=\frac{TP}{TP+FN} \end{aligned}$$
(13)

where TP represents True Positives and FN False Negatives. Four training sets are built by dividing the positive and negative samples at ratios of 1:1 (4k:4k), 10:1 (10k:1k), 20:1 (20k:1k) and 30:1 (30k:1k); the \(X\)-axis represents the result of dividing the data at these proportions. The validation set consists of 200 positive samples and 800 negative samples. A sketch of this split construction is given below.
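The sketch below builds the four imbalanced training sets and the fixed validation set. The placeholder arrays, the assumption that the larger count in each ratio is the negative class, and sampling without replacement are illustrative choices, not details taken from the paper.

```python
import numpy as np

# Sketch of the experimental splits; placeholder data, and the larger count in
# each ratio is assumed to be the negative class.

def make_split(pos, neg, n_pos, n_neg, rng):
    """Draw n_pos positive and n_neg negative rows and stack them with labels."""
    X = np.vstack([pos[rng.choice(len(pos), n_pos, replace=False)],
                   neg[rng.choice(len(neg), n_neg, replace=False)]])
    y = np.concatenate([np.ones(n_pos, dtype=int), np.zeros(n_neg, dtype=int)])
    return X, y

rng = np.random.default_rng(0)
pos = rng.random((4500, 13))                       # placeholder positive rows
neg = rng.random((36000, 13))                      # placeholder negative rows
ratios = {"1:1": (4000, 4000), "10:1": (1000, 10000),
          "20:1": (1000, 20000), "30:1": (1000, 30000)}
train_sets = {k: make_split(pos, neg, p, n, rng) for k, (p, n) in ratios.items()}
X_val, y_val = make_split(pos, neg, 200, 800, rng)
```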

We evaluated the performance of different encodings: pure binary coding (all data encoded in one-hot mode), pure numerical coding (all data encoded as integers), and the hybrid encoding proposed by us. This experiment concentrates on the samples labeled 1.

Fig. 4. Processing data in image way

We can see that different coding methods have different effects on classification recognition, and the hybrid encoding method is clearly superior to the others. At the same time, as the proportion of negative samples to positive samples increases, the accuracy decreases rapidly.

4.3 Generating Positive Samples

(a) Firstly, the dataset is divided into two sets of positive and negative samples. Then, according to the method in Sect. 3, both the objective data and the index data are converted into integer values. All values are constrained to the range 0 to 255 and the data are stored as matrices; in this way, two matrix graphs are obtained.

(b) The two matrix graphs are cut into several pictures of size \( 20 \times 13 \).

(c) The pictures produced in (b) are sent to the constrained GAN to generate new images.

(d) All generated pictures are inversely transformed according to the process in step (a) to obtain new data (Fig. 5).

Fig. 5. Processing data in image way

We show the transformed data and a newly generated picture in Fig. 4. This paper uses a size of \( 20 \times 13 \) to display the data: 20 means that 20 users are used in one training pass, and 13 is the number of feature components; the last column represents the label.

4.4 The Constrained GAN

The previous experiments show that:

(a) Unbalanced samples have a negative impact on the final recognition; an order-of-magnitude difference between positive and negative samples can cause a rapid decline in accuracy. Integrated classifiers do improve classification performance.

(b) The encoding method influences the final classification results.

Next, in order to test the performance of enriching positive samples with the constrained GAN, we increased the number of positive samples from 4,000 to 10,000 and kept the ratio of positive to negative samples at 1:1 (10k:10k) (Tables 3 and 4).

Table 3. Result of enriching positive samples
Table 4. F1 Score of enriching positive samples

After enriching the positive samples, the recall and F1 score both improved, which indicates that our method is effective. However, we want to emphasize that the number of positive samples should not be increased indefinitely (the proportion of positive and negative samples should stay consistent with the real situation; in this article we also keep the positive-to-negative ratio at 1:4). Enriching positive samples can raise the recall when the proportion of positive and negative samples is extremely unbalanced, but if the number of positive samples is increased without restriction, it will only lead to inadequate learning of the negative samples; even if the accuracy on positive samples improves, it becomes a meaningless numbers game.

5 Conclusion

In this article, we proposed a hybrid encoding and a sample enrichment method, and experiments show that the proposed methods achieve an obvious performance improvement. First, data with different attributes are transformed into different codes; then positive samples are enriched using a constrained GAN, which avoids producing false data when the distributions of positive and negative samples are similar. Finally, we select some stable, reliable classifiers with good performance through experiments and use them to integrate a soft classifier. Our method works well on the dataset discussed in this paper.