Abstract
A novel hybrid approach between clustering methods and autoencoders (AEs) is introduced for detecting network anomalies in a semi-supervised manner. A previous work has developed regularized AEs, namely Shrink AE (SAE) and Dirac Delta Variational AE (DVAE) that learn to represent normal data into a very small region being close to the origin in their middle hidden layers (latent representation). This work based on the assumption that normal data points may share some common characteristics, so they can be forced to distribute in a small single cluster. In some scenarios, however, normal network data may contain data from very different network services, which may result in a number of clusters in the normal data. Our proposed hybrid model attempts to automatically discover these clusters in the normal data in the latent representation of AEs. At each iteration, an AE learns to map normal data into the latent representation while a clustering method tries to discover clusters in the latent normal data and force them being close together. The co-training strategy can help to reveal true clusters in normal data. When a querying data point coming, it is first mapped into the latent representation of the AE, and its distance to the closest cluster center can be used as an anomaly score. The higher anomaly score a data point has, the more likely it is anomaly. The method is evaluated with four scenarios in the CTU13 dataset, and experiments illustrate that the proposed hybrid model often out-performs SAE on three out of four scenarios.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Aggarwal, C.C.: Outlier analysis. Data Mining, pp. 237–263. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-14142-8_8
Bourlard, H., Kamp, Y.: Auto-association by multilayer perceptrons and singular value decomposition. Biol. Cybern. 291–294 (1988). https://doi.org/10.1007/BF00332918
Bui, T.C., Cao, V.L., Hoang, M., Nguyen, Q.U.: A clustering-based shrink autoencoder for detecting anomalies in intrusion detection systems. In: Proceedings of the KSE, pp. 1–5. IEEE (2019)
Cao, V.L., Nicolau, M., McDermott, J.: A hybrid autoencoder and density estimation model for anomaly detection. In: Handl, J., Hart, E., Lewis, P.R., López-Ibáñez, M., Ochoa, G., Paechter, B. (eds.) PPSN 2016. LNCS, vol. 9921, pp. 717–726. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-45823-6_67
Cao, V.L., Nicolau, M., McDermott, J.: Learning neural representations for network anomaly detection. IEEE Trans. Cybern. 49(8), 3074–3087 (2018)
Chandola, V., Banerjee, A., Kumar, V.: Anomaly detection: a survey. ACM Comput. Surv. (CSUR) 41(3), 15 (2009)
Erfani, S.M., Rajasegarar, S., Karunasekera, S., Leckie, C.: High-dimensional and large-scale anomaly detection using a linear one-class SVM with deep learning. Pattern Recogn. 58, 121–134 (2016)
Fiore, U., Palmieri, F., Castiglione, A., De Santis, A.: Network anomaly detection with the restricted Boltzmann machine. Neurocomputing 122, 13–23 (2013)
Garcia, S., Grill, M., Stiborek, J., Zunino, A.: An empirical comparison of botnet detection methods. Comput. Secur. 45, 100–123 (2014)
Hawkins, S., He, H., Williams, G., Baxter, R.: Outlier detection using replicator neural networks. In: Kambayashi, Y., Winiwarter, W., Arikawa, M. (eds.) DaWaK 2002. LNCS, vol. 2454, pp. 170–180. Springer, Heidelberg (2002). https://doi.org/10.1007/3-540-46145-0_17
Hinton, G.E., Zemel, R.S.: Autoencoders, minimum description length and Helmholtz free energy. In: Advances in Neural Information Processing Systems, pp. 3–10 (1994)
Japkowicz, N., Myers, C., Gluck, M.: A novelty detection approach to classification. In: IJCAI, pp. 518–523 (1995)
Phoha, V.V.: Internet Security Dictionary. Springer, Heidelberg (2007)
Sakurada, M., Yairi, T.: Anomaly detection using autoencoders with nonlinear dimensionality reduction. In: Proceedings of the MLSDA, p. 4. ACM (2014)
Song, C., Liu, F., Huang, Y., Wang, L., Tan, T.: Auto-encoder based data clustering. In: Ruiz-Shulcloper, J., Sanniti di Baja, G. (eds.) CIARP 2013. LNCS, vol. 8258, pp. 117–124. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-41822-8_15
Vu, L., Cao, V.L., Nguyen, Q.U., Nguyen, D.N., Hoang, D.T., Dutkiewicz, E.: Learning latent distribution for distinguishing network traffic in intrusion detection system. In: ICC 2019–2019 IEEE International Conference on Communications (ICC), pp. 1–6. IEEE (2019)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Nguyen, V.Q., Nguyen, V.H., Le-Khac, NA., Cao, V.L. (2020). Clustering-Based Deep Autoencoders for Network Anomaly Detection. In: Dang, T.K., Küng, J., Takizawa, M., Chung, T.M. (eds) Future Data and Security Engineering. FDSE 2020. Lecture Notes in Computer Science(), vol 12466. Springer, Cham. https://doi.org/10.1007/978-3-030-63924-2_17
Download citation
DOI: https://doi.org/10.1007/978-3-030-63924-2_17
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63923-5
Online ISBN: 978-3-030-63924-2
eBook Packages: Computer ScienceComputer Science (R0)