Abstract
Anomaly detection based on data mining is one of the key technologies for intelligent detection. K-means is a classic clustering algorithm that is efficient for anomaly detection. However, traditional K-means is sensitive to the selection of the initial cluster centers: different initial values can produce different clustering results. We combine an improved DD algorithm with information entropy to improve the performance of K-means. The improved K-means optimizes the selection of the initial cluster centers, automatically determines the number of clusters, and outputs stable clustering results. After PCA preprocessing, the adaptability of the improved K-means increases markedly. To reduce the processing time for massive data, we adopt cloud computing and modify the algorithm for parallel processing. We analyze the performance of the improved K-means on two data sets, KDD Cup99 and a public mobile malware data set (MalGenome). The experimental results show that the improved K-means produces accurate results and can be applied to anomaly detection in mobile networks. The improved K-means can also be applied to image retrieval by calculating the similarity between images.
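The sensitivity of traditional K-means to its initial centers can be illustrated with a minimal sketch. This is plain Lloyd-style K-means, not the improved algorithm described in the abstract; the four-point data set, the function name `kmeans`, and both sets of starting centers are assumptions chosen only to make the effect visible.

```python
from math import dist


def assign(points, centers):
    """Return, for each point, the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: dist(p, centers[j]))
            for p in points]


def kmeans(points, centers, max_iter=100):
    """Plain K-means started from the given initial centers (illustrative)."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                # Move the center to the mean of its assigned points.
                new_centers.append(
                    tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                # Keep an empty cluster's center where it was.
                new_centers.append(old)
        if new_centers == centers:
            break
        centers = new_centers
    labels = assign(points, centers)
    # Sum of squared distances from each point to its assigned center.
    sse = sum(dist(p, centers[l]) ** 2 for p, l in zip(points, labels))
    return labels, centers, sse


# Hypothetical toy data: corners of a 4 x 1 rectangle.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

# Same data, same k, two different initializations.
good = kmeans(points, [(0.0, 0.5), (4.0, 0.5)])  # splits left / right
poor = kmeans(points, [(2.0, 0.0), (2.0, 1.0)])  # splits bottom / top

print(good[2], poor[2])  # 1.0 16.0
```

Both runs converge, but to different partitions with very different within-cluster error, which is the instability that a better choice of initial centers is meant to remove.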






Acknowledgments
Foundation item: This work was funded by the National Natural Science Foundation of China (No. 61373134). It was also supported by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD), the Jiangsu Key Laboratory of Meteorological Observation and Information Processing (No. KDXS1105), and the Jiangsu Collaborative Innovation Center on Atmospheric Environment and Equipment Technology (CICAEET).
About this article
Cite this article
Yin, C., Zhang, S. Parallel implementing improved k-means applied for image retrieval and anomaly detection. Multimed Tools Appl 76, 16911–16927 (2017). https://doi.org/10.1007/s11042-016-3638-1
DOI: https://doi.org/10.1007/s11042-016-3638-1