Abstract
Retrieving high-quality information from the Web is essential for every search engine. Spammers, however, degrade that quality by making heavy use of malicious redirections for phishing, malware distribution, or attaining a high search engine ranking. Malicious redirections present irrelevant content to the search user, which reduces user satisfaction and wastes network bandwidth. In this paper, we propose a neural framework for detecting redirection spam. We use a feed-forward multilayer perceptron network trained with the scaled conjugate gradient algorithm, which performs very fast classification of URLs that lead to redirection spam. We investigated the network empirically to choose the number of hidden layers and observed that a network trained with two hidden layers gives better accuracy. We validated the proposed approach on a dataset of 2383 URLs and were able to detect spammed redirections with high accuracy. The results indicate that neural networks are a very effective technique for modeling redirection spam detection.
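To make the architecture named in the abstract concrete, the following is a minimal sketch of a feed-forward multilayer perceptron with two hidden layers and a single sigmoid output. Several things here are assumptions and not taken from the paper: the three input features (redirect hop count, cross-domain flag, JavaScript-redirect flag), the layer sizes, the synthetic labelling rule, and the training data itself. Plain full-batch gradient descent is used as a simple stand-in for the scaled conjugate gradient algorithm, so this illustrates the network structure only, not the authors' training procedure.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def init_layer(n_in, n_out, rng):
    # Small random weights and zero biases for one fully connected layer.
    return {"W": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                  for _ in range(n_out)],
            "b": [0.0] * n_out}


def forward(layers, x):
    # Activations of every layer, input first: tanh hidden units, sigmoid output.
    acts = [x]
    for i, layer in enumerate(layers):
        z = [sum(w * a for w, a in zip(row, acts[-1])) + b
             for row, b in zip(layer["W"], layer["b"])]
        is_output = i == len(layers) - 1
        acts.append([sigmoid(v) if is_output else math.tanh(v) for v in z])
    return acts


def mean_loss(layers, X, y):
    # Mean binary cross-entropy over the data set.
    eps = 1e-12
    total = 0.0
    for x, t in zip(X, y):
        p = forward(layers, x)[-1][0]
        total -= t * math.log(p + eps) + (1.0 - t) * math.log(1.0 - p + eps)
    return total / len(X)


def train_step(layers, X, y, lr):
    # One full-batch gradient-descent step via backpropagation.
    # (Stand-in optimizer: NOT scaled conjugate gradient.)
    grads = [{"W": [[0.0] * len(row) for row in l["W"]],
              "b": [0.0] * len(l["b"])} for l in layers]
    for x, t in zip(X, y):
        acts = forward(layers, x)
        delta = [acts[-1][0] - t]  # sigmoid + cross-entropy output delta
        for i in reversed(range(len(layers))):
            for j, d in enumerate(delta):
                grads[i]["b"][j] += d
                for k, a in enumerate(acts[i]):
                    grads[i]["W"][j][k] += d * a
            if i > 0:
                # Back-propagate through the tanh units feeding layer i.
                delta = [(1.0 - acts[i][k] ** 2) *
                         sum(layers[i]["W"][j][k] * delta[j]
                             for j in range(len(delta)))
                         for k in range(len(acts[i]))]
    n = len(X)
    for l, g in zip(layers, grads):
        for j in range(len(l["b"])):
            l["b"][j] -= lr * g["b"][j] / n
            for k in range(len(l["W"][j])):
                l["W"][j][k] -= lr * g["W"][j][k] / n


# Synthetic toy data (hypothetical features and labelling rule).
rng = random.Random(0)
X, y = [], []
for _ in range(40):
    hops, cross, js = rng.randint(0, 5), rng.randint(0, 1), rng.randint(0, 1)
    X.append([hops / 5.0, float(cross), float(js)])
    y.append(1.0 if hops + 2 * cross + js >= 4 else 0.0)

# 3 inputs -> 6 hidden -> 4 hidden -> 1 output (sizes are illustrative).
layers = [init_layer(3, 6, rng), init_layer(6, 4, rng), init_layer(4, 1, rng)]
before = mean_loss(layers, X, y)
for _ in range(300):
    train_step(layers, X, y, lr=0.1)
after = mean_loss(layers, X, y)
print("loss before:", before, "loss after:", after)
```

The output of `forward(layers, x)[-1][0]` can be read as an estimated probability that the URL described by `x` leads to redirection spam; a threshold such as 0.5 would turn it into a class label.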
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Informed consent
Consent to submit has been received explicitly from all co-authors, as well as from the responsible authorities—tacitly or explicitly—at the institute/organization where the work has been carried out before the work is submitted.
Research involving human participants and/or animals
Our research does not include human participants or animals.
Additional information
Communicated by V. Loia.
About this article
Cite this article
Hans, K., Ahuja, L. & Muttoo, S.K. Detecting redirection spam using multilayer perceptron neural network. Soft Comput 21, 3803–3814 (2017). https://doi.org/10.1007/s00500-017-2531-9
DOI: https://doi.org/10.1007/s00500-017-2531-9