Skip to main content

An Efficient Filtering Method for Detecting Malicous Web Pages

  • Conference paper
Book cover Information Security Applications (WISA 2012)

Part of the book series: Lecture Notes in Computer Science ((LNSC,volume 7690))

Included in the following conference series:

Abstract

There are ways to detect malicious web pages, two of which are dynamic detection and static detection. Dynamic detection has a high detection rate but uses a high amount of resources and takes a long time, whereas static analysis only uses a small amount of resources but its detection rate is low. To minimize the weaknesses of these two methods, a filtering method was suggested. This method uses static analysis first to filter normal web pages and then uses dynamic analysis to test only the remaining suspicious web pages. In this filtering method, if a page is classified as normal at the filtering stage, it is not being tested any more. However, the existing filtering method does not consider this problem. In this paper, to solve this problem, our proposed filtering method utilizes a cost-sensitive method. Also, to increase the efficiency of the filter, features are grouped as three subsets depending on the difficulty of the extraction. The efficiency of the proposed filter can be increased, as our method only uses the necessary feature subset according to the characteristics of the web pages. An experiment showed that the proposed method shows fewer false negatives and greater efficiency than an existing filtering method.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Bannur, S.N., Saul, L.K., Savage, S.: Judging a site by its content: learning the textual, structural, and visual features of malicious web pages. In: Proceedings of the 4th ACM Workshop on Security and Artificial Intelligence, pp. 1–10. ACM (2011)

    Google Scholar 

  2. Canali, D., Cova, M., Vigna, G., Kruegel, C.: Prophiler: A fast filter for the large-scale detection of malicious web pages. In: Proceedings of the 20th International Conference on World Wide Web, pp. 197–206. ACM (2011)

    Google Scholar 

  3. Cova, M., Kruegel, C., Vigna, G.: Detection and analysis of drive-by-download attacks and malicious javascript code. In: Proceedings of the 19th International Conference on World Wide Web, pp. 281–290. ACM (2010)

    Google Scholar 

  4. Domingos, P.: Metacost: A general method for making classifiers cost-sensitive. In: Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 155–164. ACM (1999)

    Google Scholar 

  5. Eshete, B., Villafiorita, A., Weldemariam, K.: Malicious website detection: Effectiveness and efficiency issues. In: First SysSec Workshop (SysSec 2011), pp. 123–126. IEEE (2011)

    Google Scholar 

  6. Hou, Y.T., Chang, Y., Chen, T., Laih, C.S., Chen, C.M.: Malicious web content detection by machine learning. Expert Systems with Applications 37(1), 55–60 (2010)

    Article  Google Scholar 

  7. JSUnpack, http://jsunpack.jeek.org

  8. Likarish, P., Jung, E., Jo, I.: Obfuscated malicious javascript detection using classification techniques. In: 4th International Conference on Malicious and Unwanted Software (MALWARE 2009), pp. 47–54. IEEE (2009)

    Google Scholar 

  9. Nazario, J.: Phoneyc: a virtual client honeypot. In: Proceedings of the 2nd USENIX Conference on Large-Scale Exploits and Emergent Threats: Botnets, Spyware, Worms, and More, p. 6. USENIX Association (2009)

    Google Scholar 

  10. The Honeynet Project. Capture-hpc, https://projects.honeynet.org/capture-hpc/

  11. Quinlan, J.R.: Induction of decision trees. Machine Learning 1(1), 81–106 (1986)

    Google Scholar 

  12. Quinlan, J.R.: C4. 5: programs for machine learning. Morgan Kaufmann (1993)

    Google Scholar 

  13. Seifert, C., Welch, I., Komisarczuk, P.: Identification of malicious web pages with static heuristics. In: Australasian Telecommunication Networks and Applications Conference, ATNAC 2008, pp. 91–96. IEEE (2008)

    Google Scholar 

  14. Tao, W., Shunzheng, Y., Bailin, X.: A novel framework for learning to detect malicious web pages. In: 2010 International Forum onInformation Technology and Applications (IFITA), vol. 2, pp. 353–357. IEEE (2010)

    Google Scholar 

  15. Wang, K.: Mitre honeyclient development project. Internet, http://honeyclient.org (accessed: March 2009)

  16. Wang, Y.M., Beck, D., Jiang, X., Roussev, R., Verbowski, C., Chen, S., King, S.: Automated web patrol with strider honeymonkeys. In: Proceedings of the 2006 Network and Distributed System Security Symposium, pp. 35–49 (2006)

    Google Scholar 

  17. Weka, http://www.cs.waikato.ac.nz/ml/weka/

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Choi, J., Kim, G., Kim, T.G., Kim, S. (2012). An Efficient Filtering Method for Detecting Malicous Web Pages. In: Lee, D.H., Yung, M. (eds) Information Security Applications. WISA 2012. Lecture Notes in Computer Science, vol 7690. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35416-8_17

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-35416-8_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35415-1

  • Online ISBN: 978-3-642-35416-8

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics