
Identifying Gambling and Porn Websites with Image Recognition

  • Conference paper
Advances in Multimedia Information Processing – PCM 2017 (PCM 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10736)


Abstract

With the rapid development of the Internet, gambling and porn websites increasingly harm the health and growth of young people. However, website classification methods based on text content and URLs do not perform satisfactorily on gambling and porn website detection, because the domain names of these sites change quickly. Meanwhile, visual-based website classification has achieved very good results in phishing website detection, which encourages us. We therefore introduce visual features to identify gambling and porn websites in this paper. First, we develop a website screenshot tool that saves the full content of a website as an image. Second, an effective feature is chosen with the BoW (bag-of-words) model to recognize the screenshots of gambling and porn websites, and appropriate parameters are chosen to improve classification efficiency. Finally, experimental results on our collected gambling and porn website datasets demonstrate that the proposed method is able to recognize gambling and porn websites and achieves satisfying results.
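The BoW step described in the abstract can be illustrated with a minimal sketch. It assumes local descriptors have already been extracted from a screenshot and that a codebook of visual words is available; the abstract does not specify the descriptor type, codebook size, or classifier, so the function names (`nearest_word`, `bow_histogram`) and the toy 2-D data below are hypothetical and are not the authors' implementation.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def nearest_word(descriptor, codebook):
    """Return the index of the codebook entry closest to the descriptor."""
    return min(range(len(codebook)), key=lambda i: dist(descriptor, codebook[i]))


def bow_histogram(descriptors, codebook):
    """Quantize local descriptors against a codebook and return an
    L1-normalized histogram of visual-word counts."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_word(d, codebook)] += 1
    total = sum(counts)
    # An image with no descriptors maps to an all-zero histogram.
    return [c / total for c in counts] if total else [0.0] * len(codebook)


# Hypothetical toy data: 2-D points stand in for real local descriptors
# (for example SIFT, SURF, or ORB vectors), and the codebook would normally
# come from clustering descriptors of many training screenshots.
codebook = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
screenshot_descriptors = [(0.5, 0.2), (9.0, 9.5), (0.1, 0.4), (1.0, 9.0)]

hist = bow_histogram(screenshot_descriptors, codebook)
print(hist)  # [0.5, 0.25, 0.25]
```

Each screenshot is thus reduced to a fixed-length vector regardless of how many descriptors it yields, which is what allows a standard classifier to be trained on screenshots of different sizes.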



Acknowledgements

This work is supported by the National Natural Science Foundation of China (No. 61602472, No. U1636217) and the National Key Research and Development Program of China (No. 2016YFB0801200).

Author information


Corresponding author

Correspondence to Zigang Cao.


Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper


Cite this paper

Li, L., Gou, G., Xiong, G., Cao, Z., Li, Z. (2018). Identifying Gambling and Porn Websites with Image Recognition. In: Zeng, B., Huang, Q., El Saddik, A., Li, H., Jiang, S., Fan, X. (eds) Advances in Multimedia Information Processing – PCM 2017. PCM 2017. Lecture Notes in Computer Science, vol 10736. Springer, Cham. https://doi.org/10.1007/978-3-319-77383-4_48


  • DOI: https://doi.org/10.1007/978-3-319-77383-4_48


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-77382-7

  • Online ISBN: 978-3-319-77383-4

  • eBook Packages: Computer Science, Computer Science (R0)
