Using Transfer Learning to Detect Phishing in Countries with a Small Population

  • Conference paper

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1127))

Abstract

An increasing number of people use social media services, which makes these services a more attractive outlet for phishing attacks. Phishers curate tweets that lead users to websites that download malware. This is a major issue, as phishers can gain access to a user’s digital identity and perform malicious acts. Phishing attacks may also be similar across regions, perhaps occurring at different times. We investigate the use of transfer learning to apply phishing models learned in one region to detect phishing in other regions. We use a semi-supervised algorithm to train a model on a US-based dataset, which we then apply to New Zealand. First, we evaluate how effectively transfer learning can be used across regions to detect potential phishing attacks on online social networks in real time. Second, we investigate the different phishing attacks and discuss how the phishing attack features detected differ between countries. We collected a real-world Twitter dataset over 6 months and show that we can successfully detect phishing using US phishing models, despite the low level of phishing that occurs in smaller populations such as New Zealand.
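The general idea of training on a labeled source region and adapting to an unlabeled target region can be illustrated with a minimal sketch. This is not the authors' algorithm: it substitutes a nearest-centroid classifier with simple self-training, and the two features (URL count per tweet, an account-age score), the toy data points, and the `margin_min` threshold are all hypothetical values chosen only for illustration.

```python
from math import dist

def centroids(points, labels):
    """Return the mean feature vector for each class label."""
    sums, counts = {}, {}
    for p, y in zip(points, labels):
        acc = sums.setdefault(y, [0.0] * len(p))
        for i, v in enumerate(p):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict(cents, p):
    """Return (label, margin): the nearest centroid's label and how much
    closer it is than the runner-up centroid."""
    ranked = sorted((dist(p, c), y) for y, c in cents.items())
    margin = ranked[1][0] - ranked[0][0] if len(ranked) > 1 else float("inf")
    return ranked[0][1], margin

def self_train(src_x, src_y, tgt_x, margin_min=1.0, rounds=3):
    """Fit on labeled source data, then absorb target points whose
    pseudo-label is confident (margin >= margin_min) and refit."""
    train_x, train_y = list(src_x), list(src_y)
    remaining = list(tgt_x)
    for _ in range(rounds):
        cents = centroids(train_x, train_y)
        scored = [(p, predict(cents, p)) for p in remaining]
        added = [(p, y) for p, (y, m) in scored if m >= margin_min]
        if not added:
            break
        for p, y in added:
            train_x.append(p)
            train_y.append(y)
        remaining = [p for p, (_, m) in scored if m < margin_min]
    return centroids(train_x, train_y)

# Hypothetical toy data: (url_count, account_age_score). Not from the paper.
source_x = [(3, 1), (4, 0), (3, 0), (0, 5), (1, 6), (0, 4)]
source_y = ["phishing"] * 3 + ["benign"] * 3
target_x = [(2.5, 1.5), (0.5, 4.5), (2, 3)]  # unlabeled target-region points

model = self_train(source_x, source_y, target_x)
label, margin = predict(model, (3.0, 0.5))
```

In this sketch, only target points that fall clearly on one side are pseudo-labeled and folded into the training set; ambiguous points such as `(2, 3)` are left out rather than risk reinforcing a wrong label.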

Notes

  1. https://github.com/wernse/ADPT-Instance-Transfer

Acknowledgements

This research is supported by InternetNZ (Grant No:IR170017).

Author information

Corresponding authors

Correspondence to Wernsen Wong or Yun Sing Koh.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Wong, W., Koh, Y.S., Dobbie, G. (2019). Using Transfer Learning to Detect Phishing in Countries with a Small Population. In: Le, T., et al. Data Mining. AusDM 2019. Communications in Computer and Information Science, vol 1127. Springer, Singapore. https://doi.org/10.1007/978-981-15-1699-3_11

  • DOI: https://doi.org/10.1007/978-981-15-1699-3_11

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-1698-6

  • Online ISBN: 978-981-15-1699-3

  • eBook Packages: Computer Science, Computer Science (R0)
