Skip to main content

Application Marketplace Malware Detection by User Feedback Analysis

  • Conference paper
  • First Online:
Information Systems Security and Privacy (ICISSP 2017)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 867))

Included in the following conference series:

Abstract

Smartphones are becoming increasingly ubiquitous. Like recommended best practices for personal computers, users are encouraged to install antivirus and intrusion detection software on their mobile devices. However, even with such software these devises are far from being fully protected. Given that application stores are the source of most applications, malware detection on these platforms is an important issue. Based on our intuition, which suggests that an application’s suspicious behavior will be noticed by some users and influence their feedback, we present an approach for analyzing user reviews in mobile application stores for the purpose of detecting malicious apps. The proposed method transfers an application’s text reviews to numerical features in two main steps: (1) extract domain-phrases based on external domain-specific textual corpus on computer and network security, and (2) compute three statistical features based on domain-phrases occurrences. We evaluated the proposed methods on 2,506 applications along with their 128,863 reviews collected from “Amazon AppStore”. The results show that proposed method yields an AUC of 86% in the detection of malicious applications.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

  1. Statista: Number of smartphones sold to end users worldwide from 2007 to 2016 (2017). https://www.statista.com/statistics/263437/global-smartphone-sales-to-end-users-since-2007/. Accessed Aug 2017

  2. Statista: Number of available apps in the Apple App Store from July 2008 to January 2017 (2017). https://www.statista.com/statistics/263795/number-of-available-apps-in-the-apple-app-store/. Accessed Aug 2017

  3. Statista: Number of available applications in the Google Play Store from December 2009 to June 2017 (2017). https://www.statista.com/statistics/266210/number-of-available-applications-in-the-google-play-store/. Accessed Aug 2017

  4. Statista: Number of available apps in the Amazon Appstore from March 2011 to April 2016 (2017). https://www.statista.com/statistics/307330/number-of-available-apps-in-the-amazon-appstore/. Accessed Aug 2017

  5. Check Point: Viking horde: a new type of android malware on Google Play (2017). https://blog.checkpoint.com/2016/05/09/viking-horde-a-new-type-of-android-malware-on-google-play/. Accessed Aug 2017

  6. Check Point: Dresscode android malware discovered on Google Play (2017). https://blog.checkpoint.com/2016/08/31/dresscode-android-malware-discovered-on-google-play/. Accessed Aug 2017

  7. Wang, K., Stolfo, S.J.: Anomalous payload-based network intrusion detection. In: Jonsson, E., Valdes, A., Almgren, M. (eds.) RAID 2004. LNCS, vol. 3224, pp. 203–222. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-30143-1_11

    Chapter  Google Scholar 

  8. Blair-Goldensohn, S., Hannan, K., McDonald, R., Neylon, T., Reis, G.A., Reynar, J.: Building a sentiment summarizer for local service reviews. In: WWW Workshop on NLP in the Information Explosion Era, vol. 14, pp. 339–348 (2008)

    Google Scholar 

  9. Portier, K., Greer, G.E., Rokach, L., Ofek, N., Wang, Y., Biyani, P., Yu, M., Banerjee, S., Zhao, K., Mitra, P., et al.: Understanding topics and sentiment in an online cancer survivor community. JNCI Monogr. 47, 195–198 (2013)

    Article  Google Scholar 

  10. Hadad, T., Sidik, B., Ofek, N., Puzis, R., Rokach, L.: User feedback analysis for mobile malware detection. In: ICISSP, pp. 83–94 (2017)

    Google Scholar 

  11. Xie, L., Zhang, X., Seifert, J.P., Zhu, S.: pBMDS: a behavior-based malware detection system for cellphone devices. In: Proceedings of the Third ACM Conference on Wireless Network Security, pp. 37–48. ACM (2010)

    Google Scholar 

  12. Burguera, I., Zurutuza, U., Nadjm-Tehrani, S.: Crowdroid: behavior-based malware detection system for Android. In: Proceedings of the 1st ACM Workshop on Security and Privacy in Smartphones and Mobile Devices, pp. 15–26. ACM (2011)

    Google Scholar 

  13. Shabtai, A., Tenenboim-Chekina, L., Mimran, D., Rokach, L., Shapira, B., Elovici, Y.: Mobile malware detection through analysis of deviations in application network behavior. Comput. Secur. 43, 1–18 (2014)

    Article  Google Scholar 

  14. Aung, Z., Zaw, W.: Permission-based Android malware detection. Int. J. Sci. Technol. Res. 2, 228–234 (2013)

    Google Scholar 

  15. Zhang, Y., Yang, M., Xu, B., Yang, Z., Gu, G., Ning, P., Wang, X.S., Zang, B.: Vetting undesirable behaviors in Android apps with permission use analysis. In: Proceedings of the 2013 ACM SIGSAC Conference on Computer and Communications Security, pp. 611–622. ACM (2013)

    Google Scholar 

  16. Rastogi, V., Chen, Y., Jiang, X.: Droidchameleon: evaluating android anti-malware against transformation attacks. In: Proceedings of the 8th ACM SIGSAC Symposium on Information, Computer and Communications Security, pp. 329–334. ACM (2013)

    Google Scholar 

  17. Yang, Z., Yang, M., Zhang, Y., Gu, G., Ning, P., Wang, X.S.: Appintent: analyzing sensitive data transmission in Android for privacy leakage detection. In: Proceedings of the 2013 ACM SIGSAC Conference on Computer and Communications Security, pp. 1043–1054. ACM (2013)

    Google Scholar 

  18. Zheng, M., Sun, M., Lui, J.C.: Droid analytics: a signature based analytic system to collect, extract, analyze and associate Android malware. In: 2013 12th IEEE International Conference on Trust, Security and Privacy in Computing and Communications, pp. 163–171. IEEE (2013)

    Google Scholar 

  19. Moser, A., Kruegel, C., Kirda, E.: Limits of static analysis for malware detection. In: 2007 Twenty-Third Annual Computer Security Applications Conference, ACSAC 2007, pp. 421–430. IEEE (2007)

    Google Scholar 

  20. Ofek, N., Rokach, L., Mitra, P.: Methodology for connecting nouns to their modifying adjectives. In: Gelbukh, A. (ed.) CICLing 2014. LNCS, vol. 8403, pp. 271–284. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-642-54906-9_22

    Chapter  Google Scholar 

  21. Katz, G., Ofek, N., Shapira, B.: Consent: context-based sentiment analysis. Knowl.-Based Syst. 84, 162–178 (2015)

    Article  Google Scholar 

  22. Hu, M., Liu, B.: Mining and summarizing customer reviews. In: Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 168–177. ACM (2004)

    Google Scholar 

  23. Ofek, N., Shabtai, A.: Dynamic latent expertise mining in social networks. IEEE Internet Comput. 18, 20–27 (2014)

    Article  Google Scholar 

  24. Choi, Y., Cardie, C.: Learning with compositional semantics as structural inference for subsentential sentiment analysis. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 793–801. Association for Computational Linguistics (2008)

    Google Scholar 

  25. Balahur, A., Steinberger, R., Kabadjov, M., Zavarella, V., Van Der Goot, E., Halkia, M., Pouliquen, B., Belyaeva, J.: Sentiment analysis in the news. arXiv preprint arXiv:1309.6202 (2013)

  26. Ye, Q., Zhang, Z., Law, R.: Sentiment classification of online reviews to travel destinations by supervised machine learning approaches. Expert Syst. Appl. 36, 6527–6535 (2009)

    Article  Google Scholar 

  27. Mullen, T., Collier, N.: Sentiment analysis using support vector machines with diverse information sources. In: EMNLP, vol. 4, pp. 412–418 (2004)

    Google Scholar 

  28. Ofek, N., Katz, G., Shapira, B., Bar-Zev, Y.: Sentiment analysis in transcribed utterances. In: Cao, T., Lim, E.-P., Zhou, Z.-H., Ho, T.-B., Cheung, D., Motoda, H. (eds.) PAKDD 2015. LNCS (LNAI), vol. 9078, pp. 27–38. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-18032-8_3

    Chapter  Google Scholar 

  29. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003)

    MATH  Google Scholar 

  30. Baron, N.S.: Language of the internet. In: The Stanford Handbook for Language Engineers, pp. 59–127 (2003)

    Google Scholar 

  31. Onix: Onix text retrieval toolkit (2016). http://www.lextek.com/manuals/onix/stopwords1.html. Accessed Apr 2016

  32. Twitter: Twitter dictionary: a guide to understanding twitter lingo (2016). http://www.webopedia.com/quick_ref/Twitter_Dictionary_Guide.asp. Accessed Apr 2016

  33. Porter, M.F.: An algorithm for suffix stripping. Program 14, 130–137 (1980)

    Article  Google Scholar 

  34. Dunham, K.: Mobile Malware Attacks and Defense. Syngress, Maryland Heights (2008)

    Google Scholar 

  35. Syngress, E.O., O’Farrell, N.: Hackproofing Your Wireless Network (2002)

    Google Scholar 

  36. Shostack, A.: Threat Modeling: Designing for Ssecurity. Wiley, Hoboken (2014)

    Google Scholar 

  37. Nayak, U., Rao, U.H.: The InfoSec Handbook: An Introduction to Information Security. Apress, New York City (2014)

    Google Scholar 

  38. Bosworth, S., Kabay, M.E.: Computer Security Handbook. Wiley, Hoboken (2002)

    Google Scholar 

  39. Pang, B., Lee, L., Vaithyanathan, S.: Thumbs up? Sentiment classification using machine learning techniques. In: Proceedings of the ACL-2002 Conference on Empirical Methods in Natural Language Processing, vol. 10, pp. 79–86. Association for Computational Linguistics (2002)

    Google Scholar 

  40. Dave, K., Lawrence, S., Pennock, D.M.: Mining the peanut gallery: opinion extraction and semantic classification of product reviews. In: Proceedings of the 12th International Conference on World Wide Web, pp. 519–528. ACM (2003)

    Google Scholar 

  41. De Marneffe, M.C., MacCartney, B., Manning, C.D., et al.: Generating typed dependency parses from phrase structure parses. In: Proceedings of LREC, vol. 6, pp. 449–454 (2006)

    Google Scholar 

  42. Ranveer, S., Hiray, S.: Comparative analysis of feature extraction methods of malware detection. Int. J. Comput. Appl. 120, 1–7 (2015)

    Google Scholar 

  43. VirusTotal: Virustotal, a free online service that analyzes files and urls enabling the identification of viruses, worms, trojans and other kinds of malicious content. https://www.virustotal.com/. Accessed Aug 2017

  44. Šrndic, N., Laskov, P.: Detection of malicious pdf files based on hierarchical document structure. In: Proceedings of the 20th Annual Network & Distributed System Security Symposium (2013)

    Google Scholar 

  45. Nissim, N., Cohen, A., Moskovitch, R., Shabtai, A., Edry, M., Bar-Ad, O., Elovici, Y.: ALPD: Active learning framework for enhancing the detection of malicious pdf files. In: 2014 IEEE Joint Conference on Intelligence and Security Informatics Conference (JISIC), pp. 91–98. IEEE (2014)

    Google Scholar 

  46. Quinlan, J.R.: C4. 5: Programming for Machine Learning, p. 38. Morgan Kauffmann, Burlington (1993)

    Google Scholar 

  47. Ho, T.K.: The random subspace method for constructing decision forests. IEEE Trans. Pattern Anal. Mach. Intell. 20, 832–844 (1998)

    Article  Google Scholar 

  48. Walker, S.H., Duncan, D.B.: Estimation of the probability of an event as a function of several independent variables. Biometrika 54, 167–179 (1967)

    Article  MathSciNet  Google Scholar 

  49. Harris, Z.S.: Distributional structure. Word 10, 146–162 (1954)

    Article  Google Scholar 

  50. Amazon: Amazon appstore (2017). http://www.amazon.com/mobile-apps/b?node=2350149011. Accessed Aug 2017

  51. WEKA. http://www.cs.waikato.ac.nz/ml/weka/. Accessed Aug 2017

  52. Bishop, C.M.: Pattern recognition. Mach. Learn. 128, 1–58 (2006)

    Google Scholar 

  53. Kohavi, R., et al.: A study of cross-validation and bootstrap for accuracy estimation and model selection. In: IJCAI, vol. 14, pp. 1137–1145 (1995)

    Google Scholar 

  54. Singh, Y., Kaur, A., Malhotra, R.: Comparative analysis of regression and machine learning methods for predicting fault proneness models. Int. J. Comput. Appl. Technol. 35, 183–193 (2009)

    Article  Google Scholar 

  55. Oommen, T., Baise, L.G., Vogel, R.M.: Sampling bias and class imbalance in maximum-likelihood logistic regression. Math. Geosci. 43, 99–120 (2011)

    Article  Google Scholar 

  56. Hong, L., Davison, B.D.: Empirical study of topic modeling in Twitter. In: Proceedings of the First Workshop on Social Media Analytics, pp. 80–88. ACM (2010)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding authors

Correspondence to Tal Hadad , Rami Puzis , Bronislav Sidik , Nir Ofek or Lior Rokach .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Hadad, T., Puzis, R., Sidik, B., Ofek, N., Rokach, L. (2018). Application Marketplace Malware Detection by User Feedback Analysis. In: Mori, P., Furnell, S., Camp, O. (eds) Information Systems Security and Privacy. ICISSP 2017. Communications in Computer and Information Science, vol 867. Springer, Cham. https://doi.org/10.1007/978-3-319-93354-2_1

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-93354-2_1

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-93353-5

  • Online ISBN: 978-3-319-93354-2

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics