Lib2Desc: automatic generation of security-centric Android app descriptions using third-party libraries

  • Regular contribution
International Journal of Information Security

Abstract

Android app developers are expected to disclose the use of dangerous permissions in their app descriptions, and the absence of such information can indicate suspicious behavior. However, a missing disclosure is not always caused by malicious intent on the developer's part; it may stem from poor documentation of the third-party libraries the app uses. To fill this gap in the literature, this study aims to enrich app descriptions with security-centric information about third-party libraries. To generate these descriptions automatically, the study explores classifying libraries and extracting code summaries of library methods that use dangerous permissions and/or leak data. Both the textual information of third-party libraries and their source code are used to create the descriptions. To the best of our knowledge, this is the first approach in the literature that creates app descriptions based on third-party libraries.
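The idea outlined in the abstract can be illustrated with a minimal sketch. It is not the Lib2Desc implementation: the class names, the `describe` function, the sentence template, and the small permission set are all hypothetical, and the library category, code summary, and data-leak flag are assumed to be supplied by upstream components (a classifier, a code summarization model, and a taint analysis, respectively).

```python
from dataclasses import dataclass, field

# Assumed subset of Android dangerous permissions, for illustration only.
DANGEROUS_PERMISSIONS = {
    "android.permission.ACCESS_FINE_LOCATION",
    "android.permission.READ_CONTACTS",
    "android.permission.CAMERA",
    "android.permission.RECORD_AUDIO",
}


@dataclass
class LibraryMethod:
    name: str
    permissions: set
    summary: str              # assumed output of a code summarization model
    leaks_data: bool = False  # assumed output of a taint analysis


@dataclass
class Library:
    name: str
    category: str             # assumed output of a library classifier
    methods: list = field(default_factory=list)


def security_relevant(method):
    """Keep methods that use a dangerous permission and/or leak data."""
    return bool(method.permissions & DANGEROUS_PERMISSIONS) or method.leaks_data


def describe(library):
    """Build one sentence per security-relevant method (hypothetical template)."""
    sentences = []
    for m in filter(security_relevant, library.methods):
        used = m.permissions & DANGEROUS_PERMISSIONS
        perms = sorted(p.rsplit(".", 1)[-1] for p in used)
        detail = f" (uses {', '.join(perms)})" if perms else ""
        leak = " and may send data off the device" if m.leaks_data else ""
        sentences.append(
            f"The {library.category} library {library.name} {m.summary}{detail}{leak}."
        )
    return sentences


# Hypothetical library record; none of these names come from the study.
lib = Library("ExampleAds", "advertisement", [
    LibraryMethod("fetchAd", {"android.permission.ACCESS_FINE_LOCATION"},
                  "reads the device location", leaks_data=True),
    LibraryMethod("render", set(), "draws a banner"),
])

for sentence in describe(lib):
    print(sentence)
```

In this sketch only `fetchAd` contributes a sentence, because `render` neither uses a dangerous permission nor leaks data. Filtering at the method level keeps the generated text focused on security-relevant behavior rather than on everything a library does.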

Notes

  1. The datasets generated and/or analyzed during the current study are available in the Lib2Desc repository: https://wise.cs.hacettepe.edu.tr/projects/desre/Lib2Desc/

Acknowledgements

We would like to thank TUBITAK for its support. This study is supported by the Scientific and Technological Research Council of Turkey (TUBITAK-118E141).

Author information

Corresponding author

Correspondence to Beyza Cevik.

Ethics declarations

Conflict of interest

The authors Beyza Cevik, Nur Altiparmak, Murat Aksu, and Sevil Sen declare that they have no conflict of interest.

Ethical approval

This article does not contain any study with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Cevik, B., Altiparmak, N., Aksu, M. et al. Lib2Desc: automatic generation of security-centric Android app descriptions using third-party libraries. Int. J. Inf. Secur. 21, 1107–1125 (2022). https://doi.org/10.1007/s10207-022-00601-x

  • DOI: https://doi.org/10.1007/s10207-022-00601-x
