DOI: 10.1145/3324884.3416582
research-article

Automated third-party library detection for Android applications: are we there yet?

Published: 27 January 2021

ABSTRACT

Third-party libraries (TPLs) have become a significant part of the Android ecosystem. Developers can employ various TPLs with different functionalities to facilitate their app development. Unfortunately, the popularity of TPLs also brings new challenges and even threats. TPLs may carry malicious or vulnerable code, which can infect popular apps and pose threats to mobile users. Moreover, the code of third-party libraries can introduce noise into some downstream tasks (e.g., malware and repackaged app detection). Researchers have therefore developed various tools to identify TPLs. However, no existing work has studied these TPL detection tools in detail; different tools target different applications and differ in performance, yet little is known about these differences.

To better understand existing TPL detection tools and dissect TPL detection techniques, we conduct a comprehensive empirical study that evaluates and compares all publicly available TPL detection tools on four criteria: effectiveness, efficiency, code obfuscation-resilience capability, and ease of use. We reveal their advantages and disadvantages through this systematic study. Furthermore, we conduct a user study to evaluate the usability of each tool. The results show that LibScout outperforms the others in effectiveness, LibRadar takes less time than the others and is also regarded as the easiest to use, and LibPecker performs the best in defending against code obfuscation techniques. We further summarize the lessons learned from different perspectives, including users, tool implementation, and researchers. In addition, we enhance these open-source tools by fixing their limitations to improve their detection ability. We also build an extensible framework that integrates all available TPL detection tools and provides an online service for the research community. We make the evaluation dataset and the enhanced tools publicly available. We believe our work provides a clear picture of existing TPL detection techniques and also gives a roadmap for future directions.


Published in

ASE '20: Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering
December 2020
1449 pages
ISBN: 9781450367684
DOI: 10.1145/3324884

        Copyright © 2020 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 82 of 337 submissions, 24%
