Abstract
Android malware is most commonly delivered to users through the many open app marketplaces. Several recent attacks have shown that the same malware can infect different apps in an app market. Automated triaging that computes the similarity of apps to known software components can help trace the evolution and propagation of malware. While existing research emphasizes detecting repackaged apps, a similarity analysis system that can identify similar portions of code in otherwise dissimilar apps is also important. Only a few public tools furnish these details accurately. In this paper, we present a proof of concept of an analysis system that compares Android apps using a technique that combines class and method features of an app. We use a two-step process that first computes similar classes and then computes similar methods of those classes. To identify similar classes, we propose a novel set of software birthmarks. We use Normalized Compression Distance (NCD) to compute similar methods. The birthmarks are evaluated on a set of over 65,000 classes from 60 APKs. To evaluate the performance of our tool, we establish ground truth by manually reverse engineering each app. The proposed system is compared with Google’s androsim, the only open-source tool for similarity analysis that also uses NCD. Our approach shows an improvement in accuracy in the worst case when compared to androsim. Finally, we furnish a case study of our system to detect fake and repackaged apps by analyzing 1470 Android apps from various sources.
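As a rough illustration of the Normalized Compression Distance used for method comparison, the sketch below applies the standard definition NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed length. The choice of zlib as the compressor, the function names, and the smali-like sample methods are assumptions made for this example; the abstract does not say which compressor or method representation the system uses.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed input (zlib is an assumed compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance between two byte strings.

    Values near 0 suggest similar inputs; values near 1 suggest
    unrelated inputs. Real compressors only approximate the ideal metric.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Hypothetical method bodies in a smali-like notation, for illustration only.
method_a = (b'const-string v0, "hello"\n'
            b'invoke-virtual {p0, v0}, Lcom/example/Foo;->log(Ljava/lang/String;)V\n'
            b'return-void\n')
method_b = (b'const-string v0, "world"\n'
            b'invoke-virtual {p0, v0}, Lcom/example/Foo;->log(Ljava/lang/String;)V\n'
            b'return-void\n')
method_c = (b'new-array v2, v1, [I\n'
            b'aget v3, v2, v4\n'
            b'add-int/2addr v3, v5\n'
            b'if-ge v3, v6, :cond_0\n'
            b'goto :loop_start\n')

if __name__ == "__main__":
    print("a vs b (near-identical):", ncd(method_a, method_b))
    print("a vs c (unrelated):     ", ncd(method_a, method_c))
```

On inputs this short, compressor overhead distorts the score, so any similarity threshold would have to be tuned to the compressor and to typical method sizes.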
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Kishore, S., Kumar, R., Rajan, S. (2018). Towards Accuracy in Similarity Analysis of Android Applications. In: Ganapathy, V., Jaeger, T., Shyamasundar, R. (eds) Information Systems Security. ICISS 2018. Lecture Notes in Computer Science, vol 11281. Springer, Cham. https://doi.org/10.1007/978-3-030-05171-6_8
DOI: https://doi.org/10.1007/978-3-030-05171-6_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05170-9
Online ISBN: 978-3-030-05171-6
eBook Packages: Computer Science (R0)