Abstract
Malware often conceals its malicious behavior by making unscrupulous use of library APIs. Hence, any accurate malware analysis must track data flows not only through the application but also through the library. Libraries like Android (2 MLOC) are too large to be analyzed repeatedly with each application; hence we need to compute data-flow summaries of libraries that are expressive enough to reveal possible malicious flows and compact enough to be included in malware analysis along with each application.
We present FlowMiner, a novel approach to automatically extract the data-flow summary of a Java library, given its source or bytecode. FlowMiner’s summaries are fine-grained, i.e., they preserve key artifacts from the original library to enable accurate context-, object-, field-, flow- and type-sensitive malware analysis of applications in conjunction with the library. Unlike prior summarization techniques, FlowMiner resolves method calls to anonymous classes to a single target, making it more precise. FlowMiner’s summaries are compact, e.g., they contain only about a third of the nodes and a fourth of the edges in the data-flow semantics of recent versions of Android. FlowMiner’s summaries are stored in XML, allowing any analysis tool to use them.
Project page: http://powerofpi.github.io/FlowMiner/
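To make the idea of a compact summary concrete, the following is a minimal sketch and not FlowMiner's actual algorithm or data model. It assumes a data-flow graph given as a plain list of edges and a set of "key" nodes (for example parameters, return values, and fields) that a summary must preserve; the function `summarize` and all node names are hypothetical. Paths that run only through non-key intermediate nodes are collapsed into direct edges between key nodes.

```python
from collections import defaultdict, deque


def summarize(edges, key_nodes):
    """Collapse paths through non-key nodes into direct key-to-key edges.

    edges     -- iterable of (source, destination) data-flow edges
    key_nodes -- set of nodes the summary must keep (assumed to be given)
    """
    successors = defaultdict(set)
    for src, dst in edges:
        successors[src].add(dst)

    summary = set()
    for start in key_nodes:
        seen = set()
        queue = deque(successors[start])
        while queue:
            node = queue.popleft()
            if node in seen:
                continue
            seen.add(node)
            if node in key_nodes:
                # Reached a preserved node: record the flow and stop here.
                summary.add((start, node))
            else:
                # Intermediate node: keep following its outgoing flows.
                queue.extend(successors[node])
    return summary


# Hypothetical library method: the parameter flows through two temporaries
# into a field and into the return value.
edges = [
    ("wrap.param:s", "tmp1"),
    ("tmp1", "tmp2"),
    ("tmp2", "Lib.cache"),
    ("tmp2", "wrap.return"),
]
key_nodes = {"wrap.param:s", "Lib.cache", "wrap.return"}
summary = summarize(edges, key_nodes)
```

In this toy graph the four original edges over five nodes shrink to two edges over three nodes, while the flows from the parameter to the field and to the return value remain visible to a client analysis.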
This material is based on research sponsored by DARPA under agreement numbers FA8750-15-2-0080 and FA8750-12-2-0126. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of DARPA or the U.S. Government.
Notes
1. We omit details of the Atlas platform; the interested reader can refer to [13].
2. Recall that each call site in \(\textit{R}^{+}\) can be resolved to a single target.
3. For each version, we downloaded the Android framework from the build for the aosp_arm-user device configuration and then generated corresponding JVM bytecode that can be analyzed with Atlas.
References
Automated program analysis for cybersecurity (apac), July 2011. https://www.fbo.gov/index?s=opportunity&mode=form&id=a14e4533c2a44c3288b6a29fa6fc5841&tab=core&_cview=1
Android 4.4.4 (kitkat), May 2015. http://www.android.com/versions/kit-kat-4-4/
Extensible common software graph, March 2015. http://ensoftatlas.com/wiki/Extensible_Common_Software_Graph
Ali, K., Lhoták, O.: Application-only call graph construction. In: Noble, J. (ed.) ECOOP 2012. LNCS, vol. 7313, pp. 688–712. Springer, Heidelberg (2012)
Ali, K., Lhoták, O.: Averroes: whole-program analysis without the whole program. In: Castagna, G. (ed.) ECOOP 2013. LNCS, vol. 7920, pp. 378–400. Springer, Heidelberg (2013)
Arzt, S., Rasthofer, S., Fritz, C., Bodden, E., Bartel, A., Klein, J., Le Traon, Y., Octeau, D., McDaniel, P.: Flowdroid: Precise context, flow, field, object-sensitive and lifecycle-aware taint analysis for android apps. SIGPLAN Not. 49(6), 259–269 (2014)
Burnette, E.: Hello, Android: introducing Google’s mobile development platform. Pragmatic Bookshelf (2009)
Callahan, D.: The program summary graph and flow-sensitive interprocedual data flow analysis, vol. 23. ACM (1988)
Cao, Y., Fratantonio, Y., Bianchi, A., Egele, M., Kruegel, C., Vigna, G., Chen, Y.: Edgeminer: Automatically detecting implicit control flow transitions through the android framework. 22nd Annual Network and Distributed System Security Symposium, NDSS San Diego, California, USA (2015)
Chatterjee, R., Ryder, B.G., Landi, W.A.: Relevant context inference. In: ACM Symposium on Principles of Programming Languages, pp. 133–146. ACM (1999)
Clapp, L., Anand, S., Aiken, A.: Modelgen: mining explicit information flow specifications from concrete executions. In: International Symposium on Software Testing and Analysis, pp. 129–140. ACM (2015)
Deering, T.: April 2015. http://powerofpi.github.io/FlowMiner/
Deering, T., Kothari, S., Sauceda, J., Mathews, J.: Atlas: a new way to explore software, build analysis tools. In: Companion Proceedings of the International Conference on Software Engineering, pp. 588–591. ACM (2014)
Felt, A.P., Finifter, M., Chin, E., Hanna, S., Wagner, D.: A survey of mobile malware in the wild. In: Proceedings of the 1st ACM Workshop on Security and Privacy in Smartphones and Mobile Devices, SPSM 2011, pp. 3–14. ACM (2011)
Grove, D., Chambers, C.: A framework for call graph construction algorithms. ACM Trans. Prog. Lang. Syst. (TOPLAS) 23(6), 685–746 (2001)
LaToza, T., Myers, B.: Visualizing call graphs. In: Visual Languages and Human-Centric Computing (VL/HCC), Symposium on, pp. 117–124. IEEE (2011)
Moser, A., Kruegel, C., Kirda, E.: Limits of static analysis for malware detection. In: Computer security applications conference, pp. 421–430. IEEE (2007)
Reps, T., Horwitz, S., Sagiv, M.: Precise interprocedural dataflow analysis via graph reachability. In: Proceedings of the 22nd ACM SIGPLAN-SIGACT symposium on Principles of programming languages, pp. 49–61. ACM (1995)
Rogers, R., Lombardo, J., Mednieks, Z., Meike, B.: Android Application Development: Programming with the Google SDK. O’Reilly Media, Inc., Sebastopol (2009)
Rosen, S., Qian, Z., Mao, Z.M.: Appprofiler: a flexible method of exposing privacy-related behavior in android applications to end users. In: Proceedings of the ACM conference on Data and application security and privacy, pp. 221–232. ACM (2013)
Rountev, A., Kagan, S., Marlowe, T.: Interprocedural dataflow analysis in the presence of large libraries. In: Mycroft, A., Zeller, A. (eds.) CC 2006. LNCS, vol. 3923, pp. 2–16. Springer, Heidelberg (2006)
Rountev, A., Sharp, M., Xu, G.: IDE dataflow analysis in the presence of large object-oriented libraries. In: Hendren, L. (ed.) CC 2008. LNCS, vol. 4959, pp. 53–68. Springer, Heidelberg (2008)
Sharir, M., Pnueli, A.: Two approaches to interprocedural data flow analysis. In: Muchnick, S.S., Jones, N.D. (eds.) Program Flow Analysis: Theory and Applications, pp. 189–234. Prentice Hall, New York (1981)
Yan, D., Xu, G., Rountev, A.: Rethinking soot for summary-based whole-program analysis. In: Proceedings of the ACM SIGPLAN International Workshop on State of the Art in Java Program analysis, pp. 9–14. ACM (2012)
Zhang, W., Ryder, B.: Constructing accurate application call graphs for java to model library callbacks. In: Sixth IEEE International Workshop on Source Code Analysis and Manipulation, SCAM 2006, pp. 63–74. IEEE (2006)
Zhou, Y., Jiang, X.: Dissecting android malware: Characterization and evolution. In: 2012 IEEE Symposium on Security and Privacy (SP), pp. 95–109. IEEE (2012)
Acknowledgements
We would like to thank the team at EnSoft Corp who developed the Atlas platform, including Jeremias Sauceda, Jon Matthews, and Nikhil Ranade; and ISU APAC team members who contributed to the malware detection tooling.
Copyright information
© 2015 Springer International Publishing Switzerland
Cite this paper
Deering, T., Santhanam, G.R., Kothari, S. (2015). FlowMiner: Automatic Summarization of Library Data-Flow for Malware Analysis. In: Jajodia, S., Mazumdar, C. (eds) Information Systems Security. ICISS 2015. Lecture Notes in Computer Science, vol 9478. Springer, Cham. https://doi.org/10.1007/978-3-319-26961-0_11
DOI: https://doi.org/10.1007/978-3-319-26961-0_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26960-3
Online ISBN: 978-3-319-26961-0
eBook Packages: Computer Science, Computer Science (R0)