FlowMiner: Automatic Summarization of Library Data-Flow for Malware Analysis

  • Conference paper
  • In: Information Systems Security (ICISS 2015)
  • Part of the book series: Lecture Notes in Computer Science (LNCS, volume 9478)

Abstract

Malware often conceals its malicious behavior by making unscrupulous use of library APIs. Hence, any accurate malware analysis must track data-flows not only through the application but also through the library. Libraries like Android (2 mLOC) are too large to be analyzed repeatedly with each application; hence, we need to compute data-flow summaries of libraries that are expressive enough to reveal possible malicious flows and compact enough to be included in malware analysis along with each application.

We present FlowMiner, a novel approach to automatically extract the data-flow summary of a Java library, given its source or bytecode. FlowMiner’s summaries are fine-grained, i.e., they preserve key artifacts from the original library to enable accurate context-, object-, field-, flow-, and type-sensitive malware analysis of applications in conjunction with the library. Unlike prior summarization techniques, FlowMiner resolves method calls to anonymous classes to a single target, making it more precise. FlowMiner’s summaries are compact, e.g., they contain only about a third (fourth) of the nodes (edges, resp.) in the data-flow semantics of recent versions of Android. FlowMiner’s summaries are stored in XML, allowing any analysis tool to use them.

http://powerofpi.github.io/FlowMiner/.
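To make the notion of a compact summary concrete, the following is a minimal sketch, not FlowMiner's actual algorithm or XML format. It assumes a data-flow graph stored as an adjacency map and a caller-chosen set of "key" nodes (for example parameters, fields, and return values). It drops every other node while keeping a summary edge wherever one key node reaches another through non-key nodes only. The class name `SummarySketch`, the method names, and the node labels are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class SummarySketch {

    /** Adds a directed data-flow edge from -> to, registering both nodes. */
    static void addEdge(Map<String, Set<String>> g, String from, String to) {
        g.computeIfAbsent(from, k -> new TreeSet<>()).add(to);
        g.computeIfAbsent(to, k -> new TreeSet<>());
    }

    /**
     * Builds a summary over the key nodes only. A summary edge a -> b exists
     * when b is reachable from a in the full graph along a path whose
     * intermediate nodes are all non-key, so reachability between key nodes
     * is preserved while intermediate nodes disappear.
     */
    static Map<String, Set<String>> summarize(Map<String, Set<String>> full,
                                              Set<String> key) {
        Map<String, Set<String>> summary = new TreeMap<>();
        for (String start : key) {
            Set<String> targets = new TreeSet<>();
            Set<String> seen = new HashSet<>();
            Deque<String> work = new ArrayDeque<>(full.getOrDefault(start, Set.of()));
            while (!work.isEmpty()) {
                String n = work.pop();
                if (!seen.add(n)) {
                    continue; // already visited
                }
                if (key.contains(n)) {
                    targets.add(n); // stop at the next key node
                } else {
                    work.addAll(full.getOrDefault(n, Set.of()));
                }
            }
            summary.put(start, targets);
        }
        return summary;
    }

    public static void main(String[] args) {
        // Hypothetical library method: a parameter flows through temporaries
        // into a field, and the field later flows to the return value.
        Map<String, Set<String>> full = new TreeMap<>();
        addEdge(full, "param:data", "local:tmp1");
        addEdge(full, "local:tmp1", "local:tmp2");
        addEdge(full, "local:tmp2", "field:buffer");
        addEdge(full, "field:buffer", "local:tmp3");
        addEdge(full, "local:tmp3", "return:value");

        Set<String> key = Set.of("param:data", "field:buffer", "return:value");
        System.out.println(summarize(full, key));
    }
}
```

In this toy graph the six-node, five-edge flow shrinks to three nodes and two edges, yet an analysis can still see that `param:data` reaches `return:value` via `field:buffer`. Which artifacts count as "key" is the design decision that trades compactness against the sensitivities the client analysis needs; the choice made here is purely illustrative.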

This material is based on research sponsored by DARPA under agreement numbers FA8750-15-2-0080 and FA8750-12-2-0126. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of DARPA or the U.S. Government.

Notes

  1. We omit details of the Atlas platform; the interested reader can refer to [13].

  2. Recall that each call site in \(\textit{R}^{+}\) can be resolved to a single target.

  3. For each version, we downloaded the Android framework from the build for the aosp_arm-user device configuration and then generated corresponding JVM bytecode that can be analyzed with Atlas.

References

  1. Automated program analysis for cybersecurity (apac), July 2011. https://www.fbo.gov/index?s=opportunity&mode=form&id=a14e4533c2a44c3288b6a29fa6fc5841&tab=core&_cview=1

  2. Android 4.4.4 (kitkat), May 2015. http://www.android.com/versions/kit-kat-4-4/

  3. Extensible common software graph, March 2015. http://ensoftatlas.com/wiki/Extensible_Common_Software_Graph

  4. Ali, K., Lhoták, O.: Application-only call graph construction. In: Noble, J. (ed.) ECOOP 2012. LNCS, vol. 7313, pp. 688–712. Springer, Heidelberg (2012)

  5. Ali, K., Lhoták, O.: Averroes: whole-program analysis without the whole program. In: Castagna, G. (ed.) ECOOP 2013. LNCS, vol. 7920, pp. 378–400. Springer, Heidelberg (2013)

  6. Arzt, S., Rasthofer, S., Fritz, C., Bodden, E., Bartel, A., Klein, J., Le Traon, Y., Octeau, D., McDaniel, P.: Flowdroid: Precise context, flow, field, object-sensitive and lifecycle-aware taint analysis for android apps. SIGPLAN Not. 49(6), 259–269 (2014)

  7. Burnette, E.: Hello, Android: introducing Google’s mobile development platform. Pragmatic Bookshelf (2009)

  8. Callahan, D.: The program summary graph and flow-sensitive interprocedural data flow analysis, vol. 23. ACM (1988)

  9. Cao, Y., Fratantonio, Y., Bianchi, A., Egele, M., Kruegel, C., Vigna, G., Chen, Y.: Edgeminer: Automatically detecting implicit control flow transitions through the android framework. 22nd Annual Network and Distributed System Security Symposium, NDSS San Diego, California, USA (2015)

  10. Chatterjee, R., Ryder, B.G., Landi, W.A.: Relevant context inference. In: ACM Symposium on Principles of Programming Languages, pp. 133–146. ACM (1999)

  11. Clapp, L., Anand, S., Aiken, A.: Modelgen: mining explicit information flow specifications from concrete executions. In: International Symposium on Software Testing and Analysis, pp. 129–140. ACM (2015)

  12. Deering, T.: April 2015. http://powerofpi.github.io/FlowMiner/

  13. Deering, T., Kothari, S., Sauceda, J., Mathews, J.: Atlas: a new way to explore software, build analysis tools. In: Companion Proceedings of the International Conference on Software Engineering, pp. 588–591. ACM (2014)

  14. Felt, A.P., Finifter, M., Chin, E., Hanna, S., Wagner, D.: A survey of mobile malware in the wild. In: Proceedings of the 1st ACM Workshop on Security and Privacy in Smartphones and Mobile Devices, SPSM 2011, pp. 3–14. ACM (2011)

  15. Grove, D., Chambers, C.: A framework for call graph construction algorithms. ACM Trans. Prog. Lang. Syst. (TOPLAS) 23(6), 685–746 (2001)

  16. LaToza, T., Myers, B.: Visualizing call graphs. In: Visual Languages and Human-Centric Computing (VL/HCC), Symposium on, pp. 117–124. IEEE (2011)

  17. Moser, A., Kruegel, C., Kirda, E.: Limits of static analysis for malware detection. In: Computer security applications conference, pp. 421–430. IEEE (2007)

  18. Reps, T., Horwitz, S., Sagiv, M.: Precise interprocedural dataflow analysis via graph reachability. In: Proceedings of the 22nd ACM SIGPLAN-SIGACT symposium on Principles of programming languages, pp. 49–61. ACM (1995)

  19. Rogers, R., Lombardo, J., Mednieks, Z., Meike, B.: Android Application Development: Programming with the Google SDK. O’Reilly Media, Inc., Sebastopol (2009)

  20. Rosen, S., Qian, Z., Mao, Z.M.: Appprofiler: a flexible method of exposing privacy-related behavior in android applications to end users. In: Proceedings of the ACM conference on Data and application security and privacy, pp. 221–232. ACM (2013)

  21. Rountev, A., Kagan, S., Marlowe, T.: Interprocedural dataflow analysis in the presence of large libraries. In: Mycroft, A., Zeller, A. (eds.) CC 2006. LNCS, vol. 3923, pp. 2–16. Springer, Heidelberg (2006)

  22. Rountev, A., Sharp, M., Xu, G.: IDE dataflow analysis in the presence of large object-oriented libraries. In: Hendren, L. (ed.) CC 2008. LNCS, vol. 4959, pp. 53–68. Springer, Heidelberg (2008)

  23. Sharir, M., Pnueli, A.: Two approaches to interprocedural data flow analysis. In: Muchnick, S.S., Jones, N.D. (eds.) Program Flow Analysis: Theory and Applications, pp. 189–234. Prentice Hall, New York (1981)

  24. Yan, D., Xu, G., Rountev, A.: Rethinking soot for summary-based whole-program analysis. In: Proceedings of the ACM SIGPLAN International Workshop on State of the Art in Java Program analysis, pp. 9–14. ACM (2012)

  25. Zhang, W., Ryder, B.: Constructing accurate application call graphs for java to model library callbacks. In: Sixth IEEE International Workshop on Source Code Analysis and Manipulation, SCAM 2006, pp. 63–74. IEEE (2006)

  26. Zhou, Y., Jiang, X.: Dissecting android malware: Characterization and evolution. In: 2012 IEEE Symposium on Security and Privacy (SP), pp. 95–109. IEEE (2012)

Acknowledgements

We would like to thank the team at EnSoft Corp who developed the Atlas platform, including Jeremias Sauceda, Jon Matthews, and Nikhil Ranade; and ISU APAC team members who contributed to the malware detection tooling.

Author information

Corresponding author

Correspondence to Suresh Kothari.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Deering, T., Santhanam, G.R., Kothari, S. (2015). FlowMiner: Automatic Summarization of Library Data-Flow for Malware Analysis. In: Jajodia, S., Mazumdar, C. (eds) Information Systems Security. ICISS 2015. Lecture Notes in Computer Science, vol 9478. Springer, Cham. https://doi.org/10.1007/978-3-319-26961-0_11

  • DOI: https://doi.org/10.1007/978-3-319-26961-0_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-26960-3

  • Online ISBN: 978-3-319-26961-0

  • eBook Packages: Computer Science, Computer Science (R0)
