Lifting inter-app data-flow analysis to large app sets

  • Published in: Automated Software Engineering

Abstract

Mobile apps process increasing amounts of private data, giving rise to privacy concerns. Such concerns do not arise only from single apps, which might—accidentally or intentionally—leak private information to untrusted parties, but also from multiple apps communicating with each other. Certain combinations of apps can create critical data flows not detectable by analyzing single apps individually. While sophisticated tools exist to analyze data flows inside and across apps, none of these scale to large numbers of apps, given the combinatorial explosion of possible (inter-app) data flows. We present a scalable approach to analyze data flows across Android apps. At the heart of our approach is a graph-based data structure that represents inter-app flows efficiently. Following ideas from product-line analysis, the data structure exploits redundancies among flows and thereby tames the combinatorial explosion. Instead of focusing on specific installations of app sets on mobile devices, we lift traditional data-flow analysis approaches to analyze and represent data flows of all possible combinations of apps. We developed the tool Sifta and applied it to several existing app benchmarks and real-world app sets, demonstrating its scalability and accuracy.
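The idea of sharing flows across app combinations can be illustrated with a minimal sketch. This is not Sifta's implementation: the class `FlowGraph`, the method names, and the node labels (`src:…`, `intent:…`, `sink:…`) are hypothetical and chosen only for illustration. Each app contributes its intra-app flows once as labeled edges; a source-to-sink query then walks the shared graph and reports which apps would have to be installed together for the potential flow to exist.

```python
from collections import defaultdict, deque


class FlowGraph:
    """Toy shared graph of data flows (illustrative assumption, not Sifta)."""

    def __init__(self):
        # node -> set of (successor node, app that contributes this edge)
        self.edges = defaultdict(set)

    def add_app(self, app, flows):
        # Each intra-app flow is stored once, no matter how many
        # app combinations later make use of it.
        for src, dst in flows:
            self.edges[src].add((dst, app))

    def leaks(self, source, sink):
        """Return the apps on one shortest source-to-sink path, or None."""
        queue = deque([(source, frozenset())])
        seen = {source}
        while queue:
            node, apps = queue.popleft()
            if node == sink:
                return set(apps)
            for nxt, app in sorted(self.edges.get(node, ())):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, apps | {app}))
        return None


g = FlowGraph()
g.add_app("A", [("src:LOCATION", "intent:SHARE")])   # A sends location via an intent
g.add_app("C", [("src:CONTACTS", "intent:SHARE")])   # C sends contacts via the same intent
g.add_app("B", [("intent:SHARE", "sink:INTERNET")])  # B forwards received data to the network

# B's single edge serves both queries; neither flow is visible in one app alone.
witness_location = g.leaks("src:LOCATION", "sink:INTERNET")
witness_contacts = g.leaks("src:CONTACTS", "sink:INTERNET")
```

In this sketch the edge contributed by app B is stored once and reused by every combination that reaches `intent:SHARE`, which is the kind of redundancy the abstract refers to. As with any static analysis, a reported path is only a potential flow.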

Notes

  1. Other means of communication (e.g., shared files, native code) exist, but are outside of the scope of this paper.

  2. The flow is only a potential flow, because our analysis is static and, like taint analysis in general, can produce false positives.

  3. Our experiments are based on the DidFail variant published at the SOAP workshop (Klieber et al. 2014). More recently, the authors described an improved version of DidFail (Burket et al. 2015); however, the improvement targets DidFail’s accuracy, not its scalability. Therefore, our analysis of DidFail’s scalability still holds, even though the accuracy of the new DidFail version is better than our experiments suggest (Sect. 5.1).

  4. The MIME standards define content types (e.g., JPEG, GIF, or AVI) of data attached to communication messages. They are also used in e-mail and HTTP protocols. There, clients use MIME types to determine how attached data should be opened.

  5. \(\lfloor URIs \rfloor = URIs \cup \bot \), where \(\bot \) represents an absent MIME type.

  6. There are alternatives to Epicc, such as IC3 (Octeau et al. 2015), but our considerations and results are not affected by this choice, as we discuss in Sect. 6.2.

  7. Obtained from the authors of Amandroid.

  8. http://github.com/secure-software-engineering/DroidBench/.

  9. We provide histograms for the path lengths of E3 and E4 on our Web site.

References

  • Apel, S., von Rhein, A., Wendler, P., Größlinger, A., Beyer, D.: Strategies for product-line verification: case studies and experiments. In: Proceedings of ICSE, pp. 482–491. IEEE (2013)

  • Apple: App Store Sales Top $10 Billion in 2013. http://www.apple.com/pr/library/2014/01/07App-Store-Sales-Top-10-Billion-in-2013.html (2014)

  • Arzt, S., Rasthofer, S., Fritz, C., Bodden, E., Bartel, A., Klein, J., Le Traon, Y., Octeau, D., McDaniel, P.: FlowDroid: precise context, flow, field, object-sensitive and lifecycle-aware taint analysis for android apps. In: Proceedings of PLDI, pp. 259–269. ACM (2014)

  • Berger, T., Pfeiffer, R.-H., Tartler, R., Dienst, S., Czarnecki, K., Wasowski, A., She, S.: Variability mechanisms in software ecosystems. Inf. Softw. Technol. 56(11), 1520–1535 (2014)

  • Al Bidani, N., Vigant Raffay, A.: Systematic Literature Review of Mobile Inter-Application Security. Master’s thesis, IT University of Copenhagen (2014)

  • Bodden, E., Tolêdo, T., Ribeiro, M., Brabrand, C., Borba, P., Mezini, M.: SPLLIFT: Statically analyzing software product lines in minutes instead of years. In: Proceedings of PLDI, pp. 355–364. ACM (2013)

  • Burket, J., Flynn, L., Klieber, W., Lim, J., Shen, W., Snavely, W.: Making DidFail Succeed: Enhancing the CERT Static Taint Analyzer for Android App Sets. Technical Report CMU/SEI-2015-TR-001, Software Engineering Institute (2015)

  • Chin, E., Porter Felt, A., Greenwood, K., Wagner, D.: Analyzing inter-application communication in android. In: Proceedings of MobiSys, pp. 239–252. ACM (2011)

  • Classen, A., Heymans, P., Schobbens, P.-Y., Legay, A., Raskin, J.-F.: Model checking lots of systems: efficient verification of temporal properties in software product lines. In: Proceedings of ICSE, pp. 335–344. ACM (2010)

  • Czarnecki, K., Antkiewicz, M.: Mapping features to models: a template approach based on superimposed variants. In: Proceedings of GPCE, pp. 422–437. Springer (2005)

  • Czarnecki, K., Pietroszek, K.: Verifying feature-based model templates against well-formedness OCL constraints. In: Proceedings of GPCE, pp. 211–220. ACM (2006)

  • Dienst, S., Berger, T.: Static Analysis of App Dependencies in Android Bytecode. Technical note. http://informatik.uni-leipzig.de/~berger/tr/2012-dienst.pdf (2012)

  • Enck, W., Gilbert, P., Han, S., Tendulkar, V., Chun, B., Cox, L., Jung, J., McDaniel, P., Sheth, A.: TaintDroid: an information-flow tracking system for realtime privacy monitoring on smartphones. ACM TOCS 32(2), 5:1–5:29 (2014)

  • Enck, W., Octeau, D., McDaniel, P., Chaudhuri, S.: A study of android application security. In: Proceedings of USENIX, pp. 21–21. USENIX Association (2011)

  • Hardy, N.: The confused deputy (or why capabilities might have been invented). ACM SIGOPS 22(4), 36–38 (1988)

  • Kästner, C., Giarrusso, P., Rendel, T., Erdweg, S., Ostermann, K., Berger, T.: Variability-aware parsing in the presence of lexical macros and conditional compilation. In: Proceedings of OOPSLA, pp. 805–824. ACM (2011)

  • Klieber, W., Flynn, L., Bhosale, A., Jia, L., Bauer, L.: Android taint flow analysis for app sets. In: Proceedings of SOAP, pp. 1–6. ACM (2014)

  • Li, L., Bartel, A., Bissyande, T., Klein, J., Le Traon, Y., Arzt, S., Rasthofer, S., Bodden, E., Octeau, D., McDaniel, P.: IccTA: detecting inter-component privacy leaks in android apps. In: Proceedings of ICSE, pp. 280–292. IEEE (2015)

  • Li, L., Bissyande, T., Papadakis, M., Rasthofer, S., Bartel, A., Octeau, D., Klein, J., Le Traon, Y.: Static analysis of android apps: a systematic literature review. Technical report, University of Luxembourg, Fraunhofer SIT/TU Darmstadt, University of Wisconsin and Pennsylvania State University (2016)

  • Liebig, J., von Rhein, A., Kästner, C., Apel, S., Dörre, J., Lengauer, C.: Scalable analysis of variable software. In: Proceedings of ESEC/FSE, pp. 81–91. ACM (2013)

  • Lu, L., Li, Z., Wu, Z., Lee, W., Jiang, G.: CHEX: statically vetting android apps for component hijacking vulnerabilities. In: Proceedings of CCS, pp. 229–240. ACM (2012)

  • Martin, W., Harman, M., Jia, Y., Sarro, F., Zhang, Y.: The app sampling problem for app store mining. In: Proceedings of MSR, pp. 123–133. ACM (2015)

  • Mojica, I., Adams, B., Nagappan, M., Dienst, S., Berger, T., Hassan, A.: A large scale empirical study on software reuse in mobile apps. IEEE Softw. 31(2), 78–86 (2014)

  • Mojica, I.J., Nagappan, M., Adams, B., Berger, T., Dienst, S., Hassan, A.E.: On ad library updates in android apps. IEEE Softw. (2015). Online first

  • Nadi, S., Berger, T., Kästner, C., Czarnecki, K.: Mining configuration constraints: static analyses and empirical results. In: Proceedings of ICSE, pp. 140–151. ACM (2014)

  • Nauman, M., Khan, S., Zhang, X.: Apex: extending android permission model and enforcement with user-defined runtime constraints. In: Proceedings of ASIACCS, pp. 328–332. ACM (2010)

  • Octeau, D., Jha, S., Dering, M., McDaniel, P., Bartel, A., Li, L., Klein, J., Le Traon, Y.: Combining static analysis with probabilistic models to enable market-scale android inter-component analysis. In: Proceedings of POPL, pp. 469–484. ACM (2016)

  • Octeau, D., Luchaup, D., Dering, M., Jha, S., McDaniel, P.: Composite constant propagation: application to android inter-component communication analysis. In: Proceedings of ICSE, pp. 77–88. IEEE (2015)

  • Octeau, D., McDaniel, P., Jha, S., Bartel, A., Bodden, E., Klein, J., Le Traon, Y.: Effective inter-component communication mapping in android with epicc: an essential step towards holistic security analysis. In: Proceedings of USENIX, pp. 543–558. USENIX Association (2013)

  • Ongtang, M., McLaughlin, S., Enck, W., McDaniel, P.: Semantically rich application-centric security in android. Secur. Commun. Netw. 5(6), 658–673 (2012)

  • Porter Felt, A., Wang, H., Moshchuk, A., Hanna, S., Chin, E.: Permission re-delegation: attacks and defenses. In: Proceedings of USENIX, pp. 22. USENIX Association (2011)

  • Sadeghi, A., Bagheri, H., Malek, S.: Analysis of android inter-app security vulnerabilities using COVERT. In: Proceedings of ICSE, pp. 725–728. IEEE (2015)

  • Sbîrlea, D., Burke, M.G., Guarnieri, S., Pistoia, M., Sarkar, V.: Automatic detection of inter-application permission leaks in android applications. IBM J. Res. Dev. 57(6), 10:1–10:12 (2013)

  • Thüm, T., Apel, S., Kästner, C., Schaefer, I., Saake, G.: A classification and survey of analysis strategies for software product lines. ACM Comput. Surv. 47(1), 6:1–6:45 (2014)

  • Viennot, N., Garcia, E., Nieh, J.: A measurement study of google play. In: Proceedings of SIGMETRICS, pp. 221–233. ACM (2014)

  • Walkingshaw, E., Kästner, C., Erwig, M., Apel, S., Bodden, E.: Variational data structures: exploring trade-offs in computing with variability. In: Proceedings of Onward!, pp. 213–226. ACM (2014)

  • Wei, F., Roy, S., Ou, X., Robby: Amandroid: a precise and general inter-component data flow analysis framework for security vetting of android apps. In: Proceedings of CCS, pp. 1329–1341. ACM (2014)

  • Zhou, Y., Jiang, X.: Dissecting android malware: characterization and evolution. In: Proceedings of SSP, pp. 95–109. IEEE (2012)

Acknowledgements

We thank Eric Bodden, Steven Arzt, Li Li, Fengguo Wei, and Yajin Zhou for helpful discussions on our implementation, on their tools (IccTA and Amandroid), and for making their benchmark sets available. The work has been supported by the German Research Foundation (AP 206/4 and AP 206/6).

Author information

Corresponding author

Correspondence to Sven Apel.

About this article

Cite this article

Sattler, F., von Rhein, A., Berger, T. et al. Lifting inter-app data-flow analysis to large app sets. Autom Softw Eng 25, 315–346 (2018). https://doi.org/10.1007/s10515-017-0228-z

  • DOI: https://doi.org/10.1007/s10515-017-0228-z
