DOI: 10.1145/3485832.3485925
research-article

MAppGraph: Mobile-App Classification on Encrypted Network Traffic using Deep Graph Convolution Neural Networks

Published: 06 December 2021

ABSTRACT

Identifying mobile apps based on network traffic has multiple benefits for security and network management. However, it is a challenging task for several reasons. First, network traffic is encrypted using an end-to-end encryption mechanism to protect data privacy. Second, user behavior changes dynamically when using different functionalities of mobile apps. Third, it is hard to differentiate traffic behavior because modern mobile apps share common libraries and content delivery services. Existing techniques address the encryption issue but not the others, and thus achieve low detection/classification accuracy. In this paper, we present MAppGraph, a novel technique to classify mobile apps that addresses all of the above issues. Given a chunk of traffic generated by an app, MAppGraph constructs a communication graph whose nodes are defined by tuples of IP address and port of the services connected by the app, and whose edges are established by the weighted communication correlation among the nodes. We extract information from packet headers, without analyzing the encrypted payload, to form the feature vectors of the nodes. We leverage deep graph convolution neural networks to learn the diverse communication behavior of mobile apps from a large number of graphs and to achieve fast classification. To validate our technique, we collect traffic of a hundred mobile apps on the Android platform and run extensive experiments under various experimental scenarios. The results show that MAppGraph improves classification accuracy by up to 20% compared to recently developed techniques, demonstrating its practicality for security and network management of mobile services.
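The graph construction described in the abstract can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation: the record layout `(timestamp, dst_ip, dst_port, size)`, the function name `build_communication_graph`, the one-second time slices, the two node features (packet and byte counts), and the use of a Jaccard overlap of active time slices as the edge weight are all hypothetical choices made here, since the abstract does not define the correlation measure or the feature set.

```python
from collections import defaultdict
from itertools import combinations


def build_communication_graph(packets, slice_seconds=1.0):
    """Build a toy communication graph from packet-header records.

    `packets` is an iterable of (timestamp, dst_ip, dst_port, size) tuples.
    This record layout, the features, and the edge weighting are assumptions
    made for illustration; they are not taken from the paper.
    """
    active_slices = defaultdict(set)  # node -> time-slice indices with traffic
    packet_count = defaultdict(int)
    byte_count = defaultdict(int)

    for ts, ip, port, size in packets:
        node = (ip, port)  # a node is an (IP address, port) tuple
        active_slices[node].add(int(ts // slice_seconds))
        packet_count[node] += 1
        byte_count[node] += size

    # Header-only node features: (number of packets, total bytes).
    node_features = {n: (packet_count[n], byte_count[n]) for n in active_slices}

    # Assumed edge weight: Jaccard overlap of the slices in which both
    # endpoints were active; pairs that never overlap get no edge.
    edge_weights = {}
    for a, b in combinations(sorted(active_slices), 2):
        shared = active_slices[a] & active_slices[b]
        union = active_slices[a] | active_slices[b]
        weight = len(shared) / len(union)
        if weight > 0:
            edge_weights[(a, b)] = weight
    return node_features, edge_weights


# Made-up header records for one chunk of traffic.
packets = [
    (0.1, "10.0.0.1", 443, 100),
    (0.4, "10.0.0.2", 443, 200),
    (1.2, "10.0.0.1", 443, 50),
    (2.5, "10.0.0.3", 80, 70),
]
node_features, edge_weights = build_communication_graph(packets)
```

In this toy chunk, the first two services are both active in the first time slice, so they are joined by a weighted edge, while the third service stays isolated. A graph classifier would then take such node features and weighted edges as its input.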


Published in

ACSAC '21: Proceedings of the 37th Annual Computer Security Applications Conference
December 2021
1077 pages
ISBN: 9781450385794
DOI: 10.1145/3485832

Copyright © 2021 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery, New York, NY, United States


Qualifiers

• research-article
• Research
• Refereed limited

Acceptance Rates

Overall Acceptance Rate: 104 of 497 submissions, 21%
