ABSTRACT
Identifying mobile apps from their network traffic benefits both security and network management. It is a challenging task, however, for three reasons. First, network traffic is end-to-end encrypted to protect data privacy. Second, user behavior changes dynamically as users move between the different functionalities of an app. Third, traffic behavior is hard to differentiate because modern mobile apps rely on common shared libraries and content delivery. Existing techniques address the encryption issue but not the other two, and therefore achieve low detection and classification accuracy. In this paper, we present MAppGraph, a novel technique for classifying mobile apps that addresses all three issues. Given a chunk of traffic generated by an app, MAppGraph constructs a communication graph in which each node is a tuple of the IP address and port of a service the app connects to, and each edge is established by the weighted communication correlation between two nodes. We form the feature vectors of the nodes from information extracted from packet headers, without analyzing the encrypted payload. We leverage deep graph convolution neural networks to learn the diverse communication behavior of mobile apps from a large number of graphs and to achieve fast classification. To validate our technique, we collect traffic from a hundred mobile apps on the Android platform and run extensive experiments under various scenarios. The results show that MAppGraph significantly improves classification accuracy, by up to 20% over recently developed techniques, and demonstrates its practicality for security and network management of mobile services.
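The graph construction described in the abstract can be illustrated with a minimal sketch. It is not the MAppGraph implementation: the function name `build_graph`, the `slice_seconds` parameter, the two node features (packet count and total bytes), and the edge weight (Jaccard overlap of the time slices in which two services are active) are all assumptions chosen for illustration, since the abstract does not specify the correlation measure or the feature set.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(packets, slice_seconds=1.0):
    """Build a toy communication graph from header-only packet records.

    packets: iterable of (timestamp, remote_ip, remote_port, length).
    Returns (features, edges); both structures are illustrative assumptions.
    """
    active = defaultdict(set)                 # node -> time slices with traffic
    features = defaultdict(lambda: [0, 0])    # node -> [packet_count, total_bytes]
    for ts, ip, port, length in packets:
        node = (ip, port)                     # a node is an (IP address, port) tuple
        active[node].add(int(ts // slice_seconds))
        features[node][0] += 1
        features[node][1] += length

    edges = {}
    for a, b in combinations(sorted(active), 2):
        # Assumed stand-in for "communication correlation":
        # Jaccard overlap of the slices in which both nodes are active.
        weight = len(active[a] & active[b]) / len(active[a] | active[b])
        if weight > 0:
            edges[(a, b)] = weight
    return dict(features), edges


# Hypothetical traffic chunk; addresses come from documentation ranges.
packets = [
    (0.1, "203.0.113.5", 443, 120),
    (0.4, "198.51.100.7", 443, 80),
    (1.2, "203.0.113.5", 443, 60),
    (2.5, "198.51.100.7", 443, 40),
    (2.6, "192.0.2.9", 80, 100),
]
features, edges = build_graph(packets)
for (a, b), w in edges.items():
    print(a, b, round(w, 3))
```

Only header-level fields (timestamp, address, port, length) are read, so the sketch never touches the payload. The resulting node features and weighted edges are the kind of input a graph classifier would consume, one graph per traffic chunk.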
Index Terms
- MAppGraph: Mobile-App Classification on Encrypted Network Traffic using Deep Graph Convolution Neural Networks