DOI: 10.1145/2465839.2465847
research-article

A supervised machine learning approach to classify host roles on line using sFlow

Published: 18 June 2013

Abstract

Classifying host roles based on network traffic behavior is valuable for network security analysis and for detecting security policy violations. Behavior-based network security analysis has advantages over traditional approaches that rely on code patterns or signatures. Modeling host roles from network flow data is challenging because of the huge volume of network traffic and the overlap among host roles. Many studies of network traffic classification have focused on classifying applications such as web, peer-to-peer, and DNS traffic. In general, machine learning approaches have been applied to application classification, security awareness, and anomaly detection. In this paper, we present a supervised machine learning approach that uses an On-Line Support Vector Machine and a Decision Tree to classify host roles. We collect sFlow data from the main gateways of a large campus network. We classify different roles, namely, clients versus servers, regular web non-email servers versus web email servers, clients at personal offices versus clients at public places such as laboratories and libraries, and personal office clients from two different colleges. We achieve high classification accuracy: 99.2% in classifying clients versus servers, 100% in classifying regular web non-email servers versus web email servers, 93.3% in classifying clients at personal offices versus public places, and 93.3% in classifying clients at personal offices from two different colleges.
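
As a rough illustration of the online SVM idea mentioned in the abstract, the following Python sketch trains a linear SVM one host at a time with Pegasos-style stochastic sub-gradient updates. It is not the authors' implementation: the features (`inbound_ratio`, `low_port_ratio`), the synthetic toy data, and the class `OnlineLinearSVM` are assumptions made purely for illustration, and the abstract does not specify the actual sFlow-derived features or the online SVM variant used.

```python
import random


def make_hosts(n_per_class, seed):
    """Build synthetic toy hosts as (features, label) pairs.

    Label +1 stands for a server, -1 for a client. The features are
    hypothetical: [inbound_ratio, low_port_ratio, constant bias input].
    """
    rng = random.Random(seed)
    hosts = []
    for _ in range(n_per_class):
        # Assumed server profile: mostly inbound flows on well-known ports.
        hosts.append(([rng.uniform(0.7, 0.9), rng.uniform(0.8, 1.0), 1.0], +1))
        # Assumed client profile: mostly outbound flows from ephemeral ports.
        hosts.append(([rng.uniform(0.1, 0.3), rng.uniform(0.0, 0.2), 1.0], -1))
    rng.shuffle(hosts)
    return hosts


class OnlineLinearSVM:
    """Linear SVM updated one example at a time (Pegasos-style SGD)."""

    def __init__(self, n_features, lam=0.01):
        self.w = [0.0] * n_features
        self.lam = lam  # L2 regularization strength
        self.t = 0      # number of examples seen so far

    def score(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def predict(self, x):
        return 1 if self.score(x) >= 0 else -1

    def partial_fit(self, x, y):
        """Process a single labeled host and update the weights in place."""
        self.t += 1
        eta = 1.0 / (self.lam * self.t)        # decaying step size
        violated = y * self.score(x) < 1.0     # hinge-loss margin check
        shrink = 1.0 - eta * self.lam          # effect of the L2 penalty
        self.w = [shrink * wi for wi in self.w]
        if violated:
            self.w = [wi + eta * y * xi for wi, xi in zip(self.w, x)]


def accuracy(model, hosts):
    return sum(model.predict(x) == y for x, y in hosts) / len(hosts)


train = make_hosts(100, seed=0)
test = make_hosts(50, seed=1)

model = OnlineLinearSVM(n_features=3)
for _ in range(20):            # several passes over the stream of hosts
    for x, y in train:
        model.partial_fit(x, y)

test_accuracy = accuracy(model, test)
print(f"held-out accuracy on synthetic hosts: {test_accuracy:.3f}")
```

Because `partial_fit` touches only one example at a time, the model can be updated as new flow summaries arrive without retraining on the full history, which is the property that makes online learners attractive for high-volume flow data. The accuracy printed here reflects only the synthetic toy data and says nothing about the results reported in the paper.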

      Published In

      HPPN '13: Proceedings of the first edition workshop on High performance and programmable networking
      June 2013
      70 pages
      ISBN:9781450319812
      DOI:10.1145/2465839

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. machine learning
      2. network traffic classific
      3. sFlow

      Qualifiers

      • Research-article

      Conference

      HPDC'13

      Cited By

      • (2022) Classifying and tracking enterprise assets via dual-grained network behavioral analysis. Computer Networks: The International Journal of Computer and Telecommunications Networking, 218:C. DOI: 10.1016/j.comnet.2022.109387. Online publication date: 9-Dec-2022
      • (2021) Automated IoT Device Identification Based on Full Packet Information Using Real-Time Network Traffic. Sensors, 21:8 (2660). DOI: 10.3390/s21082660. Online publication date: 10-Apr-2021
      • (2021) Host Behavior in Computer Network: One-Year Study. IEEE Transactions on Network and Service Management, 18:1 (822-838). DOI: 10.1109/TNSM.2020.3036528. Online publication date: Mar-2021
      • (2020) IoT Event Classification Based on Network Traffic. IEEE INFOCOM 2020 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), (854-859). DOI: 10.1109/INFOCOMWKSHPS50562.2020.9162885. Online publication date: Jul-2020
      • (2019) Behavior-Aware Network Segmentation using IP Flows. Proceedings of the 14th International Conference on Availability, Reliability and Security, (1-9). DOI: 10.1145/3339252.3339265. Online publication date: 26-Aug-2019
      • (2019) A survey on traffic-behavioral profiling of network end-target. Proceedings of the ACM Turing Celebration Conference - China, (1-7). DOI: 10.1145/3321408.3326653. Online publication date: 17-May-2019
      • (2019) A Survey on Big Data for Network Traffic Monitoring and Analysis. IEEE Transactions on Network and Service Management, 16:3 (800-813). DOI: 10.1109/TNSM.2019.2933358. Online publication date: Sep-2019
      • (2019) FENet: Roles Classification of IP Addresses Using Connection Patterns. 2019 IEEE 2nd International Conference on Information and Computer Technologies (ICICT), (158-164). DOI: 10.1109/INFOCT.2019.8711412. Online publication date: Mar-2019
      • (2019) Automated IoT Device Identification using Network Traffic. ICC 2019 - 2019 IEEE International Conference on Communications (ICC), (1-7). DOI: 10.1109/ICC.2019.8761559. Online publication date: May-2019
      • (2019) FENet/IP: Uncovering the Fine-Grained Structure in IP Addresses. 2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS), (921-928). DOI: 10.1109/HPCC/SmartCity/DSS.2019.00133. Online publication date: Aug-2019
