research-article

On Runtime and Classification Performance of the Discretize-Optimize (DISCO) Classification Approach

Published: 25 January 2019

Abstract

Using machine learning in high-speed networks for tasks such as flow classification typically requires very resource-efficient classification approaches, large amounts of computational resources, or specialized hardware. Here we sketch the discretize-optimize (DISCO) approach, which can construct an extremely efficient classifier for low-dimensional problems by combining feature selection, efficient discretization, novel bin placement, and lookup. Because the feature selection and discretization parameters are crucial, appropriate combinatorial optimization is an important aspect of the approach. We evaluate performance on a YouTube classification task using a cellular traffic data set. The initial results show that the DISCO approach can move the Pareto boundary in the classification performance versus runtime trade-off by up to an order of magnitude compared to runtime-optimized random forest and decision tree classifiers.
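
The abstract names discretization and lookup as the core of the classifier but gives no implementation details. The following is only a minimal sketch of that general idea, not the authors' method: each selected feature is mapped to a bin index, and the tuple of bin indices is looked up in a table of majority labels. The feature meanings, the hand-picked bin edges, the majority-vote rule, and the names `to_cell`, `fit_lookup`, and `predict` are all assumptions made for illustration. The bin placement and combinatorial optimization that the abstract describes are not reproduced here.

```python
from bisect import bisect_right
from collections import Counter, defaultdict


def to_cell(row, edges):
    """Map a feature vector to a tuple of bin indices (one per feature)."""
    return tuple(bisect_right(e, x) for e, x in zip(edges, row))


def fit_lookup(rows, labels, edges):
    """Build a table from bin-index tuples to the majority label seen there."""
    votes = defaultdict(Counter)
    for row, label in zip(rows, labels):
        votes[to_cell(row, edges)][label] += 1
    return {cell: counts.most_common(1)[0][0] for cell, counts in votes.items()}


def predict(row, edges, table, default):
    """Classify by a single table lookup; unseen cells get the default label."""
    return table.get(to_cell(row, edges), default)


# Hypothetical two-feature example (say, mean packet size and flow duration).
# The edges are chosen by hand; an optimizer would search over them instead.
edges = [[500.0, 1000.0], [1.0, 10.0]]
rows = [(1200, 30), (1300, 25), (200, 0.5), (300, 0.2), (1100, 12), (700, 5)]
labels = ["video", "video", "other", "other", "video", "other"]

table = fit_lookup(rows, labels, edges)
print(predict((1250, 40), edges, table, default="other"))  # cell (2, 2)
print(predict((600, 20), edges, table, default="other"))   # unseen cell (1, 2)
```

At prediction time the work per flow is one binary search per selected feature plus one dictionary lookup, which suggests why such a design can be fast for low-dimensional problems. The choice of features and bin edges determines both table size and accuracy, which is where an optimization step over those parameters would come in.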

Cited By

  • (2022) Machine Learning Methods in Smart Lighting Toward Achieving User Comfort: A Survey. IEEE Access 10, 45137-45178. DOI: 10.1109/ACCESS.2022.3169765. Online publication date: 2022
  • (2020) A surrogate-assisted GA enabling high-throughput ML by optimal feature and discretization selection. Proceedings of the 2020 Genetic and Evolutionary Computation Conference Companion, 1632-1640. DOI: 10.1145/3377929.3398092. Online publication date: 8-Jul-2020
  • (2020) DIOPT: Extremely Fast Classification Using Lookups and Optimal Feature Discretization. 2020 International Joint Conference on Neural Networks (IJCNN), 1-8. DOI: 10.1109/IJCNN48605.2020.9207037. Online publication date: Jul-2020

Published In

ACM SIGMETRICS Performance Evaluation Review, Volume 46, Issue 3
December 2018
174 pages
ISSN: 0163-5999
DOI: 10.1145/3308897
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 25 January 2019
Published in SIGMETRICS Volume 46, Issue 3


Author Tags

  1. classification
  2. machine learning
  3. runtime

Qualifiers

  • Research-article
