research-article

Feature Cultivation in Privileged Information-augmented Detection

Authors: Z. Berkay Celik, Patrick McDaniel, Rauf Izmailov

IWSPA '17: Proceedings of the 3rd ACM on International Workshop on Security And Privacy Analytics

Pages 73 - 80

https://doi.org/10.1145/3041008.3041018

Published: 24 March 2017

Abstract

Modern detection systems use sensor outputs available in the deployment environment to probabilistically identify attacks. These systems are trained on past or synthetic feature vectors to create a model of anomalous or normal behavior. Thereafter, sensor outputs collected at run-time are compared to the model to identify attacks (or the lack of attack). While this approach to detection has proven effective in many environments, it is limited to training only on features that can be reliably collected at detection time. Hence, such systems fail to leverage the often vast amount of ancillary information available from past forensic analysis and post-mortem data. In short, detection systems do not train on (and thus do not learn from) features that are unavailable or too costly to collect at run-time. Recent work proposed an alternate model construction approach that integrates forensic "privileged" information (features reliably available at training time, but not at run-time) to improve the accuracy and resilience of detection systems. In this paper, we further evaluate two of the proposed techniques for model training with privileged information: knowledge transfer and model influence. We explore the cultivation of privileged features, the efficiency of those processes, and their influence on detection accuracy. We observe that improved integration of privileged features makes the resulting detection models more accurate. Our evaluation shows that the use of privileged information leads to up to an 8.2% relative decrease in detection error for fast-flux bot detection over a system with no privileged information, and 5.5% for malware classification.
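The split between training-time and run-time features can be made concrete with a small sketch. The abstract names knowledge transfer but does not describe it, so everything below is an assumption for illustration, not the authors' implementation: the privileged features are assumed to be estimated at run-time from the standard features (here with a 1-nearest-neighbour lookup), and the detector is assumed to be a simple nearest-centroid classifier. The class name `KnowledgeTransferDetector`, its methods, and the toy data are all hypothetical.

```python
# Hypothetical sketch: all names and data are illustrative assumptions,
# not taken from the paper.

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class KnowledgeTransferDetector:
    """Trains on standard + privileged features; predicts from standard only."""

    def fit(self, standard, privileged, labels):
        # Keep the paired training data so privileged values can be
        # looked up from standard features later.
        self.standard = standard
        self.privileged = privileged
        # Train the detector (class centroids) on the concatenated features.
        full = [s + p for s, p in zip(standard, privileged)]
        self.centroids = {}
        for label in set(labels):
            rows = [f for f, y in zip(full, labels) if y == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def estimate_privileged(self, x):
        # Assumed mapping: reuse the privileged values of the training
        # sample whose standard features are closest to x.
        idx = min(range(len(self.standard)),
                  key=lambda i: sq_dist(self.standard[i], x))
        return self.privileged[idx]

    def predict(self, x):
        # At run-time only x (standard features) is observed; the
        # privileged part is filled in by the estimate above.
        full = x + self.estimate_privileged(x)
        return min(self.centroids,
                   key=lambda c: sq_dist(full, self.centroids[c]))


# Toy data: one standard feature and one privileged feature per sample.
standard = [[0.1], [0.2], [0.8], [0.9]]
privileged = [[1.0], [1.2], [5.0], [5.5]]
labels = ["benign", "benign", "attack", "attack"]

detector = KnowledgeTransferDetector().fit(standard, privileged, labels)
print(detector.predict([0.15]))
print(detector.predict([0.85]))
```

The point of the sketch is the interface: `fit` receives both feature sets, while `predict` receives only the standard features. A real system would likely replace both the lookup and the classifier with stronger learners.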

Index Terms

Feature Cultivation in Privileged Information-augmented Detection
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
2. Security and privacy
  1. Intrusion/anomaly detection and malware mitigation
    1. Intrusion detection systems

Published In

IWSPA '17: Proceedings of the 3rd ACM on International Workshop on Security And Privacy Analytics

March 2017

88 pages

ISBN: 9781450349093

DOI: 10.1145/3041008

General Chair: Rakesh Verma (University of Houston)

Program Chairs: Bhavani Thuraisingham (University of Texas at Dallas), Rakesh Verma (University of Houston)

Copyright © 2017 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGSAC: ACM Special Interest Group on Security, Audit, and Control

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Army Research Laboratory

Conference

CODASPY '17: Seventh ACM Conference on Data and Application Security and Privacy

Sponsor: SIGSAC

March 24, 2017

Scottsdale, Arizona, USA

Acceptance Rates

IWSPA '17 Paper Acceptance Rate: 4 of 14 submissions (29%)

Overall Acceptance Rate: 18 of 58 submissions (31%)

Bibliometrics

Total Citations: 4

Total Downloads: 200

Downloads (Last 12 months): 1

Downloads (Last 6 weeks): 0

Reflects downloads up to 15 Feb 2025

Cited By

Celik Z, McDaniel P, Izmailov R, Papernot N, Sheatsley R, Alvarez R, Swami A, Kim J, Ahn G, Kim S, Kim Y, Lopez J, Kim T (2018). Detection under Privileged Information. Proceedings of the 2018 on Asia Conference on Computer and Communications Security, pp. 199-206. DOI: 10.1145/3196494.3196502. Online publication date: 29-May-2018.
https://dl.acm.org/doi/10.1145/3196494.3196502

Celik Z, McDaniel P (2018). Extending Detection with Privileged Information via Generalized Distillation. 2018 IEEE Security and Privacy Workshops (SPW), pp. 83-88. DOI: 10.1109/SPW.2018.00021. Online publication date: May-2018.
https://doi.org/10.1109/SPW.2018.00021

Celik Z, McDaniel P, Bowen T (2017). Malware modeling and experimentation through parameterized behavior. The Journal of Defense Modeling and Simulation: Applications, Methodology, Technology, 15:1, pp. 31-48. DOI: 10.1177/1548512917721755. Online publication date: 7-Aug-2017.
https://doi.org/10.1177/1548512917721755

Celik Z, Lopez-Paz D, McDaniel P (2017). Patient-Driven Privacy Control through Generalized Distillation. 2017 IEEE Symposium on Privacy-Aware Computing (PAC), pp. 1-12. DOI: 10.1109/PAC.2017.13. Online publication date: Aug-2017.
https://doi.org/10.1109/PAC.2017.13
