DOI: 10.1145/3041008.3041018
research-article

Feature Cultivation in Privileged Information-augmented Detection

Published: 24 March 2017

Abstract

Modern detection systems use sensor outputs available in the deployment environment to probabilistically identify attacks. These systems are trained on past or synthetic feature vectors to create a model of anomalous or normal behavior. Thereafter, sensor outputs collected at run-time are compared to the model to identify attacks (or the lack of an attack). While this approach to detection has proven effective in many environments, it is limited to training on only those features that can be reliably collected at detection time. Hence, these systems fail to leverage the often vast amount of ancillary information available from past forensic analysis and post-mortem data. In short, detection systems do not train on (and thus do not learn from) features that are unavailable or too costly to collect at run-time. Recent work proposed an alternate model construction approach that integrates forensic "privileged" information (features reliably available at training time, but not at run-time) to improve the accuracy and resilience of detection systems. In this paper, we further evaluate two of the proposed techniques for model training with privileged information: knowledge transfer and model influence. We explore the cultivation of privileged features, the efficiency of those processes, and their influence on detection accuracy. We observe that improved integration of privileged features makes the resulting detection models more accurate. Our evaluation shows that the use of privileged information leads to up to an 8.2% relative decrease in detection error for fast-flux bot detection over a system with no privileged information, and 5.5% for malware classification.
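The abstract names knowledge transfer but does not describe its mechanics. As a rough illustration only, the sketch below assumes one common reading of the idea: fit a regression from the standard (run-time) features to the privileged features at training time, then use that regression at run-time to estimate the privileged features that cannot be collected. The linear least-squares map, the synthetic data, and the function names `fit_transfer`, `estimate_privileged`, and `augment` are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def _with_bias(X):
    # Append a constant column so the linear map can learn an offset.
    return np.hstack([X, np.ones((X.shape[0], 1))])


def fit_transfer(X_std, X_priv):
    """Fit a least-squares map from standard features to privileged features.

    Hypothetical helper: a linear map is an assumption of this sketch.
    """
    W, *_ = np.linalg.lstsq(_with_bias(X_std), X_priv, rcond=None)
    return W


def estimate_privileged(X_std, W):
    """Estimate privileged features from standard features alone."""
    return _with_bias(X_std) @ W


def augment(X_std, W):
    """Concatenate standard features with estimated privileged features."""
    return np.hstack([X_std, estimate_privileged(X_std, W)])


# Synthetic training data: 3 standard features, 2 privileged features.
rng = np.random.default_rng(0)
X_train_std = rng.normal(size=(200, 3))
true_map = rng.normal(size=(3, 2))
X_train_priv = X_train_std @ true_map + 0.01 * rng.normal(size=(200, 2))

# Training time: both feature sets are available.
W = fit_transfer(X_train_std, X_train_priv)

# Run-time: only standard features are available; privileged ones are estimated.
X_run_std = rng.normal(size=(5, 3))
X_run_aug = augment(X_run_std, W)
```

A detector would then be trained on `augment(X_train_std, W)` and applied to `X_run_aug`, so that training and run-time inputs share the same feature layout. Any regressor could stand in for the linear map used here.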


Published In

IWSPA '17: Proceedings of the 3rd ACM on International Workshop on Security And Privacy Analytics
March 2017
88 pages
ISBN:9781450349093
DOI:10.1145/3041008
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. intrusion detection
  2. privileged information

Qualifiers

  • Research-article

Conference

CODASPY '17
Acceptance Rates

IWSPA '17 Paper Acceptance Rate 4 of 14 submissions, 29%;
Overall Acceptance Rate 18 of 58 submissions, 31%

Article Metrics

  • Downloads (Last 12 months): 1
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 15 Feb 2025

Cited By

  • (2018) Detection under Privileged Information. Proceedings of the 2018 on Asia Conference on Computer and Communications Security, pp. 199-206. DOI: 10.1145/3196494.3196502. Online publication date: 29-May-2018.
  • (2018) Extending Detection with Privileged Information via Generalized Distillation. 2018 IEEE Security and Privacy Workshops (SPW), pp. 83-88. DOI: 10.1109/SPW.2018.00021. Online publication date: May-2018.
  • (2017) Malware modeling and experimentation through parameterized behavior. The Journal of Defense Modeling and Simulation: Applications, Methodology, Technology, 15:1, pp. 31-48. DOI: 10.1177/1548512917721755. Online publication date: 7-Aug-2017.
  • (2017) Patient-Driven Privacy Control through Generalized Distillation. 2017 IEEE Symposium on Privacy-Aware Computing (PAC), pp. 1-12. DOI: 10.1109/PAC.2017.13. Online publication date: Aug-2017.
