DOI: 10.1145/3148055.3148076
research-article

An Imputation-based Augmented Anomaly Detection from Large Traces of Operating System Events

Published: 05 December 2017

ABSTRACT

Execution traces of an operating system are commonly used for software debugging, audit, and compliance testing. These tasks, however, gather information about the behavior of the software vis-a-vis its design aims. In this work, we instead analyze the execution traces of an embedded real-time operating system (RTOS) to model the behavior of the physical system that the software application manages through the embedded operating system. For an event-triggered embedded RTOS that controls a bespoke system such as an unmanned aerial vehicle (UAV), the events in the execution traces are directly linked to the operation of the controlled physical system. We therefore hypothesize that the frequency of events (method/function calls) per observation is a useful feature for modeling the behavior of the physical system controlled by the operating system.
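As a rough illustration of the event-frequency feature, the sketch below counts how often each event occurs in each observation. It assumes that a trace is a flat list of event names and that an observation is a fixed-size, non-overlapping window of events; the abstract does not define either, and the function name, window size, and event names are hypothetical.

```python
from collections import Counter


def event_frequencies(trace, window_size, vocabulary):
    """Return one row of per-event counts for each full window of the trace.

    Assumption: an "observation" is a non-overlapping window of
    `window_size` consecutive events. A trailing partial window is dropped.
    """
    rows = []
    for start in range(0, len(trace) - window_size + 1, window_size):
        counts = Counter(trace[start:start + window_size])
        # Fix the column order so every row uses the same feature layout.
        rows.append([counts.get(event, 0) for event in vocabulary])
    return rows


# Hypothetical event names, not taken from the paper's dataset.
trace = ["read", "write", "read", "irq", "read", "read", "read", "write"]
vocabulary = ["irq", "read", "write"]
features = event_frequencies(trace, window_size=4, vocabulary=vocabulary)
print(features)  # [[1, 2, 1], [0, 3, 1]]
```

Each row is then a frequency vector that a downstream anomaly detector could consume; the choice of detector is outside what this sketch covers.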

Furthermore, we tackle the lack of data that sufficiently captures the possible degree of aberration that may occur in a system. We model augmentation via artificial missingness and imputation: we inject missing values into the data we have and impute them to generate new cases. We implement missingness using the missing completely at random (MCAR) strategy, and we use the overall single mean imputation method at the imputation stage. This imputation method takes the average of the remaining values in the dataset and replaces missing values with this average. The result is an imputation-based augmented anomaly detection model that enables us to expand both the training and validation/test data. Expanding the test data reduces the misclassification that results from the non-parametric nature of the anomalies that may occur on the physical system, while using injected data for training lets us stress-test our model.
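The two augmentation steps described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes the data is a list of numeric rows, marks missing cells with `None`, and treats MCAR as an independent coin flip per cell. The function names, the missingness rate, and the seed are all hypothetical.

```python
import random


def inject_mcar(rows, missing_rate, rng):
    """Blank out each cell independently with probability `missing_rate`.

    The mask ignores both the cell's value and its position, which is
    what makes the missingness "completely at random".
    """
    return [
        [None if rng.random() < missing_rate else value for value in row]
        for row in rows
    ]


def impute_overall_mean(rows):
    """Replace every missing cell with the mean of all remaining values."""
    observed = [v for row in rows for v in row if v is not None]
    if not observed:
        raise ValueError("no observed values left to average")
    overall_mean = sum(observed) / len(observed)
    return [[overall_mean if v is None else v for v in row] for row in rows]


def augment(rows, missing_rate=0.2, seed=0):
    """Generate one new case set: MCAR injection, then mean imputation."""
    rng = random.Random(seed)  # seeded so the augmentation is reproducible
    return impute_overall_mean(inject_mcar(rows, missing_rate, rng))


# Deterministic check of the imputation step: the mean of 1, 3, 2 is 2.0.
print(impute_overall_mean([[1, None], [3, 2]]))  # [[1, 2.0], [3, 2]]
```

Calling `augment` several times with different seeds would yield several perturbed copies of the same frequency data, which is one plausible way to expand the training and test sets.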

We test our model with traces of a real-time operating system kernel of a UAV, and the results show that the model achieves an improved anomalous trace detection accuracy even under the induced missingness.


      • Published in

        BDCAT '17: Proceedings of the Fourth IEEE/ACM International Conference on Big Data Computing, Applications and Technologies
        December 2017
        288 pages
        ISBN: 9781450355490
        DOI: 10.1145/3148055

        Copyright © 2017 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Acceptance Rates

        BDCAT '17 Paper Acceptance Rate: 27 of 93 submissions, 29%
        Overall Acceptance Rate: 27 of 93 submissions, 29%
