A Statistical Anomaly-Based Algorithm for On-line Fault Detection in Complex Software Critical Systems

  • Conference paper
Computer Safety, Reliability, and Security (SAFECOMP 2011)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 6894)

Abstract

The next generation of software systems in Large-scale Complex Critical Infrastructures (LCCIs) requires efficient runtime management and reconfiguration strategies, as well as the ability to make decisions based on the current and past behavior of the system. In this paper we propose an anomaly-based approach to online fault detection that (i) copes with highly variable, non-stationary environments and (ii) works without any initial training phase. The novel algorithm is based on the Statistical Predictor and Safety Margin (SPS), which was originally developed to estimate the uncertainty in time synchronization mechanisms.

The SPS anomaly detection algorithm has been evaluated on a case study from the Air Traffic Management (ATM) domain. Its results have been compared, in the same scenarios, with those of an algorithm that adopts static thresholds [5]. The experiments show the limitations of static thresholds in highly variable scenarios and the ability of SPS to cope with them.
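
To make the contrast between static and adaptive thresholds concrete, the following is a minimal sketch and not the SPS algorithm itself, whose formulas are not given in this abstract. It assumes a generic stand-in: the prediction is the mean of a sliding window and the safety margin is a multiple of the window's standard deviation. The function names and the parameters `window_size`, `k`, and `min_margin` are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


def detect_static(samples, low, high):
    """Return the indices of samples that fall outside fixed bounds."""
    return [i for i, x in enumerate(samples) if not (low <= x <= high)]


def detect_adaptive(samples, window_size=5, k=3.0, min_margin=1.0):
    """Return the indices of samples that fall outside bounds recomputed online.

    Assumption (not taken from the paper): the prediction is the mean of
    the last `window_size` samples and the safety margin is `k` times their
    standard deviation, floored at `min_margin` so that a constant window
    does not produce a zero-width band.
    """
    window = deque(maxlen=window_size)
    anomalies = []
    for i, x in enumerate(samples):
        # No offline training: checks start as soon as the window is full.
        if len(window) == window_size:
            prediction = mean(window)
            margin = max(k * pstdev(window), min_margin)
            if abs(x - prediction) > margin:
                anomalies.append(i)
        # Every sample enters the window, so the bounds follow the signal.
        window.append(x)
    return anomalies
```

On a steadily drifting series such as `0.0, 1.0, ..., 39.0` with a single spike injected at index 30, the adaptive check flags only the spike, while static bounds of 0 to 20 flag every sample from index 21 onward. Note that this sketch cannot flag anything during the first `window_size` samples.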

References

  1. Avizienis, A., Laprie, J.C., Randell, B., Landwehr, C.: Basic Concepts and Taxonomy of Dependable and Secure Computing. IEEE Trans. Dependable Secure Computing (2004)

  2. Salfner, F., Lenk, M., Malek, M.: A survey of online failure prediction methods. ACM Computing Surveys, CSUR (2010)

  3. Basseville, M., Nikiforov, I.V.: Detection of abrupt changes: theory and application. Prentice-Hall, Inc., Englewood Cliffs (1993)

  4. Natella, R., Cotroneo, D.: Emulation of transient software faults for dependability assessment: A case study. In: Proceedings of the Eighth European Dependable Computing Conference, EDCC (2010)

  5. Carrozza, G., Cinque, M., Cotroneo, D., Natella, R.: Operating System Support to Detect Application Hangs. In: International Workshop on Verification and Evaluation of Computer and Communication Systems, VECoS (2008)

  6. Irrera, I., Duraes, J., Vieira, M., Madeira, H.: Towards Identifying the Best Variables for Failure Prediction Using Injection of Realistic Software Faults. In: Pacific Rim International Symposium on Dependable Computing. IEEE, Los Alamitos (2010)

  7. Brancati, A., Bondavalli, A., Ceccarelli, A.: Safe estimation of time uncertainty of local clocks. In: International Symposium on Precision Clock Synchronization for Measurement, Control and Communication (2009)

  8. Salfner, F.: Event-based failure prediction: an extended hidden Markov model approach, Dissertation.de, Berlin (2008)

  9. Daidone, A.: Critical infrastructures: a conceptual framework for diagnosis, some applications and their quantitative analysis. PhD thesis, Università degli studi di Firenze (December 2009)

  10. Johnson, C., Malek, M.: Progress achieved in the research area of Critical Information Infrastructure Protection by the IST-FP6 Projects CRUTIAL, IRRIIS and GRID. Technical report, EU Report (March 2007)

  11. Montgomery, D.C.: Controllo statistico della qualità, 1st edn. McGraw-Hill Italia, New York (2000)

  12. Chen, W., Toueg, S., Aguilera, M.K.: On the Quality of Service of Failure Detectors. In: Proceedings of the 2000 International Conference on Dependable Systems and Networks (formerly FTCS-30 and DCCA-8) (2000)

  13. Casimiro, A., Lollini, P., Dixit, M., Bondavalli, A., Verissimo, P.: A framework for dependable QoS adaptation in probabilistic environments. In: Proceedings of the 2008 ACM Symposium on Applied Computing (2008)

  14. Madeira, H., Costa, J., Vieira, M.: The OLAP and data warehousing approaches for analysis and sharing of results from dependability evaluation experiments. In: 2003 International Conference on Dependable Systems and Networks, pp. 86–91 (June 22-25, 2003)

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bovenzi, A., Brancati, F., Russo, S., Bondavalli, A. (2011). A Statistical Anomaly-Based Algorithm for On-line Fault Detection in Complex Software Critical Systems. In: Flammini, F., Bologna, S., Vittorini, V. (eds) Computer Safety, Reliability, and Security. SAFECOMP 2011. Lecture Notes in Computer Science, vol 6894. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24270-0_10

  • DOI: https://doi.org/10.1007/978-3-642-24270-0_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-24269-4

  • Online ISBN: 978-3-642-24270-0

  • eBook Packages: Computer Science, Computer Science (R0)
