DOI: 10.1145/2889443.2889458
short-paper

On structuring holistic fault tolerance

Published: 14 March 2016

Abstract

Computer systems are developed with future maintainability in mind, which is one of the main requirements of sound architectural design. Existing approaches to introducing fault tolerance rely on recursively structuring a system out of functional components, which typically results in non-optimal fault tolerance. The paper proposes a vision of structuring complex many-core systems by introducing a special component that supports system-wide fault tolerance coordination. The component acts as a central module that decides which fault tolerance strategies individual system components should implement, depending on the performance and energy requirements specified as system operating modes.
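
The coordination idea in the abstract can be illustrated with a minimal sketch. This is not the authors' design: the mode names, the strategy names, the `Component` and `Coordinator` classes, and the decision policy are all hypothetical, chosen only to show a central module assigning a fault tolerance strategy to each component according to the current operating mode.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    """Hypothetical system operating modes (assumed, not from the paper)."""
    HIGH_PERFORMANCE = "high_performance"
    LOW_ENERGY = "low_energy"


class Strategy(Enum):
    """Hypothetical fault tolerance strategies (assumed, not from the paper)."""
    REPLICATION = "replication"            # masks errors, uses extra cores and energy
    CHECKPOINT_RETRY = "checkpoint_retry"  # cheaper, but recovery takes longer


@dataclass
class Component:
    """A functional component that applies whatever strategy it is given."""
    name: str
    critical: bool
    strategy: Strategy = Strategy.CHECKPOINT_RETRY


class Coordinator:
    """Central module that chooses a strategy for every component."""

    def __init__(self, components):
        self.components = list(components)

    def decide(self, component, mode):
        # Illustrative policy only: critical components are always replicated;
        # the others are replicated only when energy is not the constraint.
        if component.critical or mode is Mode.HIGH_PERFORMANCE:
            return Strategy.REPLICATION
        return Strategy.CHECKPOINT_RETRY

    def switch_mode(self, mode):
        # Re-evaluate every component when the operating mode changes.
        for component in self.components:
            component.strategy = self.decide(component, mode)
        return {c.name: c.strategy for c in self.components}


coordinator = Coordinator([
    Component("scheduler", critical=True),
    Component("logger", critical=False),
])
low_energy_plan = coordinator.switch_mode(Mode.LOW_ENERGY)
high_performance_plan = coordinator.switch_mode(Mode.HIGH_PERFORMANCE)
```

In this sketch the strategy choice lives in one place (`Coordinator.decide`) rather than inside each component, so a mode switch changes the system-wide fault tolerance behaviour without touching component code.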

Cited By

  • (2018) DroidEH: An Exception Handling Mechanism for Android Applications. 2018 IEEE 29th International Symposium on Software Reliability Engineering (ISSRE), pp. 200-211. DOI: 10.1109/ISSRE.2018.00030. Online publication date: Oct-2018
  • (2017) Architecting Holistic Fault Tolerance. 2017 IEEE 18th International Symposium on High Assurance Systems Engineering (HASE), pp. 5-8. DOI: 10.1109/HASE.2017.13. Online publication date: 2017
  • (2017) Modelling for Systems with Holistic Fault Tolerance. Software Engineering for Resilient Systems, pp. 169-183. DOI: 10.1007/978-3-319-65948-0_11. Online publication date: 11-Aug-2017

Published In

MODULARITY 2016: Proceedings of the 15th International Conference on Modularity
March 2016
145 pages
ISBN: 9781450339957
DOI: 10.1145/2889443

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Many-core systems
  2. energy efficiency
  3. error recovery
  4. performance
  5. system layering
  6. system structuring

Qualifiers

  • Short-paper

Conference

Modularity '16

Acceptance Rates

Overall Acceptance Rate 41 of 139 submissions, 29%
