A Principled Approach to HPC Event Monitoring.
Conference · OSTI ID: 1239260
- UNM
- Purdue
Abstract not provided.
- Research Organization:
- Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
- Sponsoring Organization:
- DOE ASCR
- DOE Contract Number:
- AC04-94AL85000
- OSTI ID:
- 1239260
- Report Number(s):
- SAND2015-1074C; 567013
- Resource Relation:
- Conference: Proposed for presentation at the Fault Tolerance for HPC at eXtreme Scale (FTXS) Workshop, held at the ACM Symposium on High Performance Distributed Computing (HPDC), June 15-19, 2015 in Portland, OR.
- Country of Publication:
- United States
- Language:
- English