Abstract
Cloud environments make resilience more challenging because of the sharing of non-virtualised resources, frequent reconfigurations, and cyber attacks on these flexible and dynamic systems. We present a Cloud Resilience Management Framework (CRMF), which models and then applies an existing resilience strategy in a cloud operating context to diagnose anomalies. The framework uses an end-to-end feedback loop that allows remediation to be integrated with the existing cloud management systems. We demonstrate the applicability of the framework with a use-case for effective cloud resilience management.
Summary
Cloud environments pose greater challenges for resilience because of the shared use of non-virtualised resources, frequent reconfigurations, and cyber attacks on these flexible and dynamic systems. This paper presents a Cloud Resilience Management Framework (CRMF) that models an existing resilience strategy in the context of cloud operation and applies it there to detect anomalies. The framework uses an end-to-end feedback loop that makes it possible to integrate remediation into existing cloud management systems. The applicability of the framework is further demonstrated through a use case of efficient cloud resilience management.
Notes
In the NIST cloud computing reference architecture [8], the term tenant is used for consumers of cloud-based services.
The work presented here was carried out within the FP7 SECCRIT (SEcure Cloud computing for CRitical infrastructure IT) project (FP7-SEC-2012-1), a multidisciplinary research project whose mission is to analyse and evaluate cloud computing technologies with respect to security risks in sensitive environments, and to develop methodologies, technologies, and best practices for creating a secure, trustworthy, and high-assurance cloud computing environment.
European Union Agency for Network and Information Security: http://www.enisa.europa.eu/.
ResumeNet: http://www.comp.lancs.ac.uk/resilience/.
Heat Orchestration Template: http://docs.openstack.org/developer/heat/template_guide/hot_guide.html.
OpenStack: http://www.openstack.org/.
Volatility framework: https://code.google.com/p/volatility/.
libVMI: https://code.google.com/p/vmitools/.
tcpdump/libpcap: http://www.tcpdump.org/.
libpcap API: http://www.tcpdump.org/.
References
PRECYSE (2014): http://www.precyse.eu/. Accessed: 2014-10-26.
ResumeNet (2014): http://www.resumenet.eu/. Accessed: 2014-10-26.
TClouds (2014): http://www.tclouds-project.eu//. Accessed: 2014-10-26.
SECCRIT Consortium (2013): An architectural framework for critical infrastructure in cloud computing. Technical report.
Ali, A., Schaeffer-Filho, A., Smith, P., Hutchison, D. (2010): Justifying a policy based approach for DDoS remediation: a case study. In 11th annual conference on the convergence of telecommunications, networking & broadcasting, PGNet 2010, Liverpool, UK (pp. 21–22).
Angelov, P., Yager, R. (2011): Simplified fuzzy rule-based systems using non-parametric antecedents and relative data density. In IEEE workshop on evolving and adaptive intelligent systems, EAIS (pp. 62–69). New York: IEEE Press.
Beigi, M. S., Calo, S., Verma, D. (2004): Policy transformation techniques in policy-based systems management. In Proceedings of the fifth IEEE international workshop on policies for distributed systems and networks, POLICY 2004 (pp. 13–22). New York: IEEE Press.
Bohn, R. B., Messina, J., Liu, F., Tong, J., Mao, J. (2011): NIST cloud computing reference architecture. In Proceedings of the IEEE world congress on services, SERVICES ’11, Washington, DC, USA (pp. 594–596). Los Alamitos: IEEE Comput. Soc. ISBN 978-0-7695-4461-8. doi:10.1109/SERVICES.2011.105.
Santiago Cáceres, E., Oliviero, F. (2013): Deliverable 1.2: report on requirements and use cases. https://seccrit.eu/upload/D2-1-Report_on_requirements_and_use_cases-v2.0.pdf.
Casassa Mont, M., Baldwin, A., Goh, C. (2000): Power prototype: towards integrated policy-based management. In Network operations and management symposium. NOMS 2000 (pp. 789–802). New York: IEEE/IFIP.
Catteddu, D. (2011): Security and resilience in governmental clouds: making an informed decision. Technical report, European Network and Information Security Agency (ENISA). http://www.enisa.europa.eu/act/rm/emerging-and-future-risk/deliverables/security-and-resilience-in-governmental-clouds.
Cholda, P., Mykkeltveit, A., Helvik, B. E., Wittner, O. J., Jajszczyk, A. (2007): A survey of resilience differentiation frameworks in communication networks. IEEE Commun. Surv. Tutor., 9(4), 32–55.
Cuppens, F., Miege, A. (2003): Administration model for OR-BAC. In On the move to meaningful Internet systems, OTM 2003 workshops (pp. 754–768). Berlin: Springer.
Cuppens, F., Cuppens-Boulahia, N., Coma, C. (2006): MotOrBAC: un outil d'administration et de simulation de politiques de sécurité. In First joint conference security in network architectures (SAR) and security of information systems (SSI) (pp. 6–9).
Gamer, T. (2009): Anomaly-based identification of large-scale attacks. In Global telecommunications conference, GLOBECOM 2009 (pp. 1–6). New York: IEEE Press.
Hegering, H.-G., Abeck, S., Wies, R. (1996): A corporate operation framework for network service management. IEEE Commun. Mag., 34(1), 62–68.
Kaikini, P., Lewis, L., Malik, R., Rustici, E., Scott, W., Sycamore, S., Thebaut, S. (1999): Method and apparatus for defining and enforcing policies for configuration management in communications networks, February 16, 1999. US patent 5,872,928.
Abou El Kalam, A., Baida, R. E., Balbiani, P., Benferhat, S., Cuppens, F., Deswarte, Y., Miege, A., Saurel, C., Trouessin, G. (2003): Organisation based access control. In Proceedings of IEEE 4th international workshop on policies for distributed systems and networks. POLICY 2003 (pp. 120–131). New York: IEEE Press.
Lakhina, A., Crovella, M., Diot, C. (2005): Mining anomalies using traffic feature distributions. In Proceedings of the 2005 conference on applications, technologies, architectures, and protocols for computer communications, SIGCOMM ’05, New York, NY, USA (pp. 217–228). New York: ACM. ISBN 1-59593-009-4. doi:10.1145/1080091.1080118.
Lughofer, E., Guardiola, C. (2008): On-line fault detection with data-driven evolving fuzzy models. Control Intell. Syst., 36(4), 307.
Marnerides, A., Watson, M., Shirazi, N., Mauthe, A., Hutchison, D. (2013): A snapshot of malware analysis over the cloud: network and system characteristics. In Proc. IEEE Globecom 2013 workshop on cloud computing systems, networks, and applications, CCSNA.
Marnerides, A., James, C., Schaeffer-Filho, A., Sait, S. Y., Mauthe, A., Murthy, H. (2011): Multi-level network resilience: traffic analysis, anomaly detection and simulation. ICTACT Journal on Communication Technology, Special Issue on Next Generation Wireless Networks and Applications, 2(2).
Marnerides, A. K., Hutchison, D., Pezaros, D. P. (2010): Autonomic diagnosis of anomalous network traffic. In IEEE international symposium on a world of wireless mobile and multimedia networks, WoWMoM (pp. 1–6). New York: IEEE Press.
Meyer, B., Anstötz, F., Popien, C. (1996): Towards implementing policy-based systems management. Distrib. Syst. Eng., 3(2), 78.
Neal, D. (2011): Amazon web services outages raise serious cloud questions. Technical report, March 2011, http://www.v3.co.uk/v3-uk/news/2035726/amazon-web-services-outages-raise-cloud-questions.
Oblak, S., Škrjanc, I., Blažič, S. (2007): Fault detection for nonlinear systems with uncertain parameters based on the interval fuzzy model. Eng. Appl. Artif. Intell., 20(4), 503–510.
Roos, J., Putter, P., Bekker, C. (1993): Modelling management policy using enriched managed objects. In Proceedings of the IFIP TC6/WG6.6, third international symposium on integrated network management with participation of the IEEE communications society CNOM and with support from the institute for educational services (pp. 207–215). Amsterdam: North-Holland.
Schaeffer-Filho, A., Smith, P., Mauthe, A. (2011): Policy-driven network simulation: a resilience case study. In Proceedings of the ACM symposium on applied computing (pp. 492–497). New York: ACM.
Schaeffer-Filho, A., Mauthe, A., Hutchison, D., Smith, P., Yu, Y., Fry, M. (2013): Preset: a toolset for the evaluation of network resilience strategies. In IFIP/IEEE international symposium on integrated network management, IM 2013 (pp. 202–209). New York: IEEE Press.
Shirazi, N.-u.-h., Simpson, S., Marnerides, A. K., Watson, M., Mauthe, A., Hutchison, D. (2014): Assessing the impact of intra-cloud live migration on anomaly detection. In IEEE 3rd international conference on cloud networking, CloudNet, Oct. 2014 (pp. 52–57). doi:10.1109/CloudNet.2014.6968968.
Sterbenz, J.P.G., Hutchison, D., Çetinkaya, E. K., Jabbar, A., Rohrer, J. P., Schöller, M., Smith, P. (2010): Resilience and survivability in communication networks: strategies, principles, and survey of disciplines. Comput. Netw., 54(8), 1245–1265.
CSA CCM Leadership Team (2010): Cloud security alliance cloud controls matrix v1.1. Technical report.
Verma, D. C. (2000): Policy-based networking: architecture and algorithms. San Francisco: New Riders Publishing.
Yu, Y., Fry, M., Schaeffer-Filho, A., Smith, P., Hutchison, D. (2011): An adaptive approach to network resilience: evolving challenge detection and mitigation. In 8th international workshop on the design of reliable communication networks, DRCN (pp. 172–179). New York: IEEE Press.
Acknowledgements
The research presented in this paper is sponsored by the EU FP7 Project SECCRIT (Secure Cloud Computing for Critical Infrastructure IT), grant agreement no. 312758. The work on “Deployment function” and “IND2UCE” is by SECCRIT Consortium members NEC (NEC Europe Ltd) and IESE (Fraunhofer Institute for Experimental Software Engineering IESE), respectively. We are grateful to Plamen Angelov for his insightful comments and input on the use of the Recursive Density Estimation technique in the implementation of the Network Analysis Engine.
Cite this article
Shirazi, Nuh., Simpson, S., Oechsner, S. et al. A framework for resilience management in the cloud. Elektrotech. Inftech. 132, 122–132 (2015). https://doi.org/10.1007/s00502-015-0290-9
DOI: https://doi.org/10.1007/s00502-015-0290-9