Abstract
Software Defined Networking (SDN) is emerging as a novel network architecture that decouples the control plane from the data plane. However, SDN is not inherently resilient to failures, particularly in large-scale data-center networks. Because SDN is programmable, mechanisms can be designed to achieve fault tolerance. In this survey, we broadly discuss the fault tolerance issue and systematically review the methods proposed so far for SDN. Our presentation starts from the significant components that OpenFlow and SDN bring, which are useful for failure recovery, and then expands to fault tolerance in the data plane and the control plane, each of which requires two phases: detection and recovery. In particular, as the core of this paper, we highlight the comparison between the two main methods for failure recovery, restoration and protection. Finally, future research issues are discussed.
Acknowledgments
Corresponding authors: Jinbang Chen and Fei Xu. They are with the Shanghai Key Laboratory of Multidimensional Information Processing and the Department of Computer Science and Technology, East China Normal University, China. This work was supported by the Science and Technology Commission of Shanghai Municipality under research grant no. 14DZ2260800, and by the China Postdoctoral Science Foundation under grant no. 2014M561438.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Chen, J., Chen, J., Xu, F., Yin, M., Zhang, W. (2015). When Software Defined Networks Meet Fault Tolerance: A Survey. In: Wang, G., Zomaya, A., Martinez, G., Li, K. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2015. Lecture Notes in Computer Science, vol 9530. Springer, Cham. https://doi.org/10.1007/978-3-319-27137-8_27
DOI: https://doi.org/10.1007/978-3-319-27137-8_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27136-1
Online ISBN: 978-3-319-27137-8
eBook Packages: Computer Science, Computer Science (R0)