When Software Defined Networks Meet Fault Tolerance: A Survey

  • Conference paper
Algorithms and Architectures for Parallel Processing (ICA3PP 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9530)

Abstract

Software Defined Networking (SDN) is emerging as a novel network architecture that decouples the control plane from the data plane. However, SDN by itself cannot survive failures, particularly in large-scale data-center networks. Because SDN is programmable, mechanisms can be designed to achieve fault tolerance. In this survey, we broadly discuss the fault tolerance issue and systematically review the methods proposed so far for SDN. Our presentation starts from the significant components that OpenFlow and SDN bring, which are useful for failure recovery, and then expands to fault tolerance in the data plane and the control plane, each of which requires two phases: detection and recovery. In particular, as the central part of this paper, we highlight the comparison between the two main methods for failure recovery, restoration and protection. Future research issues are discussed as well.

Acknowledgments

Corresponding authors: Jinbang Chen and Fei Xu. They are with the Shanghai Key Laboratory of Multidimensional Information Processing and the Department of Computer Science and Technology, East China Normal University, China. This work was supported by the Science and Technology Commission of Shanghai Municipality under research grant no. 14DZ2260800, and by the China Postdoctoral Science Foundation under grant no. 2014M561438.

Corresponding authors

Correspondence to Jinbang Chen or Fei Xu.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Chen, J., Chen, J., Xu, F., Yin, M., Zhang, W. (2015). When Software Defined Networks Meet Fault Tolerance: A Survey. In: Wang, G., Zomaya, A., Martinez, G., Li, K. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2015. Lecture Notes in Computer Science, vol 9530. Springer, Cham. https://doi.org/10.1007/978-3-319-27137-8_27

  • DOI: https://doi.org/10.1007/978-3-319-27137-8_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27136-1

  • Online ISBN: 978-3-319-27137-8

  • eBook Packages: Computer Science, Computer Science (R0)
