DOI: 10.1145/3563766.3564095
research-article

Network can check itself: scaling data plane checking via distributed, on-device verification

Published: 14 November 2022

Abstract

Current data plane verification (DPV) tools employ a centralized architecture, where a server collects the data planes of all devices and verifies them. This architecture is inherently unscalable: it requires a reliable management network, incurs a long control path, and makes the server a single point of failure. In this paper, we tackle this scalability challenge of DPV from an architectural perspective. In particular, we circumvent the scalability bottleneck of the centralized design and advocate for a distributed, on-device DPV framework. Our key insight is that DPV can be transformed into a counting problem on a DAG, which can be naturally decomposed into lightweight tasks executed at network devices, enabling scalability. Evaluation shows that a prototype of this framework achieves scalable DPV under various settings, with little overhead on commodity network devices.
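
To make the "counting problem on a DAG" idea concrete, here is a minimal Python sketch. It is not the paper's algorithm: the abstract does not say what is counted, so the sketch assumes the quantity of interest is the number of forwarding paths that deliver a packet from each device to a destination. The function `count_deliveries`, the `next_hops` map, and the device names are all hypothetical. Each device only adds up the counts reported by its own next hops, which is the kind of lightweight per-device task the abstract alludes to; the loop below merely simulates those reports on one machine.

```python
from collections import deque


def count_deliveries(next_hops, destination):
    """Count, for every device, the forwarding paths that reach `destination`.

    `next_hops` maps a device to the list of devices it forwards the packet to.
    Assumptions: the forwarding graph is a DAG and `destination` has no next hops.
    """
    devices = set(next_hops) | {hop for hops in next_hops.values() for hop in hops}

    # upstream[d] lists the devices that forward to d, i.e. who d reports to.
    upstream = {d: [] for d in devices}
    # pending[d] is the number of next-hop reports d is still waiting for.
    pending = {d: len(next_hops.get(d, [])) for d in devices}
    for device, hops in next_hops.items():
        for hop in hops:
            upstream[hop].append(device)

    counts = {d: 0 for d in devices}
    counts[destination] = 1  # the destination delivers the packet exactly once

    # Devices with no next hops (the destination, or a drop) can report at once.
    ready = deque(d for d in devices if pending[d] == 0)
    while ready:
        device = ready.popleft()
        for parent in upstream[device]:
            counts[parent] += counts[device]  # simulated report sent upstream
            pending[parent] -= 1
            if pending[parent] == 0:
                ready.append(parent)
    return counts


# Hypothetical topology: S forwards to A and B, both reach D, and B also
# forwards a copy to C, which drops it.
topology = {"S": ["A", "B"], "A": ["D"], "B": ["D", "C"], "C": []}
counts = count_deliveries(topology, "D")
print(counts["S"])  # 2
```

Under these assumptions, a requirement such as "exactly one copy of the packet reaches D" becomes a local comparison at the source (`counts["S"] == 1`), with no central server needing the full data plane.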

Cited By

  • (2025) Verifying Network-level Properties for Large-scale Networks with Header Transformations in Realtime. Journal of Information Processing 33 (41-54). DOI: 10.2197/ipsjjip.33.41. Online publication date: 2025
  • (2024) Tempus: Probabilistic Network Latency Verification. IEEE Access 12 (169896-169909). DOI: 10.1109/ACCESS.2024.3498737. Online publication date: 2024

      Published In

      HotNets '22: Proceedings of the 21st ACM Workshop on Hot Topics in Networks
      November 2022
      252 pages
      ISBN:9781450398992
      DOI:10.1145/3563766
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. distributed verification
      2. network verification

      Qualifiers

      • Research-article

      Funding Sources

      • NSFC
      • Alibaba Innovative Research Award
      • NSF-Fujian-China
      • Tan Kah Kee Innovation Laboratory Award
      • National Key R&D Program of China
      • Open Research Projects of Zhejiang Lab
      • Future Network Innovation Research Award of Ministry of Education of China

      Conference

      HotNets '22
      Acceptance Rates

      Overall Acceptance Rate 110 of 460 submissions, 24%

      Article Metrics

      • Downloads (last 12 months): 32
      • Downloads (last 6 weeks): 2
      Reflects downloads up to 20 Feb 2025
