DOI: 10.1145/3035918.3035926
research-article
Public Access

Distributed Provenance Compression

Published: 09 May 2017

ABSTRACT

Network provenance, which records the execution history of network events as meta-data, is becoming increasingly important for network accountability and failure diagnosis. For example, network provenance may be used to trace the path that a message traversed in a network, or to reveal how a particular routing entry was derived and which parties were involved in its derivation. A challenge when storing the provenance of a live network is that the large number of arriving messages may incur substantial storage overhead. In this paper, we explore techniques to dynamically compress distributed provenance stored at scale. Logically, the compression is achieved by grouping equivalent provenance trees and maintaining only one concrete copy for each equivalence class. To efficiently identify equivalent provenance, we (1) introduce distributed event-based linear programs (DELPs) to specify distributed network applications, and (2) statically analyze DELPs to allow for quick detection of provenance equivalence at runtime. Our experimental results demonstrate that our approach leads to significant storage reduction and query latency improvement over alternative approaches.
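
The grouping idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the class `CompressedProvenanceStore`, its methods, the nested-tuple tree encoding, and the use of a (source, destination) pair as the equivalence-class key are all hypothetical choices made for illustration. In particular, the sketch assumes the caller already knows each event's equivalence class, whereas the paper derives this from static analysis of DELPs.

```python
from typing import Dict, Hashable, Tuple

# Hypothetical encoding: a provenance tree is a nested tuple
# (rule_or_event_name, child, child, ...).
Tree = Tuple


class CompressedProvenanceStore:
    """Keeps one concrete provenance tree per equivalence class."""

    def __init__(self) -> None:
        self._shared: Dict[Hashable, Tree] = {}        # class key -> the single stored tree
        self._event_class: Dict[str, Hashable] = {}    # event id  -> class key

    def record(self, event_id: str, class_key: Hashable, tree: Tree) -> bool:
        """Register an event; return True only if a new tree had to be stored."""
        self._event_class[event_id] = class_key
        if class_key in self._shared:
            return False  # an equivalent tree is already stored, keep only the pointer
        self._shared[class_key] = tree
        return True

    def query(self, event_id: str) -> Tree:
        """Return the shared tree that stands in for this event's provenance."""
        return self._shared[self._event_class[event_id]]

    def stored_trees(self) -> int:
        return len(self._shared)


# Two packets forwarded along the same path (assumed equivalent here)
# share a single stored tree.
store = CompressedProvenanceStore()
path = ("recv@n3", ("fwd@n2", ("fwd@n1", "route(n1,n3)")))
first_is_new = store.record("pkt1", ("n1", "n3"), path)
second_is_new = store.record("pkt2", ("n1", "n3"), path)
```

In this sketch the second packet costs only one dictionary entry rather than a full tree, which is the source of the storage saving; a query for either packet returns the same shared copy.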

        • Published in

          SIGMOD '17: Proceedings of the 2017 ACM International Conference on Management of Data
          May 2017
          1810 pages
          ISBN:9781450341974
          DOI:10.1145/3035918

          Copyright © 2017 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Acceptance Rates

          Overall Acceptance Rate: 785 of 4,003 submissions, 20%
