Biases in Data-Driven Networking, and What to Do About Them

Published: 30 November 2017
DOI: 10.1145/3152434.3152448

Abstract

Recent efforts highlight the promise of data-driven approaches to optimize network decisions. Many such efforts use trace-driven evaluation; i.e., running offline analysis on network traces to estimate the potential benefits of different policies before running them in practice. Unfortunately, such frameworks can have fundamental pitfalls (e.g., skews due to previous policies that were used in the data collection phase and insufficient data for specific subpopulations) that could lead to misleading estimates and ultimately suboptimal decisions. In this paper, we shed light on such pitfalls and identify a promising roadmap to address these pitfalls by leveraging parallels in causal inference, namely the Doubly Robust estimator.
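
To make the idea concrete, here is a minimal sketch of a doubly robust estimate computed from logged traces. It is not the paper's implementation. It uses the standard contextual-bandit form of the estimator and assumes that the new policy is deterministic and that each trace entry records the probability with which the old (logging) policy chose its action. All names (doubly_robust_value, the reward model, the example contexts and actions) are hypothetical.

```python
from typing import Callable, Hashable, Sequence, Tuple

# One logged decision: (context, action taken, observed reward, probability
# with which the logging policy chose that action in that context).
# This record layout is an assumption made for the sketch.
LogEntry = Tuple[Hashable, Hashable, float, float]


def doubly_robust_value(
    logs: Sequence[LogEntry],
    new_policy: Callable[[Hashable], Hashable],
    reward_model: Callable[[Hashable, Hashable], float],
) -> float:
    """Estimate the average reward of new_policy from data logged under another policy."""
    if not logs:
        raise ValueError("need at least one logged decision")
    total = 0.0
    for context, action, reward, propensity in logs:
        if propensity <= 0.0:
            raise ValueError("logged propensities must be positive")
        chosen = new_policy(context)
        # Direct-method term: what the reward model predicts for the new policy's choice.
        estimate = reward_model(context, chosen)
        # Correction term: when the log happens to contain the action the new
        # policy would take, reweight the model's residual by 1 / propensity.
        if chosen == action:
            estimate += (reward - reward_model(context, action)) / propensity
        total += estimate
    return total / len(logs)


# Hypothetical trace: the old policy strongly preferred "cdn_a" on cellular
# clients, so ("cell", "cdn_b") is rare in the data.
example_logs = [
    ("wifi", "cdn_a", 1.0, 0.5),
    ("wifi", "cdn_b", 0.0, 0.5),
    ("cell", "cdn_a", 0.0, 0.8),
    ("cell", "cdn_b", 1.0, 0.2),
]


def example_policy(context):
    # Candidate policy to evaluate offline.
    return "cdn_a" if context == "wifi" else "cdn_b"


def crude_model(context, action):
    # Deliberately uninformative reward model: predicts 0.5 everywhere.
    return 0.5


print(doubly_robust_value(example_logs, example_policy, crude_model))
```

The estimate remains consistent if either the reward model or the logged propensities are accurate, which is the sense in which it is "doubly" robust. With a reward model that always predicts zero, the function reduces to plain inverse-propensity weighting.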

Supplementary Material

MP4 File (bartulovic.mp4)

    Published In

    HotNets '17: Proceedings of the 16th ACM Workshop on Hot Topics in Networks
    November 2017
    206 pages
    ISBN: 9781450355698
    DOI: 10.1145/3152434
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    HotNets-XVI: The 16th ACM Workshop on Hot Topics in Networks
    November 30 - December 1, 2017
    Palo Alto, CA, USA

    Acceptance Rates

    HotNets '17 Paper Acceptance Rate 28 of 124 submissions, 23%;
    Overall Acceptance Rate 110 of 460 submissions, 24%

    Cited By

    • (2024) Magpie: Improving the Efficiency of A/B Tests for Large Scale Video-on-Demand Systems. Proceedings of the 2024 ACM on Internet Measurement Conference, pages 588-594. DOI: 10.1145/3646547.3689019. Online publication date: 4-Nov-2024.
    • (2024) Adapting Wireless Network Configuration From Simulation to Reality via Deep Learning-Based Domain Adaptation. IEEE/ACM Transactions on Networking, 32:3, pages 1983-1998. DOI: 10.1109/TNET.2023.3335346. Online publication date: Jun-2024.
    • (2024) Reducing Traffic Wastage in Video Streaming via Bandwidth-Efficient Bitrate Adaptation. IEEE Transactions on Mobile Computing, 23:11, pages 10361-10377. DOI: 10.1109/TMC.2024.3373498. Online publication date: Nov-2024.
    • (2023) Counterfactual identifiability of bijective causal models. Proceedings of the 40th International Conference on Machine Learning, pages 25733-25754. DOI: 10.5555/3618408.3619478. Online publication date: 23-Jul-2023.
    • (2023) Veritas: Answering Causal Queries from Video Streaming Traces. Proceedings of the ACM SIGCOMM 2023 Conference, pages 738-753. DOI: 10.1145/3603269.3604828. Online publication date: 10-Sep-2023.
    • (2022) A new hope for network model generalization. Proceedings of the 21st ACM Workshop on Hot Topics in Networks, pages 152-159. DOI: 10.1145/3563766.3564104. Online publication date: 14-Nov-2022.
    • (2021) Xatu: Richer Neural Network Based Prediction for Video Streaming. Proceedings of the ACM on Measurement and Analysis of Computing Systems, 5:3, pages 1-26. DOI: 10.1145/3491056. Online publication date: 15-Dec-2021.
    • (2021) Sayer. Proceedings of the ACM Symposium on Cloud Computing, pages 273-288. DOI: 10.1145/3472883.3487001. Online publication date: 1-Nov-2021.
    • (2020) Learning in situ. Proceedings of the 17th Usenix Conference on Networked Systems Design and Implementation, pages 495-512. DOI: 10.5555/3388242.3388279. Online publication date: 25-Feb-2020.
    • (2020) Pitfalls of data-driven networking. Proceedings of the Workshop on Network Meets AI & ML, pages 42-47. DOI: 10.1145/3405671.3405815. Online publication date: 10-Aug-2020.
