DOI: 10.1145/3475716.3484196
keynote

Measurement Challenges for Cyber Cyber Digital Twins: Experiences from the Deployment of Facebook's WW Simulation System

Published: 11 October 2021

ABSTRACT

A cyber cyber digital twin is a deployed software model that executes in tandem with the system it simulates, contributing to, and drawing from, the system's behaviour. This paper outlines Facebook's cyber cyber digital twin, dubbed WW, a twin of Facebook's WWW platform, built using web-enabled simulation. The paper focuses on the current research challenges and opportunities in the area of measurement. Measurement challenges lie at the heart of modern simulation. They directly impact how we use simulation outcomes for automated online and semi-automated offline decision making. Measurements also encompass how we verify and validate those outcomes. Modern simulation systems are becoming increasingly like cyber cyber digital twins, effectively moving from manual to automated decision making; hence, these measurement challenges acquire ever greater significance.

      • Published in
        ESEM '21: Proceedings of the 15th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM)
        October 2021
        368 pages
        ISBN:9781450386654
        DOI:10.1145/3475716

        Copyright © 2021 Owner/Author

        Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Qualifiers

        • keynote
        • Research
        • Refereed limited

        Acceptance Rates

        ESEM '21 Paper Acceptance Rate: 24 of 124 submissions, 19%
        Overall Acceptance Rate: 130 of 594 submissions, 22%