keynote

Measurement Challenges for Cyber Cyber Digital Twins: Experiences from the Deployment of Facebook's WW Simulation System

Authors:
K. Bojarczuk (Facebook Inc. UK)
N. Gucevska (Facebook Inc. UK)
S. Lucas (Facebook Inc. UK)
I. Dvortsova (Facebook Inc. UK)
M. Harman (Facebook Inc. UK)
E. Meijer (Facebook Inc. UK)
S. Sapora (Facebook Inc. UK)
J. George (Facebook Inc. UK)
M. Lomeli (Facebook Inc. UK)
R. Rojas (Facebook Inc. UK)

ESEM '21: Proceedings of the 15th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), October 2021, Article No. 2, Pages 1–10. https://doi.org/10.1145/3475716.3484196

Published: 11 October 2021

ABSTRACT

A cyber cyber digital twin is a deployed software model that executes in tandem with the system it simulates, contributing to, and drawing from, the system's behaviour. This paper outlines Facebook's cyber cyber digital twin, dubbed WW, a twin of Facebook's WWW platform, built using web-enabled simulation. The paper focuses on the current research challenges and opportunities in the area of measurement. Measurement challenges lie at the heart of modern simulation. They directly impact how we use simulation outcomes for automated online and semi-automated offline decision making. Measurements also encompass how we verify and validate those outcomes. As modern simulation systems increasingly resemble cyber cyber digital twins, effectively moving from manual to automated decision making, these measurement challenges acquire ever greater significance.

Index Terms

1. Software and its engineering
  1. Software creation and management
  2. Software organization and properties
    1. Extra-functional properties

Published in
ESEM '21: Proceedings of the 15th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM)
October 2021
368 pages
ISBN:9781450386654
DOI:10.1145/3475716
General Chair:
Filippo Lanubile
Copyright © 2021 Owner/Author
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Digital Twin
Simulation
Social Media
Software Measurement
Qualifiers
- keynote
- Research
- Refereed limited
Acceptance Rates
ESEM '21 Paper Acceptance Rate: 24 of 124 submissions, 19%
Overall Acceptance Rate: 130 of 594 submissions, 22%