Precision, recall, and sensitivity of monitoring partially synchronous distributed programs

Distributed Computing

Abstract

Distributed programs are often designed with implicit assumptions about the underlying system. We focus on assumptions related to clock synchronization. When a program written with clock synchronization assumptions is monitored to determine if it satisfies its requirements, the monitor should also account for these assumptions precisely. Otherwise, the monitor will either miss potential bugs (false negatives) or find bugs that are inconsistent with these assumptions (false positives). However, if assumptions made by the program are implicit or change over time and are not immediately available to the monitor, such false positives and/or negatives are unavoidable. This paper characterizes precision (probability that the violation identified by the monitor is valid) and recall (probability that the monitor identifies an actual violation) of the monitor based on the gap between clock synchronization assumptions made by the program/application and the clock synchronization assumptions made by the monitor. Our analysis is based on the development of an analytical model for precision, recall and sensitivity of monitors detecting conjunctive predicates. We validate the model via simulations and experiments on the Amazon Web Services platform.
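
The effect of a gap between the two synchronization assumptions can be illustrated with a toy sketch. It is not the paper's analytical model. It assumes a two-process conjunctive predicate, where each process reports one timestamped interval during which its local predicate holds, and it treats two intervals as possibly concurrent when they overlap after allowing for a clock skew of at most `eps`. The names `possibly_concurrent`, `precision_recall`, `eps_app`, and `eps_mon` are hypothetical and introduced only for this illustration.

```python
from typing import List, Optional, Tuple

Interval = Tuple[float, float]  # (start, end) timestamps of a local-predicate interval


def possibly_concurrent(a: Interval, b: Interval, eps: float) -> bool:
    """Assumed overlap test: the intervals may overlap in real time
    if the clocks of the two processes differ by at most eps."""
    (s1, e1), (s2, e2) = a, b
    return s1 <= e2 + eps and s2 <= e1 + eps


def precision_recall(
    pairs: List[Tuple[Interval, Interval]], eps_app: float, eps_mon: float
) -> Tuple[Optional[float], Optional[float]]:
    """Compare what the monitor reports (using eps_mon) against what counts
    as a violation under the application's assumption (eps_app)."""
    actual = {i for i, (a, b) in enumerate(pairs) if possibly_concurrent(a, b, eps_app)}
    reported = {i for i, (a, b) in enumerate(pairs) if possibly_concurrent(a, b, eps_mon)}
    true_pos = len(actual & reported)
    # None marks an undefined ratio (nothing reported, or nothing to find).
    precision = true_pos / len(reported) if reported else None
    recall = true_pos / len(actual) if actual else None
    return precision, recall


# Hypothetical interval pairs: overlapping, 2 ticks apart, and 15 ticks apart.
pairs = [((0, 5), (3, 8)), ((0, 5), (7, 9)), ((0, 5), (20, 25))]

# Monitor assumes tighter synchronization than the application: missed violations.
tight = precision_recall(pairs, eps_app=3, eps_mon=1)
# Monitor assumes looser synchronization than the application: spurious violations.
loose = precision_recall(pairs, eps_app=1, eps_mon=3)
print(tight, loose)
```

In this toy setting, a monitor that assumes a smaller skew than the application loses recall, while a monitor that assumes a larger skew loses precision, which mirrors the false negatives and false positives described above.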


Notes

  1. We provide an interpretation of a clock tick in terms of actual elapsed time in Sect. 3.3.3. Furthermore, the exact discretization of the clock has a negligible effect on the computed precision/recall.

  2. We use \(\phi '\), \(\phi ''\), and \(\phi '''\) to denote the first (\(\frac{\partial \phi }{\partial \epsilon }\)), second (\(\frac{\partial ^2 \phi }{\partial \epsilon ^2}\)), and third (\(\frac{\partial ^3 \phi }{\partial \epsilon ^3}\)) partial derivatives of \(\phi \) with respect to \(\epsilon \).

  3. Note that our analysis is based on the property of the monitor and, hence, we do not consider how the monitoring algorithm can be evaluated/implemented most efficiently.

  4. Note that this property is not guaranteed even with physical clocks, because a message send event and the corresponding message receive event can have equal physical timestamps due to clock skew, so events with equal physical timestamps may not be concurrent events.

Acknowledgements

This work is supported in part by NSF CNS-1329807, NSF CNS-1318678, NSF XPS-1533870, and NSF XPS-1533802.

Author information

Corresponding author

Correspondence to Duong Nguyen.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Nguyen, D., Yingchareonthawornchai, S., Tekken Valapil, V. et al. Precision, recall, and sensitivity of monitoring partially synchronous distributed programs. Distrib. Comput. 34, 319–348 (2021). https://doi.org/10.1007/s00446-021-00402-w

  • DOI: https://doi.org/10.1007/s00446-021-00402-w
