Efficient probe selection algorithms for fault diagnosis

Telecommunication Systems

Abstract

The growing use of networks for performance-critical applications has created a demand for tools that can monitor network health with minimal management traffic. Adaptive probing has the potential to provide effective tools for end-to-end monitoring and fault diagnosis over a network. Adaptive-probing-based algorithms adapt the probe set to localize faults in the network, sending fewer probes in healthy areas and more probes in suspected areas of failure. In this paper we present adaptive probing tools that provide an effective and efficient solution for fault diagnosis in modern communication systems. We present a system architecture for an adaptive-probing-based fault diagnosis tool and propose probe selection algorithms for failure detection and fault localization. We compare the performance and efficiency of the proposed algorithms through simulation results.
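The two phases named in the abstract, failure detection and fault localization, can be illustrated with a minimal Python sketch. This is not the paper's algorithm: it assumes a greedy set-cover heuristic for detection and a simple elimination rule for localization, and it assumes a probe fails if and only if at least one node on its path has failed. The names `select_detection_probes`, `localize_faults`, `send_probe`, and the five-probe topology are all hypothetical.

```python
from typing import Callable, Dict, List, Set, Tuple

def select_detection_probes(probes: Dict[str, Set[str]], nodes: Set[str]) -> List[str]:
    """Greedy set cover (an assumed heuristic): pick probes until every
    coverable node lies on at least one selected probe path."""
    uncovered = set(nodes)
    chosen: List[str] = []
    while uncovered:
        best = max(sorted(probes), key=lambda p: len(probes[p] & uncovered))
        gain = probes[best] & uncovered
        if not gain:
            break  # the remaining nodes are on no probe path
        chosen.append(best)
        uncovered -= gain
    return chosen

def localize_faults(
    probes: Dict[str, Set[str]],
    detection_set: List[str],
    send_probe: Callable[[str], bool],
) -> Tuple[Set[str], Set[str], Dict[str, bool]]:
    """Send the detection probes, then send extra probes only through
    nodes that are still suspected. Returns (failed, unresolved, results)."""
    healthy: Set[str] = set()
    results: Dict[str, bool] = {}

    def send(p: str) -> None:
        results[p] = send_probe(p)
        if results[p]:
            healthy.update(probes[p])  # a successful probe clears its whole path

    def diagnose() -> Tuple[Set[str], Set[str]]:
        failed: Set[str] = set()
        candidates: Set[str] = set()
        for p, ok in results.items():
            if not ok:
                remaining = probes[p] - healthy
                candidates |= remaining
                if len(remaining) == 1:
                    failed |= remaining  # only one node can explain this failure
        return failed, candidates - failed

    for p in detection_set:
        send(p)
    while True:
        failed, unresolved = diagnose()
        extra = next(
            (p for p in sorted(probes) if p not in results and probes[p] & unresolved),
            None,
        )
        if extra is None:
            return failed, unresolved, results
        send(extra)

# Hypothetical topology: each probe maps to the set of nodes on its path.
probes = {
    "p1": {"A", "B", "C"},
    "p2": {"A", "D", "E"},
    "p3": {"A", "B"},
    "p4": {"A"},
    "p5": {"A", "D"},
}
nodes = {"A", "B", "C", "D", "E"}
detection = select_detection_probes(probes, nodes)

# Simulate a failure of node C: a probe succeeds only if C is not on its path.
failed, unresolved, results = localize_faults(
    probes, detection, lambda p: "C" not in probes[p]
)
print(detection, failed, sorted(results))
```

In this example only the failed probe's path receives an additional probe, while the healthy part of the network (nodes D and E) is never probed beyond the detection set. That is the behaviour the abstract describes for adaptive probing, though the actual selection criteria in the paper may differ.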

Author information

Corresponding author

Correspondence to Maitreya Natu.

Additional information

Prepared through collaborative participation in the Communications and Networks Consortium sponsored by the US Army Research Laboratory under the Collaborative Technology Alliance Program, Cooperative Agreement DAAD19-01-2-0011. The US Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation thereon.

About this article

Cite this article

Natu, M., Sethi, A.S. & Lloyd, E.L. Efficient probe selection algorithms for fault diagnosis. Telecommun Syst 37, 109–125 (2008). https://doi.org/10.1007/s11235-008-9069-1

  • DOI: https://doi.org/10.1007/s11235-008-9069-1
