Abstract
Stream processing is a new computing paradigm that enables continuous and fast analysis of massive volumes of streaming data. Debugging streaming applications is not trivial, since they are typically distributed across multiple nodes and handle large amounts of data. Traditional debugging techniques like breakpoints often rely on a stop-the-world approach, which may be useful for debugging single-node applications but is insufficient for streaming applications. We propose a new visual and analytic environment to support debugging, performance analysis, and troubleshooting for stream processing applications. Our environment provides several visualization methods to study, characterize, and summarize the flow of tuples between stream processing operators. The user can interactively indicate points in the streaming application from which tuples will be traced and visualized as they flow through different operators, without stopping the application. To substantiate our discussion, we illustrate several of these features in the context of a financial engineering application.
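The idea of tracing tuples from a user-chosen point without halting the application can be illustrated with a minimal sketch. It is not the environment described in the paper: the `Pipeline`, `Operator`, and `set_trace_point` names, the single-process design, and the price values are all assumptions made for illustration. Tuples that pass an active trace point are tagged with an ID, and every operator they subsequently traverse records a hop, while untraced tuples flow through unaffected.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class StreamTuple:
    value: Any
    trace_id: Optional[int] = None  # set once the tuple passes an active trace point


@dataclass
class Operator:
    name: str
    fn: Callable[[Any], Optional[Any]]  # returning None drops the tuple
    trace_point: bool = False


class Pipeline:
    """Hypothetical in-process pipeline; a real system is distributed."""

    def __init__(self, operators):
        self.operators = operators
        self.trace_log = []  # entries: (trace_id, operator name, output value)
        self._next_id = 0

    def set_trace_point(self, name, enabled=True):
        # Can be toggled between pushes; the pipeline is never paused.
        for op in self.operators:
            if op.name == name:
                op.trace_point = enabled

    def push(self, value):
        tup = StreamTuple(value)
        for op in self.operators:
            # Start tracing the tuple the first time it reaches a trace point.
            if op.trace_point and tup.trace_id is None:
                tup.trace_id = self._next_id
                self._next_id += 1
            out = op.fn(tup.value)
            if tup.trace_id is not None:
                self.trace_log.append((tup.trace_id, op.name, out))
            if out is None:  # tuple was filtered out
                return None
            tup.value = out
        return tup.value


# Illustrative operators loosely modelled on a price feed (assumed, not from the paper).
pipeline = Pipeline([
    Operator("parse", lambda s: float(s)),
    Operator("filter", lambda p: p if p > 100.0 else None),
    Operator("scale", lambda p: p * 2),
])

untraced = pipeline.push("150")      # no trace point is active yet
pipeline.set_trace_point("filter")   # enable tracing while the pipeline keeps running
traced = pipeline.push("120")        # traced from "filter" onward
dropped = pipeline.push("50")        # traced, then dropped by "filter"
```

In this sketch the trace log is what a visualization layer would consume: each entry says which traced tuple reached which operator and what that operator emitted.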
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
De Pauw, W. et al. (2010). Visual Debugging for Stream Processing Applications. In: Barringer, H., et al. Runtime Verification. RV 2010. Lecture Notes in Computer Science, vol 6418. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16612-9_3
DOI: https://doi.org/10.1007/978-3-642-16612-9_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16611-2
Online ISBN: 978-3-642-16612-9
eBook Packages: Computer Science (R0)