Interaction Traces Mining for Efficient System Responses Generation

Published: 06 February 2015

Abstract

Software service emulation is an emerging technique for creating realistic executable models of server-side behaviour. It is particularly useful in quality assurance and DevOps, where it replicates production-like conditions for large-scale enterprise software systems. Existing approaches can automatically build client-server and server-server interaction models of complex software systems directly from analysis of service interaction trace data. However, when these interaction traces become large, searching an entire trace library to generate a run-time response can become very slow. In this paper we describe a new technique that utilises data mining, specifically clustering algorithms, to pre-process large amounts of recorded interaction trace data. With the obtained clusters we facilitate efficient yet well-formed run-time response generation in our enterprise system emulation environment. We evaluate our approach using two common application-layer protocols: LDAP and SOAP. Our experimental results show that by utilising clustering techniques in the pre-processing step, the response generation time can be reduced by 99% on average compared with existing approaches.
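
As a rough illustration of the idea in the abstract, the following Python sketch clusters recorded request/response pairs offline. At run time it compares an incoming request against one representative per cluster, then searches only inside the best-matching cluster. This is not the paper's implementation. The greedy threshold clustering, the use of `difflib.SequenceMatcher` as the similarity measure, the threshold of 0.75, the function names, and the text-form LDAP-like messages are all assumptions made for this example.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; an assumed stand-in for the real distance measure."""
    return SequenceMatcher(None, a, b).ratio()


def cluster_traces(traces, threshold=0.75):
    """Offline step (hypothetical): greedy threshold clustering.

    Each recorded (request, response) pair joins the first cluster whose
    representative request is similar enough, otherwise it starts a new one.
    """
    clusters = []  # each cluster: {"rep": request, "members": [(request, response)]}
    for request, response in traces:
        for cluster in clusters:
            if similarity(request, cluster["rep"]) >= threshold:
                cluster["members"].append((request, response))
                break
        else:
            clusters.append({"rep": request, "members": [(request, response)]})
    return clusters


def generate_response(clusters, incoming):
    """Run-time step: choose the closest cluster by its representative, then
    return the response recorded for the most similar request in that cluster."""
    best = max(clusters, key=lambda c: similarity(incoming, c["rep"]))
    _, response = max(best["members"], key=lambda m: similarity(incoming, m[0]))
    return response


# Made-up, text-form traces loosely modelled on LDAP operations.
traces = [
    ("searchRequest base=ou=staff filter=cn=alice", "searchResEntry cn=alice"),
    ("searchRequest base=ou=staff filter=cn=bob", "searchResEntry cn=bob"),
    ("addRequest dn=cn=carol,ou=staff", "addResponse success"),
    ("addRequest dn=cn=dave,ou=staff", "addResponse success"),
]

clusters = cluster_traces(traces)
print(len(clusters))  # -> 2 (one cluster of searches, one of adds)
print(generate_response(clusters, "searchRequest base=ou=staff filter=cn=bobby"))
```

With k clusters, a lookup in this sketch costs k comparisons against representatives plus a scan of one cluster, rather than a scan of the whole trace library. The sketch replays the closest recorded response verbatim and makes no attempt to adapt it to the incoming request.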

Published In

ACM SIGSOFT Software Engineering Notes, Volume 40, Issue 1
January 2015
237 pages
ISSN: 0163-5948
DOI: 10.1145/2693208

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 06 February 2015
Published in SIGSOFT Volume 40, Issue 1

Author Tags

  1. Automatic modelling
  2. Interaction emulation
  3. Service emulation
  4. Traces clustering

Qualifiers

  • Research-article
