Redundancy-aware SOAP messages compression and aggregation for enhanced performance

https://doi.org/10.1016/j.jnca.2011.08.004

Abstract

Many organizations around the world have started to adopt Web services, as well as server farms and clouds hosted by large enterprises and data centers, for various applications. Web services offer several advantages over other communication technologies. However, they have high latency and often suffer from congestion and bottlenecks due to the massive load generated by Web service requests from large numbers of end users. SOAP (Simple Object Access Protocol) is the basic XML-based communication protocol of Web services. XML is a verbose encoding language in comparison with other technologies such as CORBA and RMI. In this paper, two new redundancy-aware SOAP Web message aggregation models, the Two-bit and One-bit XML status trees, are proposed to enable Web servers to aggregate SOAP responses and send them back as one compact aggregated message in order to reduce the required bandwidth and latency and to improve the overall performance of Web services. XML message compressibility, the Jaccard-based clustering technique, and the vector space model are three similarity measurements that are proposed to cluster SOAP messages into groups based on their degree of similarity. The clustering-based similarity measurements enable the aggregation techniques to potentially reduce the required network traffic by minimizing the overall size of the messages. The experiments show significant performance for both aggregation techniques, achieving compression ratios as high as 25 for aggregated SOAP messages.

Introduction

Web services are middleware that provide access to networked resources over the Internet with the support of network mechanisms and protocols such as HTTP and TCP (Rosu, 2007; Komathy et al., 2003; Madiraju et al., 2010). Generally, Web servers provide dynamically scalable services (responses) that are available on demand (requests) over the Internet (Christian Werner and Fischer, 2004; Diamadopoulou et al., 2008; Kuehnhausen and Frost, 2011; Subashini and Kavitha, 2011). SOAP (Simple Object Access Protocol) is the basic communication protocol of most Web services (Nakagawa et al., 2006; Hu et al., 2011). SOAP is based on XML (eXtensible Markup Language), which encodes the contents of Web messages sent and received over the Internet (Rosu, 2007). Recently, many network organizations have significantly increased their adoption of Web services on server farms and clouds, with the aim of providing the required services without investing heavily in computing infrastructure (Hartmut Liefke, 2000; AjayKumar et al., 2009; Bo et al., 2010). Understandably, this has contributed to the growth of Web services over the Internet.

SOAP has been developed to improve interoperability of Web services (Khoi Anh Phan and Bertok, 2008; Christian Werner and Fischer, 2004; Chonka et al., 2011; Gu et al., 2005). However, Web services inherit the disadvantages of SOAP, as messages are bigger than the real payload of the requested services (Rosu, 2007; Pastore, 2008; Ruiz-Martinez et al., 2011), which can cause high network traffic. As a result, Web services often suffer from congestion and bottlenecks due to the high number of client Web requests and the large size of Web messages (Nakagawa et al., 2006). This can considerably slow down the performance of Web applications (Christian Werner and Fischer, 2004; Khoi Anh Phan and Bertok, 2008; Hsu et al., 2009; Hu and Cho, 2011).

Several compression techniques (Christian Werner and Fischer, 2004; Rosu, 2007) and textual aggregation models (Khoi Anh Phan and Bertok, 2008) have been developed to reduce the size of the messages. For example, XMill (Hartmut Liefke, 2000) distributes the XML tags into different containers and compresses them using semantic compressors. Differential encoding (Christian Werner and Fischer, 2004) computes the differences between the current message and the previous one and compresses only those differences, which reduces the computational overhead. A similarity-based aggregation technique (Khoi Anh Phan and Bertok, 2008) aims to reduce network traffic by combining similar messages and delivering the compact message using a multicast protocol. Although these techniques are to some extent capable of enhancing the performance of Web services, they still suffer from a few technical drawbacks:

  • Some are considered to be storage consuming as they are either based on dictionary type approaches or depend on log files for the sent/received messages.

  • Although both compression and aggregation models have similar objectives and exploit similarity within the message itself (redundancy in compression) or with other messages (aggregation), they have failed to take advantage of each other to achieve higher performance.

An XML binary tree structure based aggregation model by Al-Shammary and Khalil (2010a) was developed with the aim of providing a high compression ratio for aggregated messages. The Two-bit and One-bit compression techniques (Al-Shammary and Khalil, 2010b) are general tree-structure-based models that can significantly compress individual XML messages. In this paper, new Two-bit and One-bit aggregation models are proposed that exploit the redundancies found in SOAP messages to reduce the aggregated message size. Further message size reduction is achieved using compression. The objective of the proposed models is to provide an efficient aggregation that can significantly reduce the size of the messages. The Two-bit and One-bit status XML tree aggregation techniques aim to enable Web servers to aggregate a group of messages that have a certain degree of similarity and send them as one compact message, minimizing the network traffic. Figure 1 shows how the compression and aggregation schemes help reduce the high network traffic created by Web requests/responses. The resultant aggregated messages of SOAP responses are extractable at the routers closest to the receivers (clients), so that only the required response is delivered to each client. An XML-aware compression technique is developed to exploit the redundancy of different SOAP messages, creating one compact message structure for the Web messages.

Three similarity measurements of SOAP messages are introduced in order to investigate the best similarity-based clustering model, that is, one that can group messages with a significant degree of similarity and so enable the aggregation techniques to reduce message size. Compressibility measurement, the Jaccard coefficient (Wang and Li, 2009), and the vector space technique (Liu et al., 2010) have been developed in order to cluster SOAP messages based on their similarity. Compressibility measurement investigates the size reduction that can be achieved with SOAP message pairs. The Jaccard coefficient and vector space techniques are proposed to group SOAP messages into larger clusters of predefined size (not only pairs).
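As a rough illustration of the two set- and vector-based measurements named above, the following minimal sketch computes a Jaccard coefficient and a cosine (vector space) similarity for pairs of XML messages. The tokenization rule, the function names, and the sample messages are assumptions made for this example only; the paper's own feature extraction and clustering procedure are not reproduced here.

```python
import math
import re
from collections import Counter


def tokens(message):
    """Split a message into tag names and text terms.

    Assumption: any run of letters, digits, '_', ':' or '.' is one token.
    """
    return re.findall(r"[A-Za-z0-9_:.]+", message)


def jaccard(a, b):
    """Jaccard coefficient: shared distinct tokens over all distinct tokens."""
    set_a, set_b = set(tokens(a)), set(tokens(b))
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def cosine(a, b):
    """Cosine similarity between term-frequency vectors of two messages."""
    vec_a, vec_b = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical messages: two stock quote responses and one unrelated response.
msg_a = "<getQuote><symbol>IBM</symbol><price>120.5</price></getQuote>"
msg_b = "<getQuote><symbol>SUN</symbol><price>9.75</price></getQuote>"
msg_c = "<getWeather><city>Oslo</city></getWeather>"

print(jaccard(msg_a, msg_b), cosine(msg_a, msg_b))
print(jaccard(msg_a, msg_c), cosine(msg_a, msg_c))
```

In this sketch the two quote responses share their tag names and differ only in their values, so both measures rate them as similar, while the weather response shares no tokens with them. A clustering step could then group messages whose pairwise score exceeds a chosen threshold.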

Evaluation of the proposed techniques shows promising results and demonstrates that aggregation techniques can achieve significantly higher compression ratios for similar SOAP messages than compressing them separately. The compression ratios that can be achieved by aggregating clustered messages have been investigated, and the size reduction of the aggregated SOAP message is potentially higher than that obtained by compressing the messages separately. Aggregation of SOAP messages is computed with clusters varying between two and ten messages per cluster. Vector space model clustering has been shown to be slightly better in supporting the proposed aggregation techniques to reduce the overall size of the aggregated SOAP messages. Furthermore, experiments show that vector space model clustering requires significantly less processing time than the Jaccard-based clustering technique. The proposed Two-bit and One-bit XML status tree aggregation techniques are compared with the Binary Tree based aggregation technique (Al-Shammary and Khalil, 2010a), and both models have shown potentially higher performance in terms of the resultant compression ratios and the processing time required to aggregate the clustered SOAP messages.
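The intuition that similar messages shrink more when handled together than when compressed one by one can be checked with a small experiment. The sketch below is not the Two-bit or One-bit aggregation model; it uses Python's general-purpose zlib compressor as a stand-in, and the sample responses are hypothetical. The compression ratio is assumed here to mean original size divided by compressed size.

```python
import zlib


def compression_ratio(original, compressed):
    """Assumed definition: original size divided by compressed size."""
    return len(original) / len(compressed)


# Hypothetical SOAP responses that differ only in their payload values.
quotes = [("IBM", "120.5"), ("SUN", "9.75"), ("HPQ", "31.2"), ("MSFT", "27.9")]
responses = [
    (
        '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
        "<soap:Body><getQuoteResponse>"
        f"<symbol>{symbol}</symbol><price>{price}</price>"
        "</getQuoteResponse></soap:Body></soap:Envelope>"
    ).encode("utf-8")
    for symbol, price in quotes
]

# Compress every response on its own and add up the compressed sizes.
separate_size = sum(len(zlib.compress(r)) for r in responses)

# Concatenate the responses and compress them as one block, so the
# compressor can reuse the markup shared between the messages.
joint = b"".join(responses)
joint_compressed = zlib.compress(joint)
joint_size = len(joint_compressed)

print("separate:", separate_size, "bytes")
print("joint:   ", joint_size, "bytes")
print("joint ratio:", round(compression_ratio(joint, joint_compressed), 2))
```

Because the envelope and tag names repeat across the responses, the jointly compressed block is smaller than the sum of the individually compressed messages. Plain concatenation loses the message boundaries, so a real aggregation scheme also needs a way to extract each client's response again.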

The rest of this paper is organized as follows. Section 2 discusses related work. Section 3 explains the compressibility measurements of SOAP messages. Next, Section 4 states the development of Jaccard based clustering technique. Then, Section 5 explains the vector space model and its clustering model for grouping SOAP messages based on their cosine similarity degrees. Section 6 shows the structure of the XML tree and Section 7 explains the assigning process of the XML tree. Section 8 describes the encoding and aggregation process of the XML trees. The evaluation of the proposed techniques is depicted in Section 9. Finally, Section 10 concludes the paper.

Section snippets

Related work

In order to enhance the performance of SOAP Web services, compression of standalone (Hartmut Liefke, 2000, Christian Werner and Fischer, 2004) Web messages and textual aggregation models (Khoi Anh Phan and Bertok, 2008, Al-Shammary and Khalil, 2010a) have been proposed. In these works, the compression exploits the self-similarity of SOAP messages in order to reduce the overall message size while the textual aggregation techniques are mainly based on computing the similar content (tags and data

Similarity measurements and clustering of SOAP messages

Similarity-based clustering of SOAP messages represents a de facto operation for aggregation approaches: clustering messages with a high level of similarity strengthens aggregation and results in a high size reduction. In this paper, we first introduce compressibility for pairs of SOAP messages as a simple and effective similarity measurement tool to support the proposed compression-based aggregation technique by computing the compressibility of messages. Jaccard coefficients are well-known for

XML tree structure and compression

Generally, compression techniques have been used to enhance the performance of Web services by reducing the overall size of SOAP messages over the Internet to minimize the network traffic. These Web messages can be represented as a tree data structure which is a motivating factor for this work when designing the proposed redundancy-based aggregation model. The XML tree structure reduces the total number of SOAP message tags by keeping one occurrence in the tree and removing all duplicate

XML tree traversing and assignments

It is required to generate the XML minimized text expression (using the XML tree) in such a way that guarantees rebuilding the XML tree again in order to regenerate the original SOAP message. In this model, depth-first and breadth-first traversals are proposed to generate the minimized XML text expression by assigning all tags with binary codes to enable rebuilding of the XML tree by recognizing the correct position of each tag.
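The two traversal orders mentioned above can be sketched with Python's standard xml.etree.ElementTree module. This example only lists the tags in depth-first and breadth-first order for a hypothetical message; the binary status codes that the model assigns to tags are not described in this excerpt and are therefore not reproduced.

```python
import xml.etree.ElementTree as ET
from collections import deque


def depth_first_tags(root):
    """Return tag names in depth-first (pre-order) sequence."""
    order = []
    stack = [root]
    while stack:
        node = stack.pop()
        order.append(node.tag)
        # Push children in reverse so the leftmost child is visited first.
        stack.extend(reversed(list(node)))
    return order


def breadth_first_tags(root):
    """Return tag names level by level, left to right."""
    order = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        order.append(node.tag)
        queue.extend(node)  # iterating an Element yields its children
    return order


# Hypothetical message, simplified (no namespaces) for readability.
message = (
    "<Envelope><Header><id>7</id></Header>"
    "<Body><quote><symbol>IBM</symbol><price>120.5</price></quote></Body>"
    "</Envelope>"
)
root = ET.fromstring(message)
dfs_order = depth_first_tags(root)
bfs_order = breadth_first_tags(root)

print("depth-first:  ", dfs_order)
print("breadth-first:", bfs_order)
```

The tag sequence alone does not determine the tree shape, since different trees can produce the same order. This is why a text expression intended for rebuilding the tree must carry extra position information alongside each tag.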

Aggregation of SOAP expressions

Encoding of the XML textual expression is the final step of the proposed model that generates the final compact version of the considered messages and represents the core component of the aggregation model. Fixed and variable length encoding techniques are proposed to generate the aggregated compact message from the combined textual expressions. Both encodings are well-known as lossless compression techniques that can remove the redundancies of letters by assigning binary codes for these

Experiments and discussion

In the evaluation of the proposed aggregation models, we have considered a variety of SOAP message sizes that range from only 140 bytes to 53 kbytes in order to show the efficiency of the models on small messages as well as large ones. The objective of considering small messages is to investigate the fact that lossless encodings usually create large lookup tables in comparison to the encoded part of the input message that could cause in many cases an even larger encoded message than the

Conclusion and future work

XML-aware compression techniques can be developed into efficient SOAP aggregation models that can exploit redundancies in several SOAP messages. In this paper, we have shown that redundancy-based aggregation techniques can outperform all the standalone compression techniques by achieving higher compression ratios for messages of all sizes: small, medium, large and very large. The performance of the Web services can be improved by applying the redundancy-aware aggregation models enabling Web

References (25)

  • Chonka A, Xiang Y, Zhou W, Bonti A. Cloud security defence to protect cloud computing against HTTP-DoS and XML-DoS...
  • M. Chen et al.

    Summarization of text clustering based vector space model

Cited by (8)

    • SMCA: An efficient SOAP messages compression and aggregation technique for improving web services performance

      2019, Journal of Parallel and Distributed Computing
      Citation Excerpt :

      This model supports Huffman compression based on an aggregation tool. Two grouping models proposed by [3,37] show a reduction of XML Web messages size of 30% in comparison with the vector space model [7] and the dynamic fractal model [8]. A new version of dynamic grouping XML messages based on compression and an aggregation model [3] is proposed in this paper.

    • A distributed aggregation and fast fractal clustering approach for SOAP traffic

      2014, Journal of Network and Computer Applications
      Citation Excerpt :

      Therefore, a one pass clustering technique is a requirement for improving the clustering time of Web messages. This section overviews the basic aggregation model suggested in Al-Shammary and Khalil (2012), used as the core component of the proposed aggregation model. Redundancy-aware aggregation of SOAP messages starts with building the XML trees for the aggregating messages.

    • Fractal self-similarity measurements based clustering technique for SOAP Web messages

      2013, Journal of Parallel and Distributed Computing
      Citation Excerpt :

      Web services usually suffer from bottlenecks and congestion as a result of high network traffic caused by Web applications like stock quote service [14,3]. In our previous work [2,4], we introduced a new SOAP message aggregation strategy based on utilizing compression concepts. This aggregation model is strengthened by the redundancy awareness feature of compression as an alternative similarity measurement to aggregate messages into one compact structure.

    • A Novel Lossless EEG Compression Model Using Fractal Combined with Fixed-Length Encoding Technique

      2022, Lecture Notes on Data Engineering and Communications Technologies