Elsevier

Applied Soft Computing

Volume 9, Issue 3, June 2009, Pages 1177-1183

Novel hybrid approach to data-packet-flow prediction for improving network traffic analysis

https://doi.org/10.1016/j.asoc.2009.03.003

Abstract

Forecasting the flow of data packets between client and server for network traffic analysis is viewed as a part of web analytics. Thousands of web-smart businesses depend on web analytics to improve website conversions, reduce marketing costs, facilitate website optimization, speed up website monitoring, and provide a higher level of service to their customers and partners. This paper aims to develop a highly accurate prediction method as one of the core components of network traffic analysis. In this study, a novel hybrid approach, combining an adaptive neuro-fuzzy inference system (ANFIS) with nonlinear generalized autoregressive conditional heteroscedasticity (NGARCH), is tuned optimally by quantum minimization (QM) and then applied to forecasting the flow of data packets around a website. The composite model (QM-ANFIS/NGARCH) is set up from a forecasting point of view to improve predictive accuracy, because it can resolve the problems of overshoot and volatility clustering simultaneously within a time series. As part of real-time intelligence web analytics, the highly accurate prediction will help webmasters improve the throughput of data-packet flow by up to around 20%, helping each webmaster to optimize their website, maximize online marketing conversions, and lead campaign tracking.

Introduction

A webmaster does not want to spend money or time on a website that sits idle without yielding an enquiry or a sale. In other words, webmasters want to apply website analytics (or a free counter) to design websites that are integrated with effective search engine marketing strategies to generate traffic and convert that traffic into sales [1]. What is needed is website analytics that provides detailed return-on-investment analysis for an unlimited number of search engine advertising, banner advertising, affiliate marketing, or email marketing campaigns, together with click-in and click-out tracking, combined with website statistics [2]. Therefore, network traffic analysis has become a trusted standard task in website statistics for various internet companies such as travel sites, dating sites, and online shops. This is because website tracking capacity helps each webmaster to optimize their website, maximize online marketing conversions, and lead campaign tracking. However, website tracking or network traffic analysis is closely related to the flow of data packets between hosts. In particular, a look-ahead prediction that forecasts the flow of data packets in the upcoming time interval is considered a measure for foreseeing any possible fluctuation in the flow immediately.

Research on the predictability of network traffic [3], [4] has argued that prediction models with short-range-dependent behavior can capture the statistics of (self-similar) traffic quite accurately for the limited time scales of interest in measurement-based traffic management. Those studies [3], [4] also found that the applicability of traffic prediction is limited by prediction accuracy that deteriorates as the prediction interval increases. This implies that short-term, non-periodic forecasting over limited time scales could deliver reliable network traffic prediction. Furthermore, a traffic model called fractional autoregressive integrated moving-average (FARIMA) has been proposed for analyzing long-range or short-range dependency in network traffic traces [5], because FARIMA fits actual traffic traces very well. However, fitting a traffic trace takes a lot of computation time because of the complexity of modeling the FARIMA structure [6]. In essence, an improved approach to FARIMA modeling [6] transfers the FARIMA problem into an ARMA problem, so that any algorithm (not necessarily related to the method used for fitting the ARMA model) can be used to determine the Hurst parameter. This approach first determines the Hurst parameter (differencing level d) before fitting the ARMA model, thus reducing the number of plausible models to be examined and the time needed to identify the model iteratively. Accordingly, one can reduce the time needed to build traffic models and to do real-time modeling. In other words, an ARMA model can in fact substitute for the original FARIMA model when simulating actual traffic traces, with less computational burden.
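To make the reduction from FARIMA to ARMA more concrete, the following is a minimal Python sketch of fractional differencing, the step that removes the long-range component once d is known. It is an illustration under stated assumptions, not the procedure of [6]: the function names are hypothetical, the relation d = H - 1/2 between the differencing level and the Hurst parameter is the standard FARIMA convention rather than something stated in this text, and the filter is simply truncated to the available history.

```python
def d_from_hurst(hurst):
    """Assumed standard FARIMA relation between Hurst parameter H and d."""
    return hurst - 0.5


def frac_diff_weights(d, n_terms):
    """First n_terms coefficients of the expansion of (1 - B)^d.

    Uses the binomial recursion w_0 = 1, w_k = w_{k-1} * (k - 1 - d) / k,
    where B is the backshift operator.
    """
    weights = [1.0]
    for k in range(1, n_terms):
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights


def frac_diff(series, d):
    """Apply the truncated filter (1 - B)^d to a series.

    At each time t, only the values series[0..t] are used, so early
    outputs are based on a short history.
    """
    weights = frac_diff_weights(d, len(series))
    return [
        sum(weights[j] * series[t - j] for j in range(t + 1))
        for t in range(len(series))
    ]


if __name__ == "__main__":
    trace = [1.0, 3.0, 6.0, 10.0]  # made-up values, not real traffic data
    print(frac_diff(trace, d_from_hurst(0.8)))
```

After this filtering step, the remaining series would be handed to an ordinary ARMA fitting routine, which is the part the text describes as less computationally demanding.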

Several well-known prediction models face a few crucial problems. For example, the grey model (GM) [7] suffers from the overshoot problem, which induces large residual errors around turning-point regions of the time series during prediction. Auto-regressive moving-average (ARMA) [8] cannot fit irregular or non-periodic time series very well because it lacks a dynamic learning mechanism; the back-propagation artificial neural network (BPANN) [9] may over-fit or under-fit because of inappropriate parameters (weights) chosen after training; and the adaptive neuro-fuzzy inference system (ANFIS) [10] model cannot avoid volatility clustering [11], an effect that considerably deteriorates predictive accuracy for non-periodic short-term prediction. It is noted that ANFIS can possibly overcome the over-fitting problem and is tuned/trained automatically by itself. Therefore, in this paper, a nonlinear generalized autoregressive conditional heteroscedasticity (NGARCH) [12] model is incorporated into the ANFIS system to tackle the overshoot and volatility clustering effects at the same time during single-step-look-ahead prediction. The proposed composite model ANFIS/NGARCH is tuned optimally by quantum minimization (QM) [13] to form a linear combination of both models in such a way that it not only simplifies the complex system in practice, but also improves predictive accuracy significantly by resolving the problems of overshoot and volatility clustering simultaneously. In short, in order to manage web resources effectively and efficiently, a highly accurate prediction can be applied to foreseeing the in-flow and out-flow of data packets as part of real-time intelligence network traffic analysis. Web traffic analysis provides valuable information for website administrators to customize the information that is hosted on their web servers so as to reach a larger audience.

In the following sections, a novel hybrid (ANFIS/NGARCH) applied to forecasting time series is introduced first as one of the core components of network traffic analysis. Next, quantum minimization is used to optimally tune the composite model ANFIS/NGARCH. Third, two experiments are presented to verify the highly accurate prediction achieved by the proposed model for the inflow and outflow of data packets around websites. Finally, website traffic monitoring with and without foreseeing the packet flow around websites is compared.
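The idea of forming a linear combination of two models' outputs can be illustrated with a small Python sketch. This is not the quantum minimization procedure of the paper: it swaps in an ordinary closed-form least-squares fit, and it assumes a single weight w shared between the two forecasts (w for the first, 1 - w for the second). The function names and the sample numbers are hypothetical and serve only to show the mechanics.

```python
def combination_weight(pred_a, pred_b, actual):
    """Least-squares weight w for the blend w * pred_a + (1 - w) * pred_b.

    Classical stand-in for the tuning step; the paper uses quantum
    minimization instead, which is not reproduced here.
    """
    num = sum((a - b) * (y - b) for a, b, y in zip(pred_a, pred_b, actual))
    den = sum((a - b) ** 2 for a, b in zip(pred_a, pred_b))
    if den == 0.0:
        # Both forecasts are identical, so every weight gives the same blend.
        return 0.5
    return num / den


def combine(pred_a, pred_b, w):
    """Blend two forecast series with weight w on the first one."""
    return [w * a + (1.0 - w) * b for a, b in zip(pred_a, pred_b)]


def mean_squared_error(pred, actual):
    """Average squared difference between forecast and observation."""
    return sum((p - y) ** 2 for p, y in zip(pred, actual)) / len(actual)


if __name__ == "__main__":
    # Made-up forecasts from two hypothetical models and made-up observations.
    model_a = [1.0, 2.0, 4.0]
    model_b = [2.0, 2.0, 1.0]
    observed = [1.6, 2.1, 2.0]
    w = combination_weight(model_a, model_b, observed)
    blended = combine(model_a, model_b, w)
    print(w, mean_squared_error(blended, observed))
```

By construction, the fitted blend can never have a larger in-sample squared error than either forecast alone, since w = 1 and w = 0 are among the candidates; whether that carries over to unseen data depends on the models and the series.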

Section snippets

Volatility clustering problem

A certain characteristic commonly associated with financial time series is called volatility clustering, in which large changes tend to follow large changes, and small changes tend to follow small changes. In either case, the changes from one period to the next are typically of unpredictable sign. Volatility clustering, or persistence, suggests a time series model where successive disturbances, even if uncorrelated, are yet serially dependent. Thus large disturbances, either positive or

Quantum minimization—an optimum search algorithm

In the quantum-based representation [13], the operator $|\,\rangle$ is defined as Dirac's conventional “ket” notation used for the qbits $\psi_i$ such that a quantum state $\psi$ is represented by superposed states $|\psi\rangle = \sum_{i=1}^{N} \omega_i |\psi_i\rangle$, where the probability amplitude $\omega_i$ must satisfy $\sum_{i=1}^{N} \|\omega_i\|^2 = 1$. Quantum-based minimization, which completes the optimization task with a probability of success of at least 1/2 within an unsorted database, is realized by the quantum minimum searching algorithm [20]. A quantum exponential

QM tuning composite model ANFIS/NGARCH

A single-step-look-ahead prediction can be implemented by adding a variation to the current observed datum [10], and the variation is defined as a backward difference as follows.

$$\hat{o}(k+1) = o(k) + \delta\hat{o}(k+1)$$

$$\delta\hat{o}(k+1) = f\big(o(k), o(k-1), \ldots, o(k-s), \delta o(k), \delta o(k-1), \ldots, \delta o(k-s)\big)$$

$$\delta o(k) = o(k) - o(k-1)$$

where $\hat{o}(k+1)$, $o(k)$, $\delta\hat{o}(k+1)$, and $\delta o(k)$ stand for the predicted output at next period, the current true observed datum, the predicted variation at next period, and the current variation, respectively.
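The prediction scheme above can be sketched in a few lines of Python. The function f is not specified in this excerpt, so the sketch plugs in a deliberately simple, hypothetical stand-in (the mean of the most recent variations); in the paper this role is played by the composite model. All function names here are assumptions made for illustration.

```python
def backward_differences(observations):
    """Variations o(k) - o(k-1) for every k that has a predecessor."""
    return [
        observations[k] - observations[k - 1]
        for k in range(1, len(observations))
    ]


def mean_variation(recent_obs, recent_diffs):
    """Hypothetical stand-in for f: average of the recent variations.

    recent_obs is accepted to match the arguments of f but is unused here.
    """
    return sum(recent_diffs) / len(recent_diffs)


def predict_next(observations, s, f=mean_variation):
    """Single-step look-ahead: predicted o(k+1) = o(k) + predicted variation.

    The predicted variation is f applied to the last s + 1 observations
    and the last s + 1 backward differences, which needs s + 2 data points.
    """
    if len(observations) < s + 2:
        raise ValueError("need at least s + 2 observations")
    diffs = backward_differences(observations)
    recent_obs = observations[-(s + 1):]
    recent_diffs = diffs[-(s + 1):]
    return observations[-1] + f(recent_obs, recent_diffs)


if __name__ == "__main__":
    packets = [10.0, 12.0, 15.0, 19.0]  # made-up packet counts
    print(predict_next(packets, s=1))
```

Passing a different callable as `f` is enough to try another variation model without changing the surrounding bookkeeping.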

We formulate a function of

Criteria for measuring accuracy

A critical point in time series prediction is what criterion can be used to measure the accuracy of the predicted results. Because of the wide applications of time series prediction, the question of whether accuracy is “good enough” is largely dependent on user-specified criteria. Measurement of forecasting performance is highly dependent on how rigidly the criteria are specified for measurement of the degree of accuracy. In order to justify reasonable accuracy for a time series prediction, two

Conclusions

Network traffic analysis provides valuable information for website administrators to customize the information that is hosted on their web servers so as to reach a larger audience. Thus, in order to manage web resources effectively and efficiently, a highly accurate prediction can be applied to foreseeing the inflow and outflow of data packets as part of real-time intelligence network traffic analysis. In this paper, a novel hybrid (QM-ANFIS/NGARCH) is set up from a forecasting point of view to

Acknowledgements

This work is fully supported by the National Science Council, Taiwan, Republic of China, under grant number NSC 93-2218-E-143-001.

References (27)

  • A. Sang et al., A predictability analysis of network traffic, Computer Networks (2002)
  • L. Hentschel, All in the family: nesting symmetric and asymmetric GARCH models, Journal of Financial Economics (1995)
  • T.A. Funkhouser et al., Management of large amounts of data in interactive building walkthroughs
  • S. Aissi et al., E-business process modeling: the next big step, IEEE Computer (2002)
  • T. Tuan et al., Congestion Control for Self-Similar Network Traffic, Self-Similar Network Traffic and Performance Evaluation (2000)
  • J.K. Liu et al., Traffic modeling based on FARIMA models
  • Y.-T. Shu et al., Internet traffic modeling and prediction using FARIMA models, Chinese Journal of Computers (2001)
  • B.R. Chang, Hybrid BPNN-weighted grey-CLMS forecasting, Journal of Information Science and Engineering (2005)
  • G.E.P. Box et al., Time Series Analysis: Forecasting & Control (1994)
  • S. Haykin, Neural Networks (1999)
  • J.-S.R. Jang, ANFIS: adaptive-network-based fuzzy inference systems, IEEE Transactions on Systems, Man, and Cybernetics (1993)
  • C. Gourieroux et al., Financial Applications (1997)
  • T. Bollerslev, Generalized autoregressive conditional heteroscedasticity, Journal of Econometrics (1986)
    1

    Tel.: +886 7 6158000x3315; fax: +886 7 6158001.
