Elsevier

Neurocomputing

Volume 157, 1 June 2015, Pages 243-255

Vessel traffic flow forecasting by RSVR with chaotic cloud simulated annealing genetic algorithm and KPCA

https://doi.org/10.1016/j.neucom.2015.01.010

Abstract

Vessel traffic flow prediction is complicated: its accuracy is influenced by uncertain socio-economic factors and, in particular, by singular points in the statistical data. Recently, the robust v-support vector regression (RSVR) model has been successfully employed to solve nonlinear regression and time-series problems that contain singular points. This paper first proposes a novel hybrid algorithm, the chaotic cloud simulated annealing genetic algorithm (CcatCSAGA), to optimize the parameters of RSVR and thereby improve the performance of vessel traffic flow prediction. The proposed CcatCSAGA employs cat mapping to carefully expand the variable search space and avoid premature convergence to a local optimum, and uses the cloud model to search efficiently for a better solution in a small neighborhood of the current optimal solution, which improves search efficiency. Second, the kernel principal component analysis (KPCA) algorithm is adopted to determine the final input vectors from the candidate input variables. Finally, a numerical example using vessel traffic flow and its influencing-factor data from Tianjin is employed to test the forecasting performance of the proposed KRSVR-CcatCSAGA model.

Introduction

Vessel traffic flow prediction is fundamental to the development planning of national shipping and to regional economic coordination. It also provides important guidance for port layout planning, the cooperative development of the regional shipping economy, the construction and renovation of waterways, and the layout planning of national water rescue.

Numerous forecasting approaches have been developed for vessel traffic flow prediction. Among conventional quantitative approaches, autoregressive integrated moving average (ARIMA) models [1], their improved variants (more complicated but more accurate) [2], and the Kalman filtering model [3] are the most popular and practical time series forecasting models. They are often applied when the data are inadequate to construct an econometric model, or when knowledge of the structure of the forecasting model is limited. Time series models are computationally simple and fast, and are likely to outperform other models in some cases, especially in short-term forecasting [4], [5]. However, time series models do not reflect other factors related to the predicted series; hence, they fail to produce accurate forecasts when the predicted series is strongly affected by such factors.

An artificial neural network (ANN) is primarily based on a model that emulates the processing of the human neurological system to identify related spatial and temporal characteristics in historical data patterns (particularly for nonlinear and dynamic evolutions); therefore, it can approximate any level of complexity and does not need prior knowledge of the problem. Since vessel traffic flow prediction is too complex to be solved by a single linear statistical algorithm, ANNs should be considered as an alternative for the vessel traffic flow forecasting problem. Owing to this ability, ANN models have been widely applied in traffic flow forecasting [6], [7], [8]. Although ANN-based forecasting models can approximate any function, particularly nonlinear functions, they have difficulty with the non-convexity of the network training error and with explaining their black-box operation, and they are easily trapped in local minima [9], [10]. ANN models also have time-consuming training procedures and involve subjectivity in selecting the network architecture [11]. Additionally, training an ANN model requires a large number of training samples, while the data available on vessel traffic flow and its related impact indicators are limited.

With the structural risk minimization criterion, support vector regression (SVR) overcomes the inherent defects of the ANN model [12]. SVR-based models [13] have been widely employed to achieve higher forecasting accuracy in many fields, such as electric load forecasting [14], [15], [16], [17], [18], [19], [20], atmospheric science forecasting [21], [22], [23], [24], financial time series forecasting [25], [26], [27], [28], [29], tourist arrival prediction [30], [31] and traffic flow prediction [32], [33], [34].

These studies all indicate that the selection of the parameters of an SVR model plays a critical role in improving forecasting accuracy [35]. Although some recommendations on the appropriate setting of SVR parameters have been given in the literature [36], these suggestions do not simultaneously consider the interactive effects among the parameters. To obtain more appropriate SVR parameters, the authors have conducted a series of systematic studies [37], [38], [39], [40], [41] that apply various evolutionary algorithms (such as particle swarm optimization and genetic algorithms) to determine suitable parameter values. In these studies, all SVR models with parameters determined by evolutionary algorithms are superior to the competing forecasting models (ARIMA, ANNs, etc.). However, analysis of the results shows that almost all of the employed algorithms have theoretical drawbacks, such as declining population diversity, time-consuming convergence, and a tendency to fall into local optima. Therefore, the authors began further trials that hybridize chaotic sequences and the cloud model with evolutionary algorithms to overcome these shortcomings [42], [43]. To continue testing the advantages of this hybridization, this paper uses the chaotic sequence and the cloud model to improve the simulated annealing genetic algorithm (SAGA) and thereby obtain more appropriate SVR parameters.

The genetic algorithm (GA) is a simulated-evolution optimization algorithm in which new individuals are generated by selection, crossover, and mutation operators. Based on a special binary coding process, GA is able to solve some problems that are not easily solved by traditional algorithms. Therefore, it has been widely used in function optimization, neural network training, pattern classification, fuzzy control systems and other engineering fields [44]. Previous work [45] shows that GA can empirically obtain a few best-fitted offspring from the whole population; however, because population diversity decreases after some generations, it may converge prematurely. Simulated annealing (SA) is a generic probabilistic search technique that simulates the physical process of heating a material and then cooling it in a controlled way [46]. In each step, SA attempts to replace the current state by a random move. The new state is accepted with a probability that depends on both the temperature and the difference between the corresponding function values. Thus, SA is able to reach better solutions [47]. However, as noted in a previous paper [48], the annealing process of SA is time consuming. To shorten the search time and mitigate premature convergence, an effective approach is needed to overcome these drawbacks of GA and SA. The hybridization of GA with SA (SAGA) is an innovative attempt that employs the superior capability of SA to produce better solutions and applies the mutation process of GA to reduce the convergence time. Owing to the ergodicity of the cat mapping and the randomness and stable orientation of the cloud model, the authors have employed the cat mapping and the cloud model to improve particle swarm optimization (PSO) [42]. The application results show that introducing the chaotic sequence and the cloud model enriches population diversity and reduces the convergence time.
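
The acceptance step of SA described above can be illustrated with a minimal sketch. It assumes the standard Metropolis criterion for a minimization problem, which the text does not confirm as the exact rule used in this paper; the function names and the `rng` parameter are hypothetical.

```python
import math
import random


def acceptance_probability(current_cost, candidate_cost, temperature):
    """Probability of accepting a candidate state (assumed Metropolis rule).

    temperature must be positive. Improvements are always accepted;
    worse states are accepted with probability exp(-delta / temperature).
    """
    delta = candidate_cost - current_cost
    if delta <= 0:
        return 1.0
    return math.exp(-delta / temperature)


def accept_move(current_cost, candidate_cost, temperature, rng=random):
    """Draw a uniform random number and decide whether to take the move."""
    p = acceptance_probability(current_cost, candidate_cost, temperature)
    return rng.random() < p
```

At a high temperature a worse state is accepted almost as readily as a better one, which helps the search leave local optima; as the temperature falls, the rule approaches a pure descent.
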
To continue testing the advantages of hybridizing chaotic sequences and the cloud model with evolutionary algorithms, to improve the search performance of SAGA, and to obtain more appropriate parameters, this paper uses the chaotic sequence and the cloud model to modify SAGA: the chaotic sequence carefully expands the variable search space and lets the variables travel ergodically over it, while the cloud model efficiently searches for a better solution in a small neighborhood of the current optimal solution. The resulting novel hybrid algorithm, the chaotic cloud simulated annealing genetic algorithm (CcatCSAGA), is expected to yield a more suitable parameter combination for the SVR model.
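
To make the role of the chaotic sequence concrete, the sketch below generates points with the classical two-dimensional Arnold cat map and scales one coordinate into a parameter range. The exact form of the cat mapping used in CcatCSAGA is not given in this excerpt, so the map equations, the function names and the parameter bounds are assumptions for illustration only.

```python
def cat_map_sequence(x0, y0, n):
    """Return n successive points of the Arnold cat map on the unit square.

    Assumed form: x' = (x + y) mod 1, y' = (x + 2y) mod 1.
    """
    points = []
    x, y = x0, y0
    for _ in range(n):
        # Both updates use the previous (x, y) pair.
        x, y = (x + y) % 1.0, (x + 2.0 * y) % 1.0
        points.append((x, y))
    return points


def scale_to_range(u, low, high):
    """Map a chaotic variable u in [0, 1) to a search interval [low, high)."""
    return low + u * (high - low)


# Hypothetical usage: candidate values for one SVR parameter in [0.01, 100].
candidates = [scale_to_range(x, 0.01, 100.0)
              for x, _ in cat_map_sequence(0.1, 0.2, 10)]
```

Because the chaotic variable visits the unit interval without settling, scaling it to the parameter bounds gives candidate values spread over the whole search range rather than clustered around the current population.
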

Mixed noise in the forecast sequence data and in the data on related impact factors strongly affects the final prediction results, especially for a sensitive SVR model. To account for mixed noise with normally distributed components, high-amplitude values and singular points in these data, a robust loss function [49] is designed, yielding a new support vector regression model (RSVR). Application results show that the RSVR model can effectively suppress noise and lead to better prediction results [48]. Therefore, to handle the mixed noise in the vessel traffic flow sequence and its related impact factors, this paper adopts the RSVR model to improve the robustness and accuracy of vessel traffic flow prediction.

The prediction of vessel traffic flow is a complex nonlinear dynamic procedure. Vessel traffic flow is affected by numerous factors, such as gross domestic product, per capita gross national product, total imports and exports, and passenger throughput. To ensure forecasting accuracy, related factors should be selected as the input vector of the RSVR model. However, a high-dimensional input vector not only increases the computing time but also reduces the forecasting accuracy of the RSVR model, so it is necessary to reduce the dimension of the relevant factors. KPCA, one of the classic dimension reduction methods, is a nonlinear extension of linear PCA: it implicitly maps the input space to a high-dimensional feature space and performs PCA in that feature space, thereby extracting principal components nonlinearly. KPCA has received intense attention and has been widely used in many fields, such as face recognition [50], image recognition [51], and spectral dimension reduction analysis [52]. To save computing time and improve prediction accuracy, this paper employs KPCA to analyze the generation mechanism of vessel traffic flow and determine the input vector of the forecasting model.
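
The idea of performing PCA in an implicit feature space can be sketched in a few lines of standard-library Python. This is a simplified illustration, not the paper's procedure: the Gaussian (RBF) kernel, the value of `gamma`, the use of power iteration to extract only the leading component, and all function names are assumptions.

```python
import math


def rbf_kernel(a, b, gamma):
    """Gaussian (RBF) kernel between two samples (assumed kernel choice)."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))


def centered_kernel_matrix(samples, gamma):
    """Kernel matrix centered in feature space: K - 1K - K1 + 1K1."""
    n = len(samples)
    k = [[rbf_kernel(a, b, gamma) for b in samples] for a in samples]
    row_mean = [sum(row) / n for row in k]  # equals column means, K is symmetric
    total_mean = sum(row_mean) / n
    return [[k[i][j] - row_mean[i] - row_mean[j] + total_mean
             for j in range(n)] for i in range(n)]


def first_kernel_component(samples, gamma=0.1, iterations=500):
    """Return (eigenvalue, scores) of the leading kernel principal component."""
    kc = centered_kernel_matrix(samples, gamma)
    n = len(samples)
    v = [float(i + 1) for i in range(n)]  # deterministic, non-constant start
    eigenvalue = 0.0
    for _ in range(iterations):  # power iteration on the centered matrix
        w = [sum(kc[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]
        eigenvalue = norm
    # Projection of sample i onto the component equals sqrt(eigenvalue) * v_i.
    scores = [math.sqrt(eigenvalue) * c for c in v]
    return eigenvalue, scores
```

The scores of the leading components would then replace the original candidate factors as model inputs; a full implementation would extract several components and choose how many to keep from their cumulative contribution.
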

Consequently, the KRSVR-CcatCSAGA model, which hybridizes RSVR with CcatCSAGA and the KPCA algorithm, is established to enhance the forecasting accuracy of vessel traffic flow. Five competing forecasting approaches, namely the KSVR-CcatCSAGA, PRSVR-CcatCSAGA, KRSVR-ClogisticSAGA, KGRNN and ARIMA models, are employed for comparison of forecasting accuracy.

The rest of this paper is organized as follows. Section 2 presents the fundamental principle and formulation of RSVR, the principle and calculation steps of KPCA, which is employed to determine the input vector of the RSVR model, and the CcatCSAGA, which is used to select the parameters of the SVR model. Section 3 describes the proposed KRSVR-CcatCSAGA vessel traffic flow forecasting scheme. A numerical example is provided in Section 4. The conclusions are given in Section 5.

Section snippets

Robust support vector regression (RSVR)

Suppose the training set is $T=\{(x_1,y_1),\ldots,(x_i,y_i),\ldots,(x_N,y_N)\}$, where $x_i\in R^n$ is the n-dimensional input variable and $y_i\in R$ is the corresponding output value, $i=1,2,\ldots,N$. Through a nonlinear mapping function, $\phi(x)=\{\phi(x_1),\phi(x_2),\ldots,\phi(x_i)\}$, the SVR model maps the samples into a high-dimensional feature space, $R^m$, in which the optimal decision function is constructed as follows,

$$f(x)=\omega\cdot\phi(x)+b$$

where $\omega$ and $\phi(x)$ are m-dimensional vectors, $b\in R$ is the threshold, and "·" is the dot product in the feature space. SVR uses structural
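
As a small illustration of the decision function f(x) = ω·ϕ(x) + b, the sketch below evaluates it with an explicit feature map. The map `phi`, the weights `omega` and the threshold `b` are hypothetical values chosen for the example; in an actual SVR they result from training, and the mapping is usually implicit through a kernel.

```python
def phi(x):
    """Hypothetical feature map from R^2 to R^3: adds an interaction term."""
    return [x[0], x[1], x[0] * x[1]]


def decision_function(x, omega, b):
    """Evaluate f(x) = omega . phi(x) + b with an explicit dot product."""
    return sum(w * z for w, z in zip(omega, phi(x))) + b


# Hypothetical weights and threshold, not taken from the paper.
omega = [1.0, -2.0, 0.5]
b = 0.25
prediction = decision_function([2.0, 3.0], omega, b)
```
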

The proposed vessel traffic flow forecasting scheme

This paper establishes the KRSVR-CcatCSAGA vessel traffic flow forecasting scheme, which integrates the RSVR model, CcatCSAGA and the KPCA algorithm. The flowchart of the proposed scheme is shown in Fig. 2. As shown there, the scheme consists of four stages.

The first stage is to determine the factors that influence the vessel traffic flow. Based on correlation analysis, the vessel traffic flow is highly related to numerous factors, such as gross domestic product (x1

Datasets and performance criteria

The vessel traffic flow sequence and its corresponding socio-economic impact indicator data, obtained from the yearbook of Shanghai and the yearbook of China ports, are used to evaluate the performance of the proposed vessel traffic flow forecasting scheme. The data collection period is from 1990 to 2013, giving 24 data points in total. The original datasets standardized by z-score
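
Z-score standardization can be sketched as follows. The excerpt does not say whether the population or the sample standard deviation is used, so the population form is an assumption here, and the input values are made-up numbers rather than data from the yearbooks.

```python
import statistics


def z_score(values):
    """Standardize a series to zero mean and unit (population) deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # assumption: population standard deviation
    if std == 0:
        raise ValueError("a constant series cannot be standardized")
    return [(v - mean) / std for v in values]


# Made-up annual values, for illustration only.
raw = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
standardized = z_score(raw)
```

Standardizing each indicator separately puts variables with very different units, such as GDP and passenger throughput, on a comparable scale before they enter KPCA and the regression model.
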

Conclusion

An accurate vessel traffic flow forecast is of great significance for enhancing navigation capacity, optimizing the allocation of port resources, reducing shipping traffic accidents and improving navigation safety. Thus, it is worth improving the forecasting precision of vessel traffic flow. This paper proposes a KRSVR-CcatCSAGA scheme to predict vessel traffic flow, combining the RSVR model, CcatCSAGA and the KPCA algorithm. The empirical study based on Tianjin statistical data is used to test the

Acknowledgements

This work is supported by the Fundamental Research Funds for the Central Universities (HEUCF140108), the Science and Technology Project of Western Transportation Construction of the Ministry of Communications (2014364554050), and Financial Assistance under the Heilongjiang Postdoctoral Fund (LBH-Z14059).

Mingwei Li was born in 1984. He received his doctoral degree in Engineering from Dalian University of Technology in 2013. Since September 2013, he has been with the College of Shipbuilding Engineering of Harbin Engineering University, where he is currently a lecturer. His research interests mainly include port policy and digital, applications of forecasting technology, computational intelligence and support vector forecasting.

References (62)

  • L. Cao, Support vector machine experts for time series forecasting, Neurocomputing (2003)
  • W. Huang et al., Forecasting stock market movement direction with support vector machine, Comput. Oper. Res. (2005)
  • P.F. Pai et al., A hybrid ARIMA and support vector machines model in stock price forecasting, Omega (2005)
  • W.C. Hong et al., SVR with hybrid chaotic genetic algorithms for tourism demand forecasting, Appl. Soft Comput. (2011)
  • P.F. Pai et al., An improved neural network model in forecasting arrivals, Ann. Tourism Res. (2005)
  • W.C. Hong, Traffic flow forecasting by seasonal SVR with chaotic simulated annealing algorithm, Neurocomputing (2011)
  • W.C. Hong et al., Hybrid evolutionary algorithms in a SVR traffic flow forecasting model, Appl. Math. Comput. (2011)
  • W.C. Hong et al., Forecasting urban traffic flow by SVR with continuous ACO, Appl. Math. Model. (2011)
  • P.F. Pai et al., Forecasting regional electric load based on recurrent support vector machines with genetic algorithms, Electr. Power Syst. Res. (2005)
  • V. Cherkassky et al., Practical selection of SVR parameters and noise estimation for SVM regression, Neural Networks (2004)
  • J. Geng et al., Port throughput forecasting by MARS-RSVR with chaotic simulated annealing particle swarm optimization algorithm, Neurocomputing (2015)
  • M.W. Li et al., Urban traffic flow forecasting using Gauss-SVR with cat mapping, cloud model and PSO hybrid algorithm, Neurocomputing (2013)
  • A.R. Teixeira et al., KPCA denoising and the pre-image problem revisited, Digital Signal Process. (2008)
  • Y. Kao et al., A hybrid genetic algorithm and particle swarm optimization for multimodal functions, Appl. Soft Comput. (2008)
  • S.F. Crone et al., The impact of preprocessing on data mining: An evaluation of classifier sensitivity in direct marketing, Eur. J. Oper. Res. (2006)
  • W.C. Hong et al., Taiwanese 3G mobile phone demand forecasting by SVR with hybrid evolutionary algorithms, Expert Syst. Appl. (2010)
  • W.Y. Zhang et al., Application of SVR with chaotic GASA algorithm in cyclic electric load forecasting, Energy (2012)
  • G.E.P. Box et al., Time Series Analysis: Forecasting and Control (1976)
  • C. Han et al., A real-time short-term traffic flow adaptive forecasting method based on ARIMA model, J. Syst. Simul. (2004)

    Duanfeng Han was born in 1966. He received his doctoral degree in Engineering from Harbin Engineering University. He is currently a professor in and the dean of the College of Shipbuilding Engineering of Harbin Engineering University, a member of the academic committee of Harbin Engineering University, secretary-general of the Shipbuilding Society of Heilongjiang Province, and director of the National Engineering Laboratory of Digital Shipbuilding. His research interests are ship design and manufacture, ship maneuvering simulation systems, key technologies of manned submersibles, ship resistance performance and computational intelligence.

    Wen-long Wang was born in 1991. He received his bachelor's degree in Naval Architecture & Ocean Engineering Design and Manufacturing from Harbin Engineering University in 2010, and he is working toward a master's degree at the same university. His research interests are intelligent algorithms, modeling and simulation optimization, and logistics optimization.
