Elsevier

Neurocomputing

Volume 55, Issues 1–2, September 2003, Pages 285-305
Volatility forecasting from multiscale and high-dimensional market data

https://doi.org/10.1016/S0925-2312(03)00381-3

Abstract

Advantages and limitations of the existing volatility models for forecasting foreign-exchange and stock market volatility from multiscale and high-dimensional data have been identified. Support vector machines (SVM) have been proposed as a complementary volatility model that is capable of effectively extracting information from multiscale and high-dimensional market data. SVM-based models can handle both long memory and multiscale effects of inhomogeneous markets without the restrictive assumptions and approximations required by other models. Preliminary results with foreign-exchange data suggest that SVM can work effectively with high-dimensional inputs to account for volatility long-memory and multiscale effects. Advantages of the SVM-based models are expected to be of the utmost importance in the emerging field of high-frequency finance and in multivariate models for portfolio risk management.

Introduction

The predictive capabilities of data-driven models of systems with complex multiscale dynamics depend on the quality and amount of the available data and on the algorithm used to extract generalized mappings. The availability of real-time, high-resolution data is constantly increasing in many fields of practical interest. However, the majority of advanced statistical and machine learning algorithms, including neural networks (NN), can encounter a set of problems called the “curse of dimensionality” when applied to high-dimensional data [4]. Nonstationarity of the system can also impose significant limitations on the size of a training set, which leads to poor generalization ability of the model.

A very promising algorithm that can tolerate high-dimensional and incomplete data is the support vector machine (SVM) [43], [44]. SVMs have recently been receiving significant interest due to excellent results in various applications [10]. SVM combines the training efficiency and simplicity of linear algorithms with the accuracy of the best nonlinear techniques and a systematic approach to optimal generalization. In many practical applications SVMs can tolerate high-dimensional and/or incomplete data and often demonstrate performance superior to the best available techniques, including NNs [10]. Note that in this article the majority of comparative references to NNs imply multilayer perceptron (MLP) or similar algorithms and architectures [4], [33]. Recent successful applications of SVM-based adaptive systems include image/object classification [32], face detection and recognition [30], text categorization [22], process identification in high-energy physics [42], cancer diagnosis and prognosis [26], gene classification [8], as well as many other scientific, engineering, medical, and biological applications.

Recently we have also applied SVM to a challenging problem of real-time space weather forecasting [20]. It has been shown that the performance of the SVM-based model for geomagnetic substorm prediction can be comparable (or superior) to that of the best existing models including NNs [19]. The advantages of the SVM-based techniques are expected to be much more pronounced in the next generation of the space-weather forecasting models, which will incorporate many types of high-dimensional, multiscale input data once real-time availability of this information becomes technologically feasible.

Financial time-series forecasting is another challenging area where advantages of the SVM-based systems could be very important. Although some financial applications of the SVM have been reported [13], [16], the full range of potential SVM applications in finance remains largely unexplored. For example, there are no comprehensive studies of the SVM applications to volatility forecasting from multiscale and high-dimensional market data. An exception is a recent work by Van Gestel et al. [41], where a new SVM formulation is introduced and applied to financial time series. Encouraging results of the volatility modeling of the daily DAX30 closing prices have been reported [41].

Volatility of the foreign exchange and stock markets is a very important quantity for option pricing, value-at-risk (VaR) calculations used in portfolio risk management, and for general decision making in real-time trading systems. The empirically confirmed existence of volatility long memory (up to several months) may require high-dimensional inputs in the volatility models. The multivariate structure of the volatility and covariance models used in portfolio risk management further increases the dimensionality of the model. The emerging field of high-frequency finance [11], dealing with multiscale market data (from several minutes to several months), imposes even more demanding requirements on the dimensionality of the multiscale volatility models.

In this paper we review stylized facts and features of the market data and existing volatility models. Limitations of the known volatility models, especially in their ability to handle the long memory and multiscale nature of the market data, are identified. An SVM-based system is proposed as a complementary multiscale volatility model. Advantages and potential applications of the new model are discussed. Encouraging preliminary results of the SVM model application to volatility forecasting of the foreign exchange market are reported. Although only foreign exchange market examples are considered in this paper, almost all of the discussion is relevant to stock market data as well. When necessary, specific differences between stock and foreign exchange markets are mentioned.

Section snippets

Data description: stylized facts of financial data

In this section we define the main measures used to characterize financial time series and describe their universal properties revealed in numerous empirical studies. A typical daily $US/DM exchange rate that will be used in this paper is shown in Fig. 1a. Nonstationarity of the moving average of the time series is clear from this figure. The more practical quantity is the logarithm of return given by \(r_i = \ln(X_i / X_{i-1})\), where \(i\) is an index of a homogeneous time sequence (e.g., the end of each
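The log-return definition above can be sketched in Python as follows. This is only an illustration: the function name `log_returns` and the sample values are hypothetical, and X is assumed here to be the exchange rate observed on a homogeneous time grid; none of the numbers come from the paper's data.

```python
import math


def log_returns(prices):
    """Return r_i = ln(X_i / X_{i-1}) for i = 1..n-1.

    `prices` is a sequence of strictly positive values sampled on a
    homogeneous time grid (for example, daily closing rates).
    """
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be strictly positive")
    return [math.log(cur / prev) for prev, cur in zip(prices, prices[1:])]


# Illustrative (made-up) exchange-rate values, not data from the paper.
sample_prices = [1.50, 1.53, 1.5147, 1.5147]
returns = log_returns(sample_prices)
```

Log returns are additive, so the sum of the entries of `returns` equals the logarithm of the ratio of the last price to the first one.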

Existing volatility models and their limitations

There are two general classes of volatility models in widespread use: deterministic and stochastic models. Deterministic models consider volatility (conditional variance) to be a deterministic function of the past returns (and/or other observables) that are described by some stochastic process (e.g., Wiener process). Stochastic volatility models describe volatility by its own stochastic process. Below we give a short overview of the mentioned volatility models and their limitations. Less

Multiscale volatility models for heterogeneous market

One of the most significant limitations of the existing ARCH-type and similar deterministic volatility models is their inability to capture the heterogeneity of traders acting at different time horizons. For example, if the empirical data can be described as generated by one GARCH process at one particular data frequency, the dynamics of the data sampled at any other frequency is theoretically determined by temporal aggregation (or disaggregation) of the original process. These derived
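To make the aggregation argument more concrete, the sketch below shows a GARCH(1,1) conditional-variance recursion and the aggregation of log returns to a lower sampling frequency. GARCH(1,1) is used here only as a common example of an ARCH-type model; the function names, the parameter values, and the choice of the initial variance are assumptions made for illustration and are not the specification used in the paper.

```python
def garch11_variance(returns, omega, alpha, beta, initial_variance):
    """Conditional variances of a GARCH(1,1) model.

    sigma2[t] = omega + alpha * returns[t-1]**2 + beta * sigma2[t-1],
    with sigma2[0] = initial_variance (an assumed starting value).
    """
    sigma2 = [initial_variance]
    for r in returns[:-1]:
        sigma2.append(omega + alpha * r * r + beta * sigma2[-1])
    return sigma2


def aggregate_returns(returns, k):
    """Sum non-overlapping blocks of k log returns (lower frequency).

    Log returns are additive, so the k-period return is the sum of the
    k one-period returns. A trailing incomplete block is dropped.
    """
    n_blocks = len(returns) // k
    return [sum(returns[i * k:(i + 1) * k]) for i in range(n_blocks)]


# Hypothetical parameter values and returns, for illustration only.
daily = [0.01, -0.02, 0.005, 0.0, 0.015, -0.01]
var_daily = garch11_variance(daily, omega=1e-6, alpha=0.1, beta=0.85,
                             initial_variance=1e-4)
two_day = aggregate_returns(daily, k=2)
```

Here `var_daily` holds one conditional variance per daily return, and `two_day` is the series that a model working at the two-day frequency would observe.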

Multiscale volatility model based on support vector machine

SVMs developed by Vapnik [43], [44] have recently been receiving significant interest due to excellent results in various applications [10]. We do not intend to give a detailed introduction to SVM in this paper and refer readers to excellent books and papers on this topic (e.g., see [10], and references therein). Also we provided a short introduction to the main ideas used in SVM in our recent paper [20]. Here we give only a brief description of the SVM and its main advantages.

SVM is a

Results

Building a full-featured SVM-based volatility model that can be useful in a real trading infrastructure is beyond the scope of this paper. An extensive comparison of the SVM model with other available volatility models is therefore left to our future articles. Here we illustrate the ability of the SVM-based volatility model to handle the challenge of the long-memory and multiscale effects of real market data and present a preliminary comparison with two basic models.

As an example we still use

Discussion and conclusion

In this paper, we addressed the problem of volatility forecasting from high-dimensional and multiscale market data. An SVM-based model was proposed as a possible complementary approach to volatility forecasting. SVM combines the learning effectiveness of linear machines with the classification/regression power of the best nonlinear algorithms. Unlike typical nonlinear techniques such as NNs, the size of the SVM input space is decoupled from the number of free parameters and allows one to process

Acknowledgements

This work is supported by Science Applications International Corporation. We thank all referees who evaluated this paper for their valuable comments and suggestions.

References (45)

  • C.-C. Chang et al., The analysis of decomposition methods for support vector machines, IEEE Trans. Neural Networks (2000)
  • N. Cristianini et al., Introduction to Support Vector Machines and other Kernel-Based Learning Methods (2000)
  • M.M. Dacorogna et al., An Introduction to High-Frequency Finance (2001)
  • D. Edelman, Enforced-denial support vector machines for noisy data with applications of financial time series...
  • R.F. Engle, Autoregressive conditional heteroskedasticity with estimates of the variance of U.K. inflation, Econometrica (1982)
  • R.F. Engle et al., What good is a volatility model?, Quantitative Finance (2001)
  • A. Fan, D. Hong, M. Palanaswami, C. Tan, A support vector machine approach to bankruptcy prediction: a case study, The...
  • A. Fisher, L. Calvet, B.B. Mandelbrot, Multifractality of the DM/US dollar exchange rate, Cowles Foundation Discussion...
  • U. Frisch, Turbulence (1995)
  • V.V. Gavrishchaka et al., Optimization of the neural-network geomagnetic model for forecasting large-amplitude substorm events, J. Geophys. Res. (2001)
  • V.V. Gavrishchaka et al., Support vector machine as an efficient tool for high-dimensional data processing: application to substorm forecasting, J. Geophys. Res. (2001)
  • S. Ghashghaie et al., Turbulent cascades in foreign exchange markets, Nature (1996)
    Dr. Valeriy V. Gavrishchaka received his MS and Ph.D. degrees in physics from Moscow Institute of Physics and Technology (Russia) and from West Virginia University (USA) in 1989 and 1996, respectively. From 1997 to 2001 he worked as a research scientist and since 2001 as a consultant at Science Applications International Corporation. His research interests include analysis and simulation of fundamental multiscale processes in space and laboratory plasmas as well as new multidisciplinary approaches for complex system modeling in finance, medicine, and other fields.

    Dr. Supriya B. Ganguli is the Chief Information Officer at the Command, Control, Communication and Information Technology Group at SAIC. She holds a Ph.D. in Physics. Her research interests include modeling and simulation of a wide range of complex systems with applications in space physics and space weather forecasting, engineering, and other fields.
