Elsevier

Decision Support Systems

Volume 37, Issue 4, September 2004, Pages 501-513
Distribution forecasting of high frequency time series

https://doi.org/10.1016/S0167-9236(03)00083-6

Abstract

The availability of high frequency data sets in finance has made very data-intensive forecasting techniques practical. An algorithm requiring fast k-NN type search has been implemented using AURA, a binary neural network based upon Correlation Matrix Memories. This work has also constructed probability distribution forecasts, the volume of data allowing this to be done in a nonparametric manner. In addition to standard statistical error measures, the implementation of simulations has allowed actual measures of profit to be calculated.

Introduction

Many techniques for forecasting nonlinear time series exist. Traditionally in finance, forecasters have looked at daily or monthly prices. Recently the availability of large high frequency data sets has encouraged much more research into intra-day and even tick price forecasting. It has also made possible the use of very data-intensive forecasting techniques and the adjustment of traditional techniques. Ref. [5] gives an introduction to this new field.

The Farmer–Sidorowich [9] forecasting algorithm is well understood and many forecasting methods implement a variant of it. Delay coordinates are used to construct representations of the current and all previous states. The state at time $t$ of a variable $x$, denoted $S(x_t)$, is constructed from the observed values: $x_t, x_{t-1}, x_{t-2}, \ldots, x_{t-d+1}$

The parameter $d$ is considered the window size or embedding dimension, the number of most recent historical values that will be used to construct a state. It is assumed that a functional relationship between the current state and the future state exists: $x_{t+1} = f(S(x_t))$. For this function to exist the attractor must be static or evolving only slowly. At this stage, it is assumed that the time series shows this property (evidence of this relationship has been documented [19]) but it is tested for as part of this work.
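The delay-coordinate construction can be illustrated with a minimal sketch. It is not the authors' implementation; the function names `delay_states` and `states_with_evolutions` and the sample values are hypothetical, chosen only to show how each state of window size d pairs with the value that follows it.

```python
def delay_states(series, d):
    """Return the delay-coordinate states S(x_t) for t = d-1 .. len(series)-1.

    Each state is the tuple (x_t, x_{t-1}, ..., x_{t-d+1}), most recent first.
    """
    if d < 1:
        raise ValueError("window size d must be at least 1")
    return [tuple(series[t - j] for j in range(d))
            for t in range(d - 1, len(series))]


def states_with_evolutions(series, d):
    """Pair each state S(x_t) with its evolution x_{t+1}, where one is observed."""
    return [(state, series[t + 1])
            for t, state in enumerate(delay_states(series, d), start=d - 1)
            if t + 1 < len(series)]  # the final state has no observed successor


# Hypothetical series of returns, used only for illustration.
returns = [0.1, -0.2, 0.05, 0.3, -0.1]
pairs = states_with_evolutions(returns, d=3)
```

With d = 3 the five sample values yield three states, of which the first two have an observed evolution and can serve as historical examples for approximating f.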

The aim is to construct a predictor approximating the function f. A forecast of the next price is constructed from one or more ‘local’ states considered similar by some distance metric to the current state. For each local state, the next value is its evolution. The evolutions of the local states are used to forecast the evolution for the current state, often by fitting a linear polynomial. Delay coordinates require the window size to be determined but there are no clear rules on how this should be done. Several researchers have produced guidelines in line with the study of nonlinear dynamics [9], [14], [17], but embedding is still considered an art as much as a science. In the field of neural network design for forecasting, Genetic Algorithms have often been considered [16], [18], [24]. Other issues the technique introduces are which distance metric to use, how to implement it, and how many neighbour states should be used to construct a forecast.
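A local forecast of this kind might be sketched as follows. This is an illustration under stated assumptions, not the method of the paper: it uses a brute-force Euclidean search rather than AURA, and it averages the neighbours' evolutions instead of fitting a linear polynomial. The names `nearest_evolutions` and `point_forecast` and the sample history are hypothetical.

```python
import math


def nearest_evolutions(history, current_state, k):
    """Return the evolutions of the k historical states closest to current_state.

    history is a list of (state, evolution) pairs; distance is Euclidean,
    which is an assumption made for this sketch.
    """
    ranked = sorted(history, key=lambda pair: math.dist(pair[0], current_state))
    return [evolution for _, evolution in ranked[:k]]


def point_forecast(history, current_state, k):
    """Forecast the next value as the mean evolution of the k nearest states."""
    evolutions = nearest_evolutions(history, current_state, k)
    if not evolutions:
        raise ValueError("no historical states available")
    return sum(evolutions) / len(evolutions)


# Hypothetical two-dimensional states with their observed evolutions.
history = [((0.0, 0.0), 1.0), ((1.0, 1.0), 3.0), ((5.0, 5.0), -4.0)]
forecast = point_forecast(history, (0.4, 0.4), k=2)
```

The sort makes each query cost grow with the size of the history, which is the cost that motivates a faster search structure on large high frequency data sets.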

Traditional forecasting methods produce point forecasts, in which each prediction is a single value. Alternatively, probability and interval forecasting have been reported in the literature, for example [1], [4]. Full probability distribution forecasts contain extra information, which can be utilised by modern risk management systems or allow improved trading models. This work produces discrete probability distributions as forecasts and is part of a new but expanding area of research.
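One simple way to picture a discrete, nonparametric distribution forecast is to bin the evolutions of the neighbouring states and report the relative frequency of each bin. The sketch below assumes fixed-width bins; the function name `discrete_distribution`, the bin width, and the sample values are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def discrete_distribution(evolutions, bin_width):
    """Return {bin index: probability} for the given neighbour evolutions.

    Bin i covers the interval [i * bin_width, (i + 1) * bin_width).
    """
    if not evolutions:
        raise ValueError("at least one evolution is required")
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    counts = Counter(math.floor(value / bin_width) for value in evolutions)
    total = len(evolutions)
    return {index: count / total for index, count in sorted(counts.items())}


# Hypothetical evolutions (percentage returns) of five neighbouring states.
neighbour_evolutions = [-0.6, -0.1, 0.2, 0.3, 0.7]
distribution = discrete_distribution(neighbour_evolutions, bin_width=0.5)
```

No distributional shape is imposed here: the probabilities come directly from the observed evolutions, so the quality of the forecast depends on having enough neighbours to populate the bins.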

Section snippets

AURA for Farmer–Sidorowich forecasting

The Farmer–Sidorowich algorithm relies on the implementation of a k-NN (or similar) search to decide on the local states. Also, with large data sets of high frequency data it is important that this search can be done quickly, producing forecasts in time for them to actually be used. This work uses the Advanced Uncertainty Reasoning Architecture (AURA) developed at the University of York to implement this search.

AURA is an implementation of associative memories using binary Correlation Matrix

Distribution forecasting

Producing distribution forecasts or measuring risk (especially the likelihood of extreme values) is simplified by making the assumption that financial returns follow a normal distribution. Under such an assumption, many simple measures of risk such as the Sharpe Ratio [3], [20] have been developed. However, it is now fairly well accepted that financial returns do not follow a normal distribution, evidence for this being reported by [8], [15], [19], [21]. This has encouraged the measurement of

Forecasting architecture

A novel forecasting architecture, AURA-FS, has been constructed by bringing together the Farmer–Sidorowich method, AURA networks (including a new encoding scheme for financial data) and distribution forecasting. The form of distribution forecasting used is the variant of historical simulation described in the previous section. Data is pre-processed before presentation to AURA-FS in line with common practice. The series of prices is converted to the series of returns, expressed as the percentage

Simulations

AURA-FS was tested on a data set of exchange rates between Japanese Yen and US Dollars supplied by Olsen and Associates. From this data, sets of size 100,000 and 200,000 were used. They are high frequency sets, with the set of 100,000 prices covering only from 1 October 1992 to 8 December 1992. The first 75% of each data set was taken for training and the rest held back as an out of sample test set.
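The chronological 75%/25% split described above can be sketched as follows. The function name `chronological_split` and the placeholder data are hypothetical; the point is only that the split preserves time order, so every test observation follows every training observation.

```python
def chronological_split(series, train_fraction=0.75):
    """Split a time-ordered series into training and out-of-sample test parts."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


# Placeholder data standing in for a set of 100,000 prices.
prices = list(range(100_000))
train, test = chronological_split(prices)
```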

For finance, error measures such as RMSE and Mean Absolute Percentage Error (MAPE) are considered

Extending the forecasts

The simulation results highlighted both the promise of the forecasts and the problems of working with high frequency data. Forecasting only the next value provides a trading intensive algorithm, with a large number of transactions being created. This becomes an issue due to the margin between the bid and ask prices. For currency markets this margin is typically only a few basis points (the smallest amount a price can move) and is usually considered negligible for long term trading. For high

Evaluating the distribution forecast

The results and evaluation reported previously are for point forecasts. These forecasts have been constructed from the probability distribution forecasts produced by AURA-FS. The aim of this work was to produce accurate distributions which could be used by a risk management system or trading model. Evaluation of the full distributions is required.

The only technique known to the author for evaluating distribution forecasts using the observed values is to use the cumulative distribution function (cdf)

Conclusion

Financial markets are considered very efficient and therefore difficult to forecast. Our work provides further evidence that this is so. The best Theil's U statistic of 0.892 suggests only about a 10% reduction in error relative to a naive forecast. There are however reasons to be optimistic about the obtained AURA-FS results. The best ADA of 56.9% and the Theil's error show improvement upon the values reported for financial data previously in Ref. [12]. Unfortunately, there is little work in the literature

Acknowledgements

The research reported here has been supported by an EPSRC studentship.

Andy Pasley is a PhD student at the University of York, Computer Science Department. His research interests are binary neural networks and time series forecasting, particularly in regard to financial and electricity demand data. Current work focuses on producing full probability distribution forecasts and evaluating their use in risk management.

References (26)

  • E.F. Fama, The behaviour of stock market prices, Journal of Business (1965)
  • J.D. Farmer et al., Predicting chaotic time series, Physical Review Letters (1987)
  • J. Jiminez et al., Detecting chaos with local associative memories, Physics Letters A (1992)
Jim Austin is the Professor of Neural Computation at the University of York, Computer Science Department, where he directs the Advanced Computer Architectures Group. He is best known for his work in binary neural networks through the development of the AURA high performance pattern recognition system. He has over 150 publications in neural networks, computer architectures and computer vision. He is the founder and Director of Cybula, a company set up to exploit the AURA technology.
