Effective Probability Forecasting for Time Series Data Using Standard Machine Learning Techniques

Lindsay, David; Cox, Siân

doi:10.1007/11551188_4

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 3686)

Abstract

This study investigates the effectiveness of probability forecasts output by standard machine learning techniques (Neural Network, C4.5, K-Nearest Neighbours, Naive Bayes, SVM and HMM) when tested on time series datasets from various problem domains. Raw data was converted into a pattern classification problem using a sliding window approach, and the respective target prediction was set as some discretised future value in the time series sequence. Experiments were conducted in the online learning setting to model the way in which time series data is presented. The performance of each learner’s probability forecasts was assessed using ROC curves, square loss, classification accuracy and Empirical Reliability Curves (ERC) [1]. Our results demonstrate that effective probability forecasts can be generated on time series data and we discuss the practical implications of this.
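The pipeline outlined in the abstract (sliding window, discretised future target, online evaluation, square loss) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names, the binary "up/down" discretisation, the toy sine series, the Laplace-smoothed nearest-neighbour forecaster, and the use of the one-sided squared error (label - p)^2 as the square loss.

```python
import math


def sliding_window_examples(series, window, horizon=1):
    """Convert a series into (features, label) classification examples.

    features: the `window` most recent values.
    label: 1 if the value `horizon` steps after the window is higher than
    the last value in the window, else 0 (an assumed discretisation).
    """
    examples = []
    for end in range(window, len(series) - horizon + 1):
        features = series[end - window:end]
        label = 1 if series[end + horizon - 1] > series[end - 1] else 0
        examples.append((features, label))
    return examples


def knn_forecast(history, features, k=3):
    """Probability of label 1: Laplace-smoothed share among the k nearest past windows."""
    if not history:
        return 0.5  # no evidence yet, so forecast an uninformative 0.5
    nearest = sorted(history, key=lambda ex: math.dist(ex[0], features))[:k]
    ones = sum(label for _, label in nearest)
    return (ones + 1) / (len(nearest) + 2)


def online_evaluate(examples, k=3):
    """Online protocol: forecast each example first, then reveal its label and learn it."""
    history, square_loss, correct = [], 0.0, 0
    for features, label in examples:
        p = knn_forecast(history, features, k)
        square_loss += (label - p) ** 2
        correct += int((p >= 0.5) == (label == 1))
        history.append((features, label))
    n = len(examples)
    return square_loss / n, correct / n


if __name__ == "__main__":
    # Toy series standing in for real data (hypothetical).
    series = [math.sin(0.3 * t) for t in range(200)]
    loss, accuracy = online_evaluate(sliding_window_examples(series, window=5))
    print(f"mean square loss: {loss:.3f}, accuracy: {accuracy:.3f}")
```

Because each forecast is made before its label is revealed, the reported loss and accuracy are computed only on examples the learner has not yet seen, which mirrors the order in which time series data arrives.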

References

1. Lindsay, D., Cox, S.: Improving the Reliability of Decision Tree and Naive Bayes Learners. In: Proc. of the 4th ICDM, pp. 459–462. IEEE, Los Alamitos (2004)
2. Zadrozny, B., Elkan, C.: Transforming Classifier Scores into Accurate Multiclass Probability Estimates. In: Proc. of the 8th ACM SIGKDD, pp. 694–699. ACM Press, New York (2002)
3. Dawid, A.P.: Calibration-based empirical probability (with discussion). Annals of Statistics 13, 1251–1285 (1985)
4. Murphy, A.H.: A New Vector Partition of the Probability Score. Journal of Applied Meteorology 12, 595–600 (1973)
5. Witten, I., Frank, E.: Data Mining - Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann, San Francisco (2000)
6. Provost, F., Fawcett, T.: Analysis and Visualisation of Classifier Performance: Comparison Under Imprecise Class and Cost Distributions. In: Proc. of the 3rd ICKDD, pp. 43–48. AAAI Press, Menlo Park (1997)
7. Fayyad, U., Irani, K.: The attribute selection problem in decision tree generation. In: Proc. of 10th Nat. Conf. on Artificial Intelligence, pp. 104–110. AAAI Press, Menlo Park (1992)
8. Atkeson, C.G., Moore, A.W., Schaal, S.: Locally Weighted Learning. Artificial Intelligence Review 11, 11–73 (1997)
9. Smola, A., Bartlett, P., Schölkopf, B., Schuurmans, C.: Advances in Large Margin Classifiers. MIT Press, Cambridge (1999)
10. Vovk, V., Gammerman, A., Shafer, G.: The Analysis of Time Series: An Introduction, 4th edn. Chapman and Hall, London (1989)
11. Keogh, E., Kasetty, S.: On the Need for Time Series Data Mining Benchmarks: A Survey and Empirical Demonstration. In: Proc. of the 8th ACM SIGKDD, pp. 102–111. ACM Press, New York (2002)
12. Hull, J.C.: Options, Futures, and other Derivatives, 5th edn. Prentice-Hall, Upper Saddle River (2002)
13. Vovk, V., Takemura, A., Shafer, G.: Defensive Forecasting. In: Proc. of 10th International Workshop on Artificial Intelligence and Statistics. Electronic publication, Cologne University (2005)
14. Langford, J., Zadrozny, B.: Estimating Class Membership Probabilities Using Classifier Learners. In: Proc. of 10th International Workshop on Artificial Intelligence and Statistics. Electronic publication, Cologne University (2005)
15. Vlachos, M., Hadjieleftheriou, M., Gunopulos, D., Keogh, E.: Indexing Multi-Dimensional Time-Series with Support for Multiple Distance Measures. In: Proc. of the 9th ACM SIGKDD, pp. 216–225. ACM Press, New York (2003)
16. Syed, N., Liu, H., Sung, K.: Incremental Learning with Support Vector Machines. In: Proc. of Workshop on Support Machines at IJCAI 1999. Electronic publication, Cologne University (1999)

Author information

Authors and Affiliations

Computer Learning Research Centre
David Lindsay
School of Biological Sciences, Royal Holloway University of London, Egham, Surrey, TW20 0EX, UK
Siân Cox

Editor information

Editors and Affiliations

Research School of Informatics, Loughborough, UK
Sameer Singh
ATR Lab, Research School of Informatics, University of Loughborough, Loughborough, UK
Maneesha Singh
IBM Corporation, 1133 Westchester Avenue, White Plains, 10604, New York, United States
Chid Apte
Institute of Computer Vision and Applied Computer Sciences, IBaI, Germany
Petra Perner

Cite this paper

Lindsay, D., Cox, S. (2005). Effective Probability Forecasting for Time Series Data Using Standard Machine Learning Techniques. In: Singh, S., Singh, M., Apte, C., Perner, P. (eds) Pattern Recognition and Data Mining. ICAPR 2005. Lecture Notes in Computer Science, vol 3686. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11551188_4

DOI: https://doi.org/10.1007/11551188_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28757-5
Online ISBN: 978-3-540-28758-2
eBook Packages: Computer Science, Computer Science (R0)