Conditional mutual information-based contrastive loss for financial time series forecasting

Authors:

Markus Flierl

ICAIF '20: Proceedings of the First ACM International Conference on AI in Finance

Article No.: 9, Pages 1 - 7

https://doi.org/10.1145/3383455.3422550

Published: 07 October 2021

Abstract

We present a representation learning framework for financial time series forecasting. One challenge of using deep learning models for financial forecasting is the shortage of available training data. Direct trend classification using deep neural networks trained on small datasets is susceptible to overfitting. In this paper, we propose to first learn compact representations from time series data and then use the learned representations to train a simpler model for predicting time series movements. We consider a class-conditioned latent variable model. We train an encoder network to maximize the mutual information between the latent variables and the trend information conditioned on the encoded observed variables. We show that conditional mutual information maximization can be approximated by a contrastive loss. The problem is thereby transformed into a classification task of determining whether two encoded representations are sampled from the same class. This is equivalent to performing pairwise comparisons of the training data points and thus improves the generalization ability of the encoder network. We use deep autoregressive models as our encoder to capture long-term dependencies in the sequence data. Empirical experiments indicate that our proposed method has the potential to advance state-of-the-art performance.
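
The pairwise classification view described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the dot-product critic, the `temperature` parameter, the binary cross-entropy objective, and the function name `pairwise_same_class_loss` are all assumptions chosen for illustration. Given encoded representations and their trend labels, the sketch scores every pair of representations and penalizes the critic when it fails to tell same-class pairs from different-class pairs.

```python
import numpy as np


def pairwise_same_class_loss(z, y, temperature=1.0):
    """Mean binary cross-entropy over all unordered pairs (i < j).

    z: (N, d) array of encoded representations (assumed encoder outputs).
    y: (N,) array of class labels, e.g. trend up/down.
    The target for a pair is 1 when both points share a label, else 0.
    """
    z = np.asarray(z, dtype=float)
    y = np.asarray(y)
    # Hypothetical critic: scaled dot product between representations.
    scores = (z @ z.T) / temperature
    # Indices of the strict upper triangle, so each pair is counted once.
    i, j = np.triu_indices(len(y), k=1)
    logits = scores[i, j]
    targets = (y[i] == y[j]).astype(float)
    # Numerically stable binary cross-entropy with logits:
    # max(x, 0) - x * t + log(1 + exp(-|x|))
    losses = (
        np.maximum(logits, 0.0)
        - logits * targets
        + np.log1p(np.exp(-np.abs(logits)))
    )
    return losses.mean()


# Representations that cluster by class give a lower loss than
# the same representations paired with mismatched labels.
z = np.array([[3.0, 0.0], [3.0, 0.0], [-3.0, 0.0], [-3.0, 0.0]])
aligned = pairwise_same_class_loss(z, [0, 0, 1, 1])
misaligned = pairwise_same_class_loss(z, [0, 1, 0, 1])
```

The sketch only evaluates the loss. Training an encoder against it would require an automatic differentiation framework, and the choice of critic is a design decision that the abstract does not specify.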

Published In

ICAIF '20: Proceedings of the First ACM International Conference on AI in Finance

October 2020

422 pages

ISBN:9781450375849

DOI:10.1145/3383455

Conference Chair:
Tucker Balch
J.P. Morgan AI Research

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

ACM: Association for Computing Machinery

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

ICAIF '20

Sponsor:

ACM

ICAIF '20: ACM International Conference on AI in Finance

October 15 - 16, 2020

New York, New York
