
Expert Systems with Applications

Volume 114, 30 December 2018, Pages 388-401

An investor sentiment reward-based trading system using Gaussian inverse reinforcement learning algorithm

https://doi.org/10.1016/j.eswa.2018.07.056

Highlights

  • Model investor sentiment and return interaction with inverse reinforcement learning.

  • Use a preference graph to fit market change in response to sentiment shocks.

  • Show superior performance over other news sentiment signal-based strategies.

  • Propose an adaptive sentiment reward trading system with SVM and retraining.

Abstract

Investor sentiment has been shown to be an important factor that influences market returns, and a number of profitable trading systems have been proposed that take advantage of investor sentiment signals. In this paper, we aim to design an investor sentiment reward-based trading system using a Gaussian inverse reinforcement learning method. Our hypothesis is that as markets interact with investor sentiment, there exists an intrinsic mapping between investor sentiment and market conditions that reveals future market directions. We propose an investor sentiment reward-based trading system aimed at extracting only the signals that generate either negative or positive market responses. This reward extraction mechanism is based not only on market returns but also on market volatility, representing a succinct and robust feature space. The back-test results show that the proposed sentiment reward-based trading system is superior to various benchmark strategies on the S&P 500 index and market-based ETFs, as well as to a few other existing news sentiment-based trading signals. Moreover, we find that the sentiment reward trading system is much more effective in a volatile market, but it is sensitive to transaction costs.

Introduction

Many studies have emerged that aim to explain the phenomenon that irrational decisions tend to be biased in the same direction rather than reflecting rational expectations (De Long, Shleifer, Summers, Waldmann, 1990, Kahneman, Tversky, 1979, Shiller, 2003, Shiller, Fischer, Friedman, 1984). Investor sentiment has become a focus of behavioral finance in recent years (Antoniou, Doukas, Subrahmanyam, 2013, Barber, Odean, 2012, Sun, Najand, Shen, 2016, Tetlock, 2007, Yang, Mo, Zhu, 2014). On the one hand, it directly challenges the assumption that participants in financial markets are rational; on the other hand, it has inspired researchers to design novel trading strategies that exploit premiums caused by investors' irrational behaviors driven by market sentiment.

The aim of this research is to reveal the intrinsic relationship between investor sentiment and market returns using the inverse reinforcement learning (IRL) approach and to design an effective trading system around it. In this study we answer three questions: (i) What is the interaction mechanism between investor sentiment and the financial market? (ii) Is there an inherent mapping between this interaction mechanism and future financial market directions? (iii) Can we take advantage of this interaction mechanism to consistently beat the market? Most of the effort in this field has been focused on building direct relationships between investor sentiment proxies and future financial market movements (Bollen, Mao, Zeng, 2011, Kurov, 2010, Yang, Song, Mo, Datta, Deane, 2015), where these relationship indicators are then used to design trading strategies (Yang, Mo, Liu, & Kirilenko, 2017). In this paper, however, we model the financial market dynamics as a Markov decision process and regard investor sentiment as a series of actions taken at different market states. Assuming there exists an intrinsic market reward function governing this process, we extract the reward function using a Gaussian process based inverse reinforcement learning algorithm (Qiao & Beling, 2011) and use the rewards to forecast the directions of future market returns.
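To make this framing concrete, the sketch below (in Python, not from the paper) shows one way the observations could be cast as a Markov decision process: discretized return and volatility conditions form the states, and sentiment shocks form the actions, yielding the (state, action) trajectories that an inverse reinforcement learning solver would consume. The column names, bin counts, and shock thresholds are illustrative assumptions.

```python
# Illustrative sketch (not the authors' code): frame market data as an MDP
# where discretized market conditions are states and sentiment shocks are
# actions. Column names, bin edges, and thresholds are assumptions.
import numpy as np
import pandas as pd

def build_trajectories(df: pd.DataFrame) -> list[tuple[int, int]]:
    """Return a list of (state, action) pairs for an IRL solver.

    df is expected to carry hourly 'return', 'volatility', and
    'sentiment_shock' columns (hypothetical names).
    """
    # Discretize returns and volatility into terciles to form a small state space.
    ret_bin = pd.qcut(df["return"], 3, labels=False)
    vol_bin = pd.qcut(df["volatility"], 3, labels=False)
    states = (ret_bin * 3 + vol_bin).astype(int)   # 9 joint states

    # Map sentiment shocks to three actions: bearish (0), neutral (1), bullish (2).
    actions = np.digitize(df["sentiment_shock"], bins=[-0.5, 0.5])

    return list(zip(states.to_numpy(), actions))
```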

The intuition behind this approach rests on the evidence that investor sentiment has a significant influence on financial market movements. When investors are optimistic about the market, they tend to take long positions in assets and contribute to market booms. When investors are pessimistic about the future of the market, they tend to take short positions and contribute to market downturns. Investor sentiment also adjusts itself to these market movements when asset values deviate too far from fundamental values. During this process, we posit that the reward function from the inverse reinforcement learning framework is “the most succinct, robust, and transferable definition of the task” (Abbeel & Ng, 2004), assuming the market as a whole follows a Markov process. In other words, the reward function is a feature that can be extracted from observations of past interactions between investor sentiment and financial market movements. This process naturally filters out the sentiment signals that do not generate market reactions, and hence the reward-based signals should improve forecast quality, since the reward function contains cleaner information about these interactions than the raw sentiment measures. Based on this intuition, we hypothesize that the market sentiment reward function should be a good feature space for predicting future financial market directions. We can then design a profitable strategy based on the sentiment rewards extracted from this inverse learning process.

In this paper, we use news sentiment from the Thomson Reuters News Analytics database as the proxy of investor sentiment toward the general U.S. market. We apply the Gaussian process based inverse reinforcement learning (GPIRL) method (Qiao & Beling, 2011) to past observations of market states and investor sentiment shocks. In this process, a preference graph is constructed to capture the distinctive state transition choices of the Markov decision process. One unique feature of the GPIRL method is that it requires fewer observations than other linear IRL methods; moreover, it is less susceptible to observation noise, which is prevalent in financial market data. We then compare the performance of the sentiment-reward signals with other sentiment signals such as the raw sentiment score (Feuerriegel, Prendinger, 2016, Sun, Najand, Shen, 2016, Yang, Mo, Zhu, 2014), sentiment shock (Song, Liu, Yang, 2017, Yang, Song, Mo, Datta, Deane, 2015), and sentiment trend (Song, Liu et al., 2017), using the support vector machine (SVM) method to generate trading signals. The out-of-sample tests show significant improvement over the existing sentiment signals. Moreover, we examine the sensitivity of the sentiment reward-based signal using different machine learning methods such as Random Forest, Boosting, k-NN (k-nearest neighbors), Decision Tree, and Bagging, along with the SVM method. We show that the SVM method combined with the sentiment reward signals outperforms all the other methods, suggesting an optimal sentiment reward-based trading system.
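As a rough illustration of this sensitivity check, the snippet below (Python with scikit-learn; not the authors' code) compares an SVM against the other classifiers named above on a placeholder matrix of reward features and next-period direction labels, using a time-series-aware cross-validation split. The feature matrix, labels, and hyperparameters are stand-ins, not the paper's data or settings.

```python
# Illustrative comparison (assumed setup, not the paper's exact pipeline):
# given reward features X and next-period market direction y, compare SVM
# against the other classifiers mentioned in the text.
import numpy as np
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier, BaggingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import TimeSeriesSplit, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))             # placeholder reward features
y = (rng.random(500) > 0.5).astype(int)   # placeholder up/down labels

models = {
    "SVM": SVC(kernel="rbf", C=1.0, gamma="scale"),
    "RandomForest": RandomForestClassifier(n_estimators=200, random_state=0),
    "Boosting": GradientBoostingClassifier(random_state=0),
    "kNN": KNeighborsClassifier(n_neighbors=5),
    "DecisionTree": DecisionTreeClassifier(random_state=0),
    "Bagging": BaggingClassifier(random_state=0),
}

cv = TimeSeriesSplit(n_splits=5)          # respect temporal ordering
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    print(f"{name:>12s}: {scores.mean():.3f} +/- {scores.std():.3f}")
```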

The major contribution of the paper is to propose a news sentiment based trading system in which the trading signals are generated based on the market's rewards to investor sentiment. We argue that investor sentiment signals are mostly very noisy, and markets often do not react to these signals. There are two sources of noise that prevent better model forecasting: (a) news sentiment is only a proxy for the “true” investor sentiment, since investor sentiment is unobservable and proxies are mostly deployed to measure it indirectly (Antweiler, Frank, 2004, Barber, Odean, 2012); (b) the news sentiment measure itself is noisy, since many expressions may be used in news articles but not all accurately describe market conditions (Feuerriegel, Heitzmann, Neumann, 2015, Feuerriegel & Neumann). Although efforts to filter the noise in sentiment measures show improvement (Song, Almahdi, & Yang, 2017), the effect of applying direct filtering is rather limited (Song, Liu et al., 2017). In this study we propose an investor sentiment reward-based trading system aimed at extracting only the signals that generate either negative or positive market responses. This filtering mechanism is realized by training a model on how the market rewards sentiment signals, using a supervised learning method. If the market does not reward certain news sentiment signals, the learned model will automatically filter out such sentiment signals. As a result, the model will only capture the effect of “true” investor sentiment on the market. Moreover, this reward extraction mechanism is based not only on market returns but also on market volatility, representing a succinct and robust feature space. To the best of our knowledge, this approach is the first to represent the influence of investor sentiment on market returns in the reward space. It can be broadly applied to other market sentiment proxy based trading systems.

The rest of the paper is organized as follows. In Section 2, we review the literature on investor sentiment and news sentiment, as well as machine learning and reinforcement learning based trading strategies. In Section 3, we introduce the Gaussian process based inverse reinforcement learning technique, construct the state and action spaces, and present our proposed trading strategy. In Section 4, we assess the performance of the sentiment reward-based trading system against other existing sentiment filtering methods, along with some popular passive trading strategies. In the last section, we conclude our research and highlight the contributions of the paper.

Section snippets

Literature review

In this section we review two strands of related literature to set the background of our work. First, we review reinforcement learning and inverse reinforcement learning approaches applied to financial market forecasting and trading. In the second part, we examine recent work on combining investor sentiment and machine learning methods to build trading systems.

Methodology and data

In this section, we propose a sentiment reward-based inverse reinforcement learning approach to estimate the reward function from past observations. We use a support vector machine (SVM) to classify these rewards into up-trend or down-trend indicators. We also discuss how to estimate the reward function using the Gaussian process based inverse reinforcement learning technique, followed by how to construct the state and action spaces in our case, as well as the design of a trading strategy. At…
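As a minimal sketch of the last step, assuming the rewards have already been learned, the following Python fragment shows how an SVM's up-trend/down-trend classification of the latest reward features could be mapped to a long or short position for the next period. The class and method names are hypothetical and not from the paper.

```python
# Minimal sketch (assumed logic): convert an SVM up/down-trend prediction on
# the latest learned reward features into a trading position.
from dataclasses import dataclass
import numpy as np
from sklearn.svm import SVC

@dataclass
class SentimentRewardTrader:
    model: SVC

    def fit(self, reward_features: np.ndarray, directions: np.ndarray) -> None:
        """directions: +1 for next-period up-trend, -1 for down-trend."""
        self.model.fit(reward_features, directions)

    def next_position(self, latest_rewards: np.ndarray) -> int:
        """Return +1 (go long) or -1 (go short) for the next period."""
        return int(self.model.predict(latest_rewards.reshape(1, -1))[0])

# Example usage with placeholder data:
# trader = SentimentRewardTrader(SVC(kernel="rbf", gamma="scale"))
# trader.fit(train_rewards, train_directions)
# position = trader.next_position(latest_rewards)
```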

Experiments and discussions

In this section, we perform a number of experiments to evaluate the system parameter choices and assess the subsequent performance of the proposed sentiment reward-based trading system.

We first discuss the training data used in the experiments. As discussed in Section 3.3, we use a moving window of a certain number of hours to learn the rewards, and then we use a number of learned rewards to predict the market direction of the next period under a stationarity assumption. This moving window sample…
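A hedged sketch of such a moving-window procedure is given below: at each step the SVM is retrained on the most recent window of learned reward features and asked to predict the following period's direction. The window length and feature shapes are assumptions for illustration only, not the paper's settings.

```python
# Illustrative walk-forward loop (window length is an assumption): retrain on a
# moving window of learned rewards and predict the next period's direction.
import numpy as np
from sklearn.svm import SVC

def walk_forward(rewards: np.ndarray, directions: np.ndarray,
                 train_window: int = 240) -> np.ndarray:
    """rewards: (T, d) learned reward features; directions: (T,) +1/-1 labels."""
    preds = np.full(len(directions), np.nan)
    for t in range(train_window, len(directions)):
        model = SVC(kernel="rbf", gamma="scale")
        model.fit(rewards[t - train_window:t], directions[t - train_window:t])
        preds[t] = model.predict(rewards[t:t + 1])[0]
    return preds
```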

Conclusion

The main contributions of this study can be summarized as follows: (i) We model the interaction mechanism between investor sentiment and market returns using the Gaussian process inverse reinforcement learning method, with a preference graph to fit the situation in which market states respond differently to investor sentiment shocks. (ii) We propose an inverse reinforcement learning method to extract market rewards toward investor sentiment and show that the sentiment rewards provide high quality…

References (49)

  • L.A. Smales

    News sentiment in the gold futures market

    Journal of Banking and Finance

    (2014)
  • L.A. Smales

    Asymmetric volatility response to news sentiment in gold futures

    Journal of International Financial Markets, Institutions and Money

    (2015)
  • Q. Song et al.

    Stock portfolio selection using learning-to-rank algorithms with news sentiment

    Neurocomputing

    (2017)
  • P.C. Tetlock

    Giving content to investor sentiment: The role of media in the stock market

The Journal of Finance

    (2007)
  • S.Y. Yang et al.

An empirical study of the financial community network on Twitter

2014 IEEE Conference on Computational Intelligence for Financial Engineering & Economics (CIFEr)

    (2014)
  • S.Y. Yang et al.

    The impact of abnormal news sentiment on financial markets

    Journal of Business and Economics

    (2015)
  • B.D. Ziebart et al.

    Maximum entropy inverse reinforcement learning

AAAI

    (2008)
  • P. Abbeel et al.

    Apprenticeship learning via inverse reinforcement learning

Proceedings of the Twenty-First International Conference on Machine Learning

    (2004)
  • S. Almahdi et al.

    An adaptive portfolio trading system: A risk-return portfolio optimization using recurrent reinforcement learning with expected maximum drawdown

    Expert Systems with Applications

    (2017)
  • C. Antoniou et al.

    Cognitive dissonance, sentiment, and momentum

    Journal of Financial and Quantitative Analysis

    (2013)
  • W. Antweiler et al.

    Is all that talk just noise? The information content of Internet stock message boards

The Journal of Finance

    (2004)
  • M. Baker et al.

    Investor sentiment and the cross-section of stock returns

    The Journal of Finance

    (2006)
  • B.M. Barber et al.

    All that glitters: The effect of attention and news on the buying behavior of individual and institutional investors

    The Review of Financial Studies

    (2012)
  • L.K.C. Chan et al.

Institutional equity trading costs: NYSE versus Nasdaq

    The Journal of Finance

    (1997)