Abstract
As more netizens participate in financial markets, online discussion of asset price movements is becoming more comprehensive and timely. Online text, especially from social media, is therefore a potentially important data source for financial opinion mining. Market sentiment analysis mainly comprises direct methods, which take the form of text-based surveys, and indirect inference methods based on structured data such as price, trading volume, and volatility. In theory, direct methods reveal investor sentiment earlier, but because a sufficient number of objective survey samples is hard to obtain, they have received far less research attention than indirect methods. To combine the strengths and offset the weaknesses of the two approaches, this paper uses the Valence Aware Dictionary and sEntiment Reasoner (VADER) and the Fast Fourier Transform (FFT) to construct social media sentiment indexes from a large volume of daily Reddit discussions about Bitcoin (BTC) and the S&P 500 (SPX), and analyzes how these indexes interact with prices. We also propose a new method for verifying time series synchronization, the Rolling Time-lagged Cross-correlation (RTLCC) surface, together with corresponding feature construction methods. RTLCC lets us view time-lagged cross-correlation from the perspective of rolling correlation while determining the hyperparameters (window size and time offset) for feature construction. Finally, we train four machine learning classifiers on these features to verify the effectiveness of the proposed market sentiment analysis pipeline. For predicting 10-day price movements, the best model achieves 89.9% accuracy (ACC) and 92.5% AUC.
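The abstract does not define the RTLCC surface precisely, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the surface is indexed by window size and time offset, and that each cell holds the mean rolling Pearson correlation between a lagged sentiment index and a price series. The function name `rtlcc_surface`, its parameters, and the synthetic data are all hypothetical.

```python
import numpy as np
import pandas as pd


def rtlcc_surface(sentiment, price, windows, offsets):
    """Mean rolling correlation for every (window size, time offset) pair.

    Assumption: this averaging is one plausible reading of an RTLCC
    surface; the paper may define the surface differently.
    """
    surface = pd.DataFrame(index=list(windows), columns=list(offsets), dtype=float)
    for lag in offsets:
        # A positive lag pairs sentiment at day t - lag with price at day t,
        # i.e. it tests whether sentiment leads price by `lag` days.
        shifted = sentiment.shift(lag)
        for window in windows:
            rolling_corr = shifted.rolling(window).corr(price)
            surface.loc[window, lag] = rolling_corr.mean()  # NaN windows are skipped
    return surface


# Synthetic data (hypothetical): price follows sentiment with a 3-day delay.
rng = np.random.default_rng(0)
raw = rng.normal(size=300)
sentiment = pd.Series(raw)
price = pd.Series(np.roll(raw, 3) + 0.1 * rng.normal(size=300))

surface = rtlcc_surface(sentiment, price, windows=[20, 40, 60], offsets=range(-5, 6))

# Locate the cell with the strongest average correlation.
row, col = np.unravel_index(np.nanargmax(surface.to_numpy()), surface.shape)
best_window, best_offset = surface.index[row], surface.columns[col]
print(best_window, best_offset)
```

On data like this, the strongest cell should sit at the offset matching the built-in delay, which is how such a surface could guide the choice of window size and time offset for feature construction.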
Acknowledgment
This work is supported by the Macau SAR Science and Technology Development Fund (No. 0032/2022/A, No. 0091/2020/A2), Guangzhou Development Zone Science and Technology Grant (No. 2021GH10, No. 2020GH10, and No. EF003/FST-FSJ/2019/GSTIC), and Collaborative Research Grant of University of Macau (No. MYRG-GRG2022, No. MYRG-CRG2021-00002-ICI).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Li, J., Gong, Y., Zhao, Q., Xie, Y., Fong, S., Yen, J. (2023). Market Sentiment Analysis Based on Social Media and Trading Volume for Asset Price Movement Prediction. In: Yang, X., et al. Advanced Data Mining and Applications. ADMA 2023. Lecture Notes in Computer Science, vol 14176. Springer, Cham. https://doi.org/10.1007/978-3-031-46661-8_26
DOI: https://doi.org/10.1007/978-3-031-46661-8_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46660-1
Online ISBN: 978-3-031-46661-8
eBook Packages: Computer Science, Computer Science (R0)