Abstract
The explosive price volatility from the end of 2017 to January 2018 shows that bitcoin is a high risk asset. The deep reinforcement algorithm is straightforward idea for directly outputs the market management actions to achieve higher profit instead of higher price-prediction accuracy. However, existing deep reinforcement learning algorithms including Q-learning are also limited to problems caused by enormous searching space. We propose a combination of double Q-network and unsupervised pre-training using Deep Boltzmann Machine (DBM) to generate and enhance the optimal Q-function in cryptocurrency trading. We obtained the profit of 2,686% in simulation, whereas the best conventional model had that of 2,087% for the same period of test. In addition, our model records 24% of profit while market price significantly drops by −64%.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Nakamoto, S.: Bitcoin: a peer-to-peer electronic cash system (2008)
Agarwal, A., Hazan, E., Kale, S., Schapire, R.E.: Algorithms for portfolio management based on the Newton method. In: Proceedings of the 23rd International Conference on Machine Learning, pp. 9–16. ACM (2006)
Mnih, V., et al.: Human-level control through deep reinforcement learning. Nature 518, 529 (2015)
Van Hasselt, H., Guez, A., Silver, D.: Deep reinforcement learning with double q-learning. In: AAAI, vol. 16, pp. 2094–2100 (2016)
Lee, H., Grosse, R., Ranganath, R., Ng, A.Y.: Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. In: Proceedings of the 26th Annual International Conference on Machine Learning, pp. 609–616 (2009)
Hinton, G.E., Osindero, S., Teh, Y.W.: A fast learning algorithm for deep belief nets. Neural Comput. 18, 1527–1554 (2006)
Huang, W., Nakamori, Y., Wang, S.Y.: Forecasting stock market movement direction with support vector machine. Comput. Oper. Res. 32, 2513–2522 (2005)
Schumaker, R.P., Chen, H.: Textual analysis of stock market prediction using breaking financial news: the AZFin text system. ACM Trans. Inf. Syst. 27, 12 (2009)
Patel, J., Shah, S., Thakkar, P., Kotecha, K.: Predicting stock and stock price index movement using trend deterministic data preparation and machine learning techniques. Expert Syst. Appl. 42, 259–268 (2015)
McNally, S.: Predicting the Price of Bitcoin using Machine Learning. National College of Ireland (2016)
Amjad, M., Shah, D.: Trading bitcoin and online time series prediction. In: NIPS 2016 Time Series Workshop, pp. 1–15 (2017)
Jiang, Z., Liang, J.: Cryptocurrency portfolio management with deep reinforcement learning. In: Intelligent Systems Conference, pp. 905–913 (2017)
Bell, T.: Bitcoin Trading Agents. University of Southampton (2016)
Żbikowski, K.: Application of machine learning algorithms for bitcoin automated trading. In: Ryżko, D., Gawrysiak, P., Kryszkiewicz, M., Rybiński, H. (eds.) Machine Intelligence and Big Data in Industry. SBD, vol. 19, pp. 161–168. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-30315-4_14
Tesauro, G.: Extending Q-learning to general adaptive multi-agent systems. In: Advances in Neural Information Processing Systems, pp. 871–878 (2004)
Pinheiro, P.H., Collobert, R.: Recurrent convolutional neural networks for scene labeling. In: International Conference on Machine Learning, pp. 82–90 (2014)
Sainath, T.N., Vinyals, O., Senior, A., Sak, H.: Convolutional, long short-term memory, fully connected deep neural networks. In: Acoustics, Speech and Signal Processing, pp. 4580–4584 (2015)
Ren, Y., Wu, Y.: Convolutional deep belief networks for feature extraction of EEG signal. In: International Joint Conference on Neural Networks, pp. 2850–2853 (2014)
Lample, G., Chaplot, D.S.: Playing FPS games with deep reinforcement learning. In: AAAI, pp. 2140–2146 (2017)
Donahue, J., et al.: Long-term recurrent convolutional networks for visual recognition and description. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2625–2634 (2015)
Bu, S.-J., Cho, S.-B.: A hybrid system of deep learning and learning classifier system for database intrusion detection. In: Martínez de Pisón, F.J., Urraca, R., Quintián, H., Corchado, E. (eds.) HAIS 2017. LNCS (LNAI), vol. 10334, pp. 615–625. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-59650-1_52
Cover, T.M.: Universal portfolios. In: The Kelly Capital Growth Investment Criterion: Theory and Practice, pp. 181–209 (2011)
Das, P., Banerjee, A.: Meta optimization and its applications to portfolio selection. In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1163–1171 (2011)
Li, B., Zhao, P., Hoi, S.C., Gopalkrishnan, V.: PAMR: passive aggressive mean reversion strategy for portfolio selection. Mach. Learn. 87, 221–258 (2012)
Maaten, L.V.D., Hinton, G.: Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579–2606 (2008)
Acknowledgements
This research was supported by Korea Electric Power Corporation. (Grant number:R18XA05).
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Bu, SJ., Cho, SB. (2018). Learning Optimal Q-Function Using Deep Boltzmann Machine for Reliable Trading of Cryptocurrency. In: Yin, H., Camacho, D., Novais, P., Tallón-Ballesteros, A. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2018. IDEAL 2018. Lecture Notes in Computer Science(), vol 11314. Springer, Cham. https://doi.org/10.1007/978-3-030-03493-1_49
Download citation
DOI: https://doi.org/10.1007/978-3-030-03493-1_49
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03492-4
Online ISBN: 978-3-030-03493-1
eBook Packages: Computer ScienceComputer Science (R0)