Abstract
This study presents a novel feature selection approach for classifying positive and negative risk predictions in the highly volatile cryptocurrency market. The approach maximizes information gain while simultaneously minimizing the similarity among the selected features, yielding a feature set that improves classification accuracy. The proposed method was compared with other feature selection techniques: sequential and bidirectional feature selection, univariate feature selection, and the least absolute shrinkage and selection operator (LASSO). To evaluate these techniques, several classifiers were employed: XGBoost, k-nearest neighbors, support vector machine, random forest, logistic regression, long short-term memory, and deep neural networks. The features were extracted from the time series of the Bitcoin, Binance, and Ethereum cryptocurrencies. Applying the selected features to the different classifiers showed that XGBoost and random forest performed best on the time series datasets. Furthermore, the proposed feature selection method achieved the best results on two of the three cryptocurrencies, with accuracy in the best case ranging from 55% to 68% across the different time series. Notably, preprocessed features were used in this research: the raw (candle) data were transformed into informative features that characterize the problem and help the classifiers predict the labels.
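The relevance-versus-redundancy trade-off described above can be sketched as a greedy procedure: score each candidate feature by its information gain with respect to the labels, penalized by its average correlation with the features already chosen. This is a minimal illustrative sketch of that general idea (close in spirit to the mRMR family), not the paper's exact algorithm; the function names, the histogram-based information-gain estimate, and the `alpha` trade-off weight are all assumptions introduced here for illustration.

```python
import numpy as np

def entropy(labels):
    # Shannon entropy (bits) of a discrete label array
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(feature, labels, bins=10):
    # IG = H(y) - H(y | feature), with the continuous feature discretized
    # into equal-width bins (a simple estimator, assumed for this sketch)
    edges = np.histogram_bin_edges(feature, bins=bins)
    binned = np.digitize(feature, edges[1:-1])
    h_cond = 0.0
    for b in np.unique(binned):
        mask = binned == b
        h_cond += mask.mean() * entropy(labels[mask])
    return entropy(labels) - h_cond

def select_features(X, y, k, alpha=0.5):
    """Greedy mRMR-style selection: at each step pick the feature that
    maximizes IG(feature; y) - alpha * mean |corr| with selected features."""
    gains = np.array([information_gain(X[:, j], y) for j in range(X.shape[1])])
    selected = [int(np.argmax(gains))]  # start from the most informative feature
    while len(selected) < k:
        best, best_score = None, -np.inf
        for j in range(X.shape[1]):
            if j in selected:
                continue
            redundancy = np.mean([abs(np.corrcoef(X[:, j], X[:, s])[0, 1])
                                  for s in selected])
            score = gains[j] - alpha * redundancy
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected
```

On synthetic data where one feature is informative, a second is a near-copy of it, and a third is pure noise, a sufficiently large `alpha` makes the procedure skip the redundant near-copy in favor of the uncorrelated feature, which is the behavior the similarity penalty is meant to produce.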
Index Terms
- A Novel Feature Selection Method for Risk Management in High-Dimensional Time Series of Cryptocurrency Market