DOI: 10.1145/3629264.3629284
research-article

Novel Domain-Knowledge Based Feature Selection Framework for Price Prediction: Comprehensive Modelling in Sailboat Market

Published: 19 December 2023

ABSTRACT

High-dimensional data poses a major challenge in market price prediction, one that traditional data-driven feature selection approaches have not addressed effectively. This paper introduces a novel Domain-Knowledge based Feature Selection Framework (DKFS) designed specifically for product-oriented applications. The framework takes a two-step approach: a traditional statistical method (i.e., a filter method) is integrated with domain-specific knowledge, which adds a further layer of selection to ensure rigorous and efficient exclusion of irrelevant or redundant features. The framework was applied to a real-world sailboat price prediction task, in which three modelling techniques (Multiple Linear Regression, Random Forest, and Gradient Boosting) were evaluated on a comprehensive dataset of over 2,500 sailboat transactions. The adopted approach performed strongly, capturing 90.8% of the variability with a small set of 26 features, including economic indicators and geographical factors. The proposed framework is effective at dimensionality reduction and offers broad applicability across domains. It also points to a promising direction for further research into expert systems and adaptive feature selection design.
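The two-step idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which filter statistic DKFS uses or how domain knowledge is encoded, so the Pearson-correlation filter, the 0.3 threshold, the expert-supplied redundancy groups, and all feature names and values below are assumptions made purely for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    # A constant column has no defined correlation; treat it as 0.
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def filter_step(features, target, min_abs_corr=0.3):
    """Step 1 (assumed filter): keep features whose |correlation| with the
    target reaches the threshold. Order follows the input dict."""
    return [name for name, column in features.items()
            if abs(pearson(column, target)) >= min_abs_corr]


def domain_step(candidates, redundant_groups, irrelevant=()):
    """Step 2 (assumed encoding of domain knowledge): drop features an
    expert marks as irrelevant, and keep only the first surviving member
    of each expert-defined group of redundant features."""
    selected = [f for f in candidates if f not in irrelevant]
    for group in redundant_groups:
        present = [f for f in group if f in selected]
        for duplicate in present[1:]:
            selected.remove(duplicate)
    return selected


# Hypothetical toy data: five listings, prices in thousands.
features = {
    "length_ft": [30, 35, 40, 45, 50],
    "length_m": [9.1, 10.7, 12.2, 13.7, 15.2],  # same quantity, other unit
    "year": [2005, 2008, 2012, 2015, 2019],
    "listing_id": [4, 1, 3, 5, 2],              # arbitrary identifier
    "regional_gdp": [40, 42, 41, 45, 47],       # an economic indicator
}
price = [100, 130, 160, 200, 240]

after_filter = filter_step(features, price)
selected = domain_step(after_filter,
                       redundant_groups=[("length_ft", "length_m")])
print(after_filter)
print(selected)
```

In this toy run the statistical filter discards the uninformative `listing_id`, but it cannot tell that `length_ft` and `length_m` carry the same information, since both correlate strongly with price. The domain-knowledge step removes that redundancy, which is the kind of exclusion the abstract attributes to the second layer.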


Published in

ICCDA '23: Proceedings of the 2023 7th International Conference on Computing and Data Analysis
September 2023
137 pages
ISBN: 9798400700576
DOI: 10.1145/3629264

Copyright © 2023 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

• research-article
• Research
• Refereed limited