End-to-End Implementation of Automated Price Forecasting Applications

Stühler, Horst; Klau, Dennis; Zöller, Marc-André; Beiderwellen-Bedrikow, Alexandre; Tutschku, Christian

doi:10.1007/s42979-024-02735-2

Original Research
Published: 06 April 2024

SN Computer Science, Volume 5, article number 402 (2024)

Abstract

Forecasting prices of used construction equipment is challenging due to spatial and temporal price fluctuations. Automating this forecasting process using current market data is, therefore, highly desirable. A promising and common strategy is the application of machine learning (ML) techniques. However, small and medium-sized enterprises often struggle with the implementation of ML approaches due to a lack of ML expertise. In response, we demonstrate the potential of substituting manually created ML pipelines with automated machine learning (AutoML) solutions, which autonomously create the underlying pipelines. To this end, we follow the CRISP-DM process to identify tasks requiring ML expertise. First, we dissect the ML pipeline into a machine learning and a non-machine learning part and use AutoML to automate the former. Subsequently, we also automate the data preprocessing step, which is part of the non-machine learning tasks, to further reduce the dependency on data processing expertise. Additionally, we implement a data-centric result evaluation that rates the reliability of the trained ML models. This approach supports the domain-driven creation of ML pipelines, democratizing the use of ML. To address all complex industrial requirements and showcase the practicality of our approach, we developed an innovative metric called the method evaluation score. This metric encompasses key technical and non-technical parameters essential for domain experts to assess the quality and usability of the generated models. Based on this metric, we demonstrate in our case study that combining domain knowledge with AutoML and automatic preprocessing can reduce the reliance on ML experts for innovative small and medium-sized enterprises keen on adopting such technologies.

Data availability

The data is available in the GitHub repository at https://github.com/AutoQML/End-to-End-Automated-Price-Forecasting.

Notes

The market portals are https://www.mascus.de, https://catused.cat.com, https://www.mobile.de, https://machineryline.de, https://trademachines.de, https://www.truck1.eu, and https://www.truckscout24.de.
See https://github.com/AutoQML/End-to-End-Automated-Price-Forecasting.
The list of hyperparameters is available in Appendix 7.

Acknowledgements

This work was partly funded by the German Federal Ministry of Economic Affairs and Climate Action in the research project AutoQML (Grant no. 01MQ22002).

Author information

Authors and Affiliations

Zeppelin GmbH, Graf-Zeppelin-Platz 1, 85748, Garching, Germany
Horst Stühler & Alexandre Beiderwellen-Bedrikow
Fraunhofer IAO, Nobelstraße 12, 70569, Stuttgart, Germany
Dennis Klau & Christian Tutschku
USU GmbH, Rüppurrer Str. 1, 76137, Karlsruhe, Germany
Marc-André Zöller

Corresponding author

Correspondence to Horst Stühler.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “Recent Trends on Data Science, Technology and Applications” guest edited by Slimane Hammoudi, Alfredo Cuzzocrea and Oleg Gusikhin.

Appendices

Appendix A Example Usage

The manual implementation of the ML methods (Polynomial Regression, Decision Tree, Random Forest, Support Vector Regressor, K-Nearest Neighbor, AdaBoost Regressor, and Multi-Layer Perceptron) requires approximately 50 lines of code (LOC) on average and approximately 13 different libraries:
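The original listing is not reproduced on this page. As a rough illustration of why a manual implementation grows quickly, the following sketch hand-assembles a single method (Random Forest) with scikit-learn. The function name, the column-list arguments, and the small `PARAM_GRID` are assumptions made for this example and are not taken from the article; the authors' actual code lives in their repository.

```python
# Hypothetical search grid for illustration only; the article's real
# hyperparameter ranges are listed in its appendix tables.
PARAM_GRID = {
    "model__n_estimators": [100, 300],
    "model__max_depth": [None, 10, 20],
}


def train_random_forest(X_train, y_train, categorical, numerical):
    """Manually assemble and tune one regression pipeline (Random Forest).

    `categorical` and `numerical` are lists of column names in `X_train`
    (a pandas DataFrame); both are placeholders for this sketch.
    """
    # Imported lazily so the sketch loads even without scikit-learn installed.
    from sklearn.compose import ColumnTransformer
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.impute import SimpleImputer
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import OneHotEncoder, StandardScaler

    # Preprocessing has to be specified by hand for each column type.
    numeric_steps = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    preprocess = ColumnTransformer([
        ("num", numeric_steps, numerical),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
    ])

    pipeline = Pipeline([
        ("preprocess", preprocess),
        ("model", RandomForestRegressor(random_state=0)),
    ])

    # Hyperparameter tuning is also a manual decision.
    search = GridSearchCV(
        pipeline,
        PARAM_GRID,
        cv=5,
        scoring="neg_mean_absolute_percentage_error",
    )
    search.fit(X_train, y_train)
    return search.best_estimator_
```

Every further method in the list above needs its own model class, grid, and often its own preprocessing choices, which is where the additional lines and libraries come from.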

On the other hand, training and prediction with AutoGluon can be implemented within three lines of code:
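The listing itself is missing from this page. A minimal sketch of what such a call sequence might look like with AutoGluon's `TabularPredictor` is given here; the wrapper function, the argument names, and the `"price"` label column are assumptions for illustration, not the authors' code.

```python
def forecast_with_autogluon(train_df, test_df, label="price"):
    """Train on `train_df` and predict `label` for `test_df` (pandas DataFrames).

    The column name "price" is a placeholder for this sketch.
    """
    # Imported lazily so the sketch loads even without AutoGluon installed.
    from autogluon.tabular import TabularPredictor

    # Model selection, preprocessing, and ensembling are handled internally.
    predictor = TabularPredictor(label=label).fit(train_df)
    return predictor.predict(test_df)
```

Without the wrapper, the core is the import, the `fit` call, and the `predict` call.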

The same holds for AutoSklearn
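As the original listing is not available here, the following is only a sketch using the `AutoSklearnRegressor` class of the auto-sklearn package. The wrapper function, its argument names, and the 300-second budget are assumptions for illustration; the features are assumed to be numerically encoded already.

```python
def forecast_with_autosklearn(X_train, y_train, X_test, time_limit_s=300):
    """Fit auto-sklearn on the training data and predict for `X_test`.

    `time_limit_s` is a placeholder budget, not a value from the article.
    """
    # Imported lazily so the sketch loads even without auto-sklearn installed.
    from autosklearn.regression import AutoSklearnRegressor

    automl = AutoSklearnRegressor(time_left_for_this_task=time_limit_s)
    automl.fit(X_train, y_train)
    return automl.predict(X_test)
```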

FLAML
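The original listing is missing from this page. A hedged sketch with FLAML's `AutoML` class could look as follows; the wrapper function, its argument names, and the 60-second time budget are assumptions for illustration only.

```python
def forecast_with_flaml(X_train, y_train, X_test, time_budget_s=60):
    """Fit FLAML on the training data and predict for `X_test`.

    `time_budget_s` is a placeholder budget, not a value from the article.
    """
    # Imported lazily so the sketch loads even without FLAML installed.
    from flaml import AutoML

    automl = AutoML()
    automl.fit(
        X_train=X_train,
        y_train=y_train,
        task="regression",
        time_budget=time_budget_s,  # search time in seconds
    )
    return automl.predict(X_test)
```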

and AutoKeras
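The original listing is not reproduced here. The sketch below assumes an AutoKeras 1.x release, where `StructuredDataRegressor` is available for tabular data; later releases may differ. The wrapper function, its argument names, and the trial count are assumptions for illustration.

```python
def forecast_with_autokeras(x_train, y_train, x_test, max_trials=10):
    """Search for a neural regressor on tabular data and predict for `x_test`.

    Assumes AutoKeras 1.x; `max_trials` is a placeholder value.
    """
    # Imported lazily so the sketch loads even without AutoKeras installed.
    import autokeras as ak

    regressor = ak.StructuredDataRegressor(max_trials=max_trials, overwrite=True)
    regressor.fit(x_train, y_train)
    return regressor.predict(x_test)
```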

Appendix B Search Spaces

See Tables 4, 5, 6.

Table 4 ML and AutoML methods

Table 5 AutoML configuration

Table 6 ML Hyperparameter

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Stühler, H., Klau, D., Zöller, MA. et al. End-to-End Implementation of Automated Price Forecasting Applications. SN COMPUT. SCI. 5, 402 (2024). https://doi.org/10.1007/s42979-024-02735-2

Received: 20 December 2023
Accepted: 10 February 2024
Published: 06 April 2024
DOI: https://doi.org/10.1007/s42979-024-02735-2
