End-to-End Implementation of Automated Price Forecasting Applications

  • Original Research
  • SN Computer Science

Abstract

Forecasting prices of used construction equipment is challenging due to spatial and temporal price fluctuations. Automating this forecasting process using current market data is, therefore, highly desirable. A promising and common strategy is the application of machine learning (ML) techniques. However, small and medium-sized enterprises often struggle with the implementation of ML approaches due to a lack of ML expertise. In response, we demonstrate the potential of substituting manually created ML pipelines with automated machine learning (AutoML) solutions, which autonomously create the underlying pipelines. To this end, we follow the CRISP-DM process to identify tasks requiring ML expertise. First, we dissect the ML pipeline into a machine learning part and a non-machine learning part and use AutoML to automate the former. Subsequently, we also automate the data preprocessing step, which is part of the non-machine learning tasks, to further reduce the dependency on data processing expertise. Additionally, we implement a data-centric result evaluation that rates the reliability of the trained ML models. This approach supports the domain-driven creation of ML pipelines, democratizing the use of ML. To address all complex industrial requirements and showcase the practicality of our approach, we developed an innovative metric called the method evaluation score. This metric encompasses key technical and non-technical parameters essential for domain experts to assess the quality and usability of the generated models. Based on this metric, we demonstrate in our case study that combining domain knowledge with AutoML and automatic preprocessing can reduce the reliance on ML experts for innovative small and medium-sized enterprises keen on adopting such technologies.

Data availability

The data are available in the GitHub repository at https://github.com/AutoQML/End-to-End-Automated-Price-Forecasting.

Notes

  1. The market portals are https://www.mascus.de, https://catused.cat.com, https://www.mobile.de, https://machineryline.de, https://trademachines.de, https://www.truck1.eu, and https://www.truckscout24.de.

  2. See https://github.com/AutoQML/End-to-End-Automated-Price-Forecasting.

  3. The list of hyperparameters is available in Appendix 7.

Acknowledgements

This work was partly funded by the German Federal Ministry of Economic Affairs and Climate Action in the research project AutoQML (Grant no. 01MQ22002).

Author information

Corresponding author

Correspondence to Horst Stühler.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “Recent Trends on Data Science, Technology and Applications” guest edited by Slimane Hammoudi, Alfredo Cuzzocrea and Oleg Gusikhin.

Appendices

Appendix A Example Usage

The manual implementation of the ML methods (Polynomial Regression, Decision Tree, Random Forest, Support Vector Regressor, K-Nearest Neighbor, AdaBoost Regressor, and Multi-Layer Perceptron) requires approximately 50 lines of code (LOC) on average and approximately 13 different libraries:

figure a
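The listing referenced as figure a is not reproduced in this text. As a rough illustration only, and not the authors' code, a manual pipeline for a single method (Random Forest) might look like the following scikit-learn sketch. The function name, the column lists, and the hyperparameter grid are assumptions made for this example; the grid actually used in the study is not shown here.

```python
def manual_random_forest_fit_predict(X_train, y_train, X_test,
                                     numeric_cols, categorical_cols):
    """Hand-written preprocessing + tuning + training for one ML method.

    X_train / X_test are expected to be pandas DataFrames; numeric_cols and
    categorical_cols are hypothetical column-name lists supplied by the caller.
    """
    # scikit-learn is imported lazily so this sketch can be loaded without it.
    from sklearn.compose import ColumnTransformer
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import OneHotEncoder, StandardScaler

    # Manual preprocessing: scale numeric features, one-hot encode categories.
    preprocess = ColumnTransformer([
        ("num", StandardScaler(), numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ])

    # Chain preprocessing and the regressor into one estimator.
    pipeline = Pipeline([
        ("preprocess", preprocess),
        ("model", RandomForestRegressor(random_state=0)),
    ])

    # Illustrative grid only (assumed values, not taken from the article).
    param_grid = {
        "model__n_estimators": [100, 300],
        "model__max_depth": [None, 10, 20],
    }

    # Manual hyperparameter tuning via cross-validated grid search.
    search = GridSearchCV(pipeline, param_grid, cv=5,
                          scoring="neg_mean_absolute_error")
    search.fit(X_train, y_train)

    # GridSearchCV refits the best configuration, so predict uses it directly.
    return search.predict(X_test)
```

Even this single-method sketch already needs five imports and explicit choices for encoding, scaling, and tuning, and each further method would repeat that effort.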

On the other hand, training and prediction with AutoGluon can be implemented in three lines of code:

figure b
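Since the listing behind figure b is not available here, the following is a minimal sketch of what such a three-line AutoGluon workflow could look like, wrapped in a helper function for reuse. The function name and the label column name "price" are assumptions for illustration; the authors' actual listing may differ.

```python
def autogluon_fit_predict(train_df, test_df, label="price"):
    """Train an AutoGluon tabular regressor and predict on test_df.

    train_df must contain the target column `label`; the default name
    "price" is a hypothetical choice for this example.
    """
    # Imported lazily so the sketch can be loaded without AutoGluon installed.
    from autogluon.tabular import TabularPredictor

    # The three essential lines: construct, fit, predict.
    predictor = TabularPredictor(label=label, problem_type="regression")
    predictor.fit(train_df)
    return predictor.predict(test_df)
```

No encoder, scaler, or search grid is specified by the caller; model selection and preprocessing are left to the framework.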

The same holds for AutoSklearn

figure c

Flaml

figure d

and AutoKeras

figure e
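The listings behind figures c, d, and e are likewise not reproduced in this text. The sketch below shows, under stated assumptions, how the same construct, fit, predict pattern could be written for auto-sklearn, FLAML, and AutoKeras. The function names, the 60-second time budget, and the trial count are hypothetical values chosen for illustration. `StructuredDataRegressor` is assumed to be present in the installed AutoKeras version, which should be checked against the release in use.

```python
def autosklearn_fit_predict(X_train, y_train, X_test, time_budget_s=60):
    """auto-sklearn: the time budget (seconds) is an assumed example value."""
    from autosklearn.regression import AutoSklearnRegressor

    model = AutoSklearnRegressor(time_left_for_this_task=time_budget_s)
    model.fit(X_train, y_train)
    return model.predict(X_test)


def flaml_fit_predict(X_train, y_train, X_test, time_budget_s=60):
    """FLAML: task and time budget are passed to fit, not the constructor."""
    from flaml import AutoML

    model = AutoML()
    model.fit(X_train=X_train, y_train=y_train,
              task="regression", time_budget=time_budget_s)
    return model.predict(X_test)


def autokeras_fit_predict(X_train, y_train, X_test, max_trials=3):
    """AutoKeras: max_trials bounds the number of candidate networks tried."""
    import autokeras as ak

    model = ak.StructuredDataRegressor(max_trials=max_trials)
    model.fit(X_train, y_train)
    return model.predict(X_test)
```

In each helper the library-specific part is three statements, which matches the contrast drawn above with the roughly 50 LOC of a manual implementation.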

Appendix B Search Spaces

See Tables 4, 5, 6.

Table 4 ML and AutoML methods
Table 5 AutoML configuration
Table 6 ML Hyperparameter

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Stühler, H., Klau, D., Zöller, MA. et al. End-to-End Implementation of Automated Price Forecasting Applications. SN COMPUT. SCI. 5, 402 (2024). https://doi.org/10.1007/s42979-024-02735-2

  • DOI: https://doi.org/10.1007/s42979-024-02735-2
