Abstract
This paper aims to show the relevance and importance of high-quality datasets for machine learning in the field of economic intelligence. As open-source datasets flourish and algorithms are trained on heterogeneous and often very narrow data, it is important in economic intelligence to train machines on well-curated, high-value data in order to avoid, or reduce as far as possible, false positives, errors, and biases of various kinds. We present as a case study the solution offered by Bureau Van Dijk, where economic data are carefully evaluated and organized, and whose API and REST services could matter in the very near future for data mining and machine learning in economic intelligence.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Franchina, L., Sergiani, F. (2020). High Quality Dataset for Machine Learning in the Business Intelligence Domain. In: Bi, Y., Bhatia, R., Kapoor, S. (eds) Intelligent Systems and Applications. IntelliSys 2019. Advances in Intelligent Systems and Computing, vol 1037. Springer, Cham. https://doi.org/10.1007/978-3-030-29516-5_31
DOI: https://doi.org/10.1007/978-3-030-29516-5_31
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29515-8
Online ISBN: 978-3-030-29516-5
eBook Packages: Intelligent Technologies and Robotics (R0)