Abstract
Over the past decade, several sophisticated analytic techniques such as machine learning, neural networks, and predictive modelling have evolved to enable scientists to derive insights from data. Data Science is characterised by a cycle of model selection, customisation and testing, as scientists often do not know the exact goal or expected results beforehand. Existing research efforts that aim to maximise automation, reproducibility and interoperability are quite mature, but they fail to address a further criterion, usability. The main contribution of this paper is to explore the development of more complex semantic data models, linked with existing ontologies (e.g. FIBO), that enable the standardisation of data formats as well as of the meaning and interpretation of data in automated data analysis. A model-driven architecture, with a reference model that captures statistical learning requirements, is proposed together with a prototype built around a case study in commodity pricing.
Notes
1. The function is multiple linear regression, which is a widely used form in statistical learning.
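The note above names multiple linear regression as the function being fitted. As a rough illustration only, the sketch below fits a model of the form y = b0 + b1*x1 + ... + bp*xp by solving the normal equations in plain Python. The function names (`solve_linear_system`, `fit_multiple_linear_regression`, `predict`) and the small data set are hypothetical and chosen for this example; they are not taken from the paper's prototype or from its commodity pricing data.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] as a fresh copy.
    m = [list(map(float, a[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        # Pick the row with the largest pivot to limit round-off error.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; no unique fit")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_multiple_linear_regression(rows, targets):
    """Least-squares fit of y = b0 + b1*x1 + ... + bp*xp.

    Returns [b0, b1, ..., bp] by solving the normal equations
    (X^T X) b = X^T y, where X has a leading column of ones.
    """
    design = [[1.0] + list(map(float, r)) for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(coefficients, row):
    """Evaluate the fitted linear function for one observation."""
    return coefficients[0] + sum(c * v for c, v in zip(coefficients[1:], row))


# Hypothetical, noise-free data generated from y = 1 + 2*x1 + 3*x2.
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
targets = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]
coefficients = fit_multiple_linear_regression(rows, targets)
```

Because the example data are noise-free, the fitted coefficients should recover the generating intercept and slopes up to floating-point error. For real data, a numerically more robust solver (for instance a QR-based least-squares routine from a numerical library) would usually be preferable to the normal equations.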
Acknowledgements
We are grateful to the ANZ Bank Agribusiness unit, especially Richard Schroder and Felipe Flores, and to Thomson Reuters and IBM, for sponsoring the Hackathon which provided the data for the case study of this paper. We are also grateful to Terry Roach and Max Gillmore for their help with different aspects of this work.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Behnaz, A., Natarajan, A., Rabhi, F.A., Peat, M. (2017). A Semantic-Based Analytics Architecture and Its Application to Commodity Pricing. In: Feuerriegel, S., Neumann, D. (eds) Enterprise Applications, Markets and Services in the Finance Industry. FinanceCom 2016. Lecture Notes in Business Information Processing, vol 276. Springer, Cham. https://doi.org/10.1007/978-3-319-52764-2_2
DOI: https://doi.org/10.1007/978-3-319-52764-2_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-52763-5
Online ISBN: 978-3-319-52764-2
eBook Packages: Business and Management (R0)