Data Science Maturity Model: From Raw Data to Pearl’s Causality Hierarchy

  • Conference paper
Information Systems and Technologies (WorldCIST 2023)

Abstract

Data maturity models are an important and current topic because they allow organizations to plan their medium- and long-term goals. However, most maturity models do not reflect how experimentation is practiced in digital technologies. In the literature, data science appears in connection with Business Intelligence (BI) and Business Analytics (BA). This work presents a new data science maturity model that combines previous models with the emerging concepts of Business Experimentation (BE) and causality. Each level of the model is identified with a specific function, and for each level the relevant techniques are introduced and associated with meaningful wh-questions. We demonstrate the maturity model with two case studies.

Corresponding author

Correspondence to Luís Cavique.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Cavique, L., Pinheiro, P., Mendes, A. (2024). Data Science Maturity Model: From Raw Data to Pearl’s Causality Hierarchy. In: Rocha, A., Adeli, H., Dzemyda, G., Moreira, F., Colla, V. (eds) Information Systems and Technologies. WorldCIST 2023. Lecture Notes in Networks and Systems, vol 801. Springer, Cham. https://doi.org/10.1007/978-3-031-45648-0_32
