Abstract
The verification and validation (V&V) of the data analysis process is critical for establishing the objective correctness of an analytic workflow. Yet the problems, mechanisms, and shortfalls involved in verifying and validating data analysis processes have not been investigated, understood, or well defined by the data analysis community. The processes of verification and validation evaluate the correctness of a logical mechanism, either computational or cognitive. Verification establishes whether the object of the evaluation performs as it was designed to perform (“Does it do the thing right?”). Validation establishes whether the object of the evaluation performs accurately with respect to the real world (“Does it do the right thing?”). Human analysts use computational mechanisms that produce numerical or statistical results to gain an understanding of the real world from which the data came. The results of those computational mechanisms motivate cognitive associations that further drive the data analysis process. The combination of computational and cognitive analytical methods into a workflow defines the data analysis process. V&V is not typically applied to the data analysis process as a whole. Beyond the computational steps, the cognitive assumptions, reasons, and/or mechanisms that connect analytical elements must also be considered and evaluated for correctness. Data Analysis Process Verification and Validation (DAP-V&V) defines a framework and processes that may be applied to identify, structure, and associate logical elements. DAP-V&V is a way of establishing the correctness of individual steps along an analytical workflow and ensuring the integrity of the conceptual associations that are composed into an aggregate analysis.
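The distinction between verification and validation drawn in the abstract can be illustrated with a minimal Python sketch. It is not the chapter's DAP-V&V framework. The analysis step `mean_interval`, the helper names `verify` and `validate`, the input values, and the reference value are all hypothetical and chosen only for illustration. Verification is modeled here as a check against a hand-computed expected result, and validation as a check against an independent real-world reference within a relative tolerance.

```python
from statistics import mean
from typing import Callable, Sequence


def mean_interval(days: Sequence[float]) -> float:
    """Hypothetical analysis step: mean gap between consecutive event days."""
    if len(days) < 2:
        raise ValueError("need at least two events")
    ordered = sorted(days)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return mean(gaps)


def verify(step: Callable[[Sequence[float]], float],
           known_input: Sequence[float],
           expected: float,
           tol: float = 1e-9) -> bool:
    """Verification: does the step perform as designed on a hand-checked case?"""
    return abs(step(known_input) - expected) <= tol


def validate(step: Callable[[Sequence[float]], float],
             observed_input: Sequence[float],
             real_world_reference: float,
             rel_tol: float) -> bool:
    """Validation: does the result agree with an independent real-world value?"""
    result = step(observed_input)
    return abs(result - real_world_reference) <= rel_tol * abs(real_world_reference)


# Hand-computed case: gaps are 10 and 20, so the designed result is 15.
is_verified = verify(mean_interval, [0, 10, 30], 15.0)

# Assumed independent reference of 40 days (illustrative only): the same
# step can pass verification and still fail validation.
is_valid = validate(mean_interval, [0, 10, 30], 40.0, rel_tol=0.1)
```

In this sketch `is_verified` is true while `is_valid` is false, which shows why the two evaluations are kept separate: a step can do the thing right without doing the right thing.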
Notes
- 1. It is important to note that Google Flu Trends is no longer active, having been terminated in 2015 [17].
- 2.
- 3. “Key Concepts of VV and A” Sept. 15, 2006; official DOD pp. 7–8; http://vva.msco.mil/Key/key-prd.pdf.
- 4. Ibid. p. 6.
- 5. Though the purpose for defining an epistemological hierarchy (EH) model of knowledge elements was for evaluating the verification and validation of Human, Social, Cultural, Behavioral (HSCB) models, the mechanism is applicable to any kind of inquiry-based modeling. The prerequisite for an EH decomposition of a model is a Kantian composition of knowledge elements defined as observable concepts and reasoned understanding over those concepts [8].
- 6.
- 7. The R&M data consists of names of particular LRUs diagnosed as “faulty” and dates they were removed from the aircraft.
- 8. The Mission is defined as specific characteristics of how the aircraft is being flown. In this case, the Mission was induced from the data. Eventually, the Data Analytic Process to produce the Mission will require its own pair of hierarchies in order to be properly characterized, verified, and validated.
- 9. Indicators may also be found in the R&M data. For instance, sequences or co-occurrences of LRU removals may be used to predict faults. Analysis for this DAP is not included in this use case.
References
W. Abdullah, R. Reddy, C. Butler, W. Walters, Utilizing Bayesian belief networks to model the ocean-atmosphere interface. J. Miss. Acad. Sci. 63(1), 121–122 (2018)
R. Adcock, D. Collier, Measurement validity: A shared standard for qualitative and quantitative research. Am. Polit. Sci. Rev. 95(03), 529–546 (2001). https://doi.org/10.2307/3118231
A. Bekker, 4 types of Data Analytics to Improve Decision-Making (Science Soft, 2017), [Online]. Available: https://www.scnsoft.com/blog/4-types-of-data-analytics. Accessed 3 Dec 2018
F.C. Copleston, A History of Philosophy (Image Books, Garden City, 1964)
L.J. Cronbach, P. Meehl, Construct validity in psychological tests. Psychol. Bull. 52(4), 281–302 (1955)
J.D. Fearon, D.D. Laitin, Ordinary Language and External Validity: Specifying Concepts in the Study of Ethnicity. LiCEP Meetings (2000), Retrieved from https://web.stanford.edu/group/fearon-research/cgi-bin/wordpress/wp-content/uploads/2013/10/Ordinary-Language-and-External-Validity-Specifying-Concepts-in-the-Study-of-Ethnicity.pdf
T. Harford, Big data: A big mistake? Significance 11(5), 14–19 (2014). https://doi.org/10.1111/j.1740-9713.2014.00778.x
I. Kant, Critique of Pure Reason, 1. paperback ed., 15. print (Cambridge Univ. Press, Cambridge [u.a.], 2009)
A. Kaplan, The Conduct of Inquiry (Chandler, San Francisco, 1964)
A. Kaplan, The conduct of inquiry (Transaction Publishers, 1973). Retrieved from https://books.google.com/books?id=ks8wuZHSKs8C&pg=PA53&lpg=PA53&dq=Abraham+Kaplan%27s+paradox&source=bl&ots=bHV9ptpV3g&sig=8_k3iRGHtuBuIOvAcZSGqLwTTYo&hl=en&sa=X&ved=0ahUKEwjzvrDS777YAhVDRN8KHaxlBA4Q6AEISTAI#v=onepage&q=AbrahamKaplan’s paradox&f=fals
C. Kufs, The five pursuits you meet in statistics. (Stats With Cats Blog, 2010), Retrieved May 10, 2018, from https://statswithcats.wordpress.com/2010/08/22/the-five-pursuits-you-meet-in-statistics/
D. Lazer, R. Kennedy, What we can learn from the epic failure of google flu trends. (WIRED, 2015), Retrieved May 10, 2018, from https://www.wired.com/2015/10/can-learn-epic-failure-google-flu-trends/
I.S. Lustick, M.R. Tubin, Verification as a form of validation: Deepening theory to broaden application of DOD protocols to the social sciences, in Proceedings of the 4th International Conference on Applied Human Factors and Ergonomics, (San Francisco, 2012). Retrieved from http://lustickconsulting.com/data/Verification as a Form of Validation - Lustick, Tubin.pdf
J. Overton, Going Pro in Data Science (O’Reilly Media, Inc, 2012). Retrieved from https://www.oreilly.com/data/free/files/going-pro-in-data-science.pdf?mkt_tok=eyJpIjoiWW1GbU1XSmhNRGMwTkRVdyIsInQiOiJGNlRrSFZnZExYXC9wR0ZOZWZOaWZ1ZHFUZjBFM1RhblFJSHM4VmpibW5udVwvY2FLRVVKVFdsQzlCNnV6ZEQ3NkI3VEg3c09idlhZWU5YNEVCTlIySjM0eCtNRGJnQnpsR1Q0QTFaU
J.A. Paulos, Metric Mania (The New York Times, 2010)
A. Ruvinsky, J. Wedgwood, J. Welsh, Establishing bounds of responsible operational use of social science models via innovations in verification and validation, in 2nd International Conference on Cross-Cultural Decision Making, 2012
F. Sailer, Google Flu Trends is dead – long live Google Trends? (UCL Research Department of Primary Care and Population Health Blog, 2018), Retrieved August 23, 2018, from http://blogs.ucl.ac.uk/pcph-blog/2018/01/23/google-flu-trends-is-dead-long-live-google-trends/
J.D. Stemwedel, Basic concepts: Falsifiable claims. – Adventures in ethics and science. Retrieved August 24, 2018, from http://scienceblogs.com/ethicsandscience/2007/01/31/basic-concepts-falsifiable/
A.G. Stephenson, D.R. Mulville, F.H. Bauer, G.A. Dukeman, P. Norvig, L.S. LaPiana, R. Sackheim, Mars Climate Orbiter Mishap Investigation Board Phase I Report (1999), Retrieved from http://sunnyday.mit.edu/accidents/MCO_report.pdf
K. Vasileva, Common mistakes in data analysis – The Data Nudge – Medium. (2017), Retrieved May 10, 2018, from https://medium.com/the-data-nudge/common-mistakes-in-data-analysis-951e366084b9
VV&A Recommended Practices Guide, (2011). https://vva.msco.mil/Key/key-pr.pdf
Wikipedia contributors, Data analysis. (2018), Retrieved October 5, 2018, from https://en.wikipedia.org/w/index.php?title=Data_analysis&oldid=838877371
N. Yau, Why Context is as Important as the Data Itself (2010)
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Ruvinsky, A. et al. (2019). An Epistemological Model for a Data Analysis Process in Support of Verification and Validation. In: Bossé, É., Rogova, G. (eds) Information Quality in Information Fusion and Decision Making. Information Fusion and Data Science. Springer, Cham. https://doi.org/10.1007/978-3-030-03643-0_16
DOI: https://doi.org/10.1007/978-3-030-03643-0_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03642-3
Online ISBN: 978-3-030-03643-0
eBook Packages: Computer Science, Computer Science (R0)