Domain knowledge and data quality perceptions in genome curation work
Abstract
Purpose
The purpose of this paper is to understand how genomics scientists' perceptions of data quality assurance vary with their domain knowledge.
Design/methodology/approach
The study used a survey to collect responses from 149 genomics scientists, who were grouped by domain knowledge. Participants ranked their top five data quality criteria for hypothetical curation scenarios. The rankings were compared across groups using a χ² test.
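A between-group comparison of this kind can be illustrated with a minimal sketch of a Pearson χ² test of independence. The abstract does not say which variant of the test was used or how the rankings were tabulated, so the layout below (one row per domain group, columns counting whether a given criterion appeared in a respondent's top five) is an assumption, and the counts are invented for illustration and are not the study's data. The function name `chi_square_independence` is likewise hypothetical.

```python
import math


def chi_square_independence(table):
    """Return (statistic, degrees_of_freedom) for a Pearson chi-square
    test of independence on a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed_count in enumerate(row):
            # Expected count if group and ranking choice were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed_count - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts, NOT taken from the study.
# Columns: [ranked the criterion in top five, did not rank it].
observed = [
    [30, 20],  # biologists (assumed group label)
    [25, 25],  # bioinformaticians (assumed group label)
    [15, 35],  # computational scientists (assumed group label)
]

statistic, dof = chi_square_independence(observed)

# With exactly 2 degrees of freedom the chi-square survival function
# has the closed form exp(-x / 2); other cases need a statistics library.
p_value = math.exp(-statistic / 2) if dof == 2 else None

print(f"chi2 = {statistic:.3f}, dof = {dof}, p = {p_value}")
```

A small p-value in such a table would suggest that the share of respondents who rank the criterion highly differs between groups. In practice a statistics library would normally be used for the p-value; the closed form here only covers the two-degrees-of-freedom case.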
Findings
Scientists with domain knowledge of biology, bioinformatics, and computational science did not reach a consensus in ranking data quality criteria. Biologists cared more about curated data being concise and traceable. They were also concerned about skills for dealing with information overload. Computational scientists, on the other hand, valued making curation understandable. They paid more attention to the specific skills needed for data wrangling.
Originality/value
This study takes a new approach to comparing the data quality perceptions of scientists across different domains of knowledge. Few studies have been able to synthesize models to interpret data quality perception across domains. The findings may help in developing data quality assurance policies, designing training seminars, and maximizing the efficiency of genome data management.
Citation
Huang, H. (2015), "Domain knowledge and data quality perceptions in genome curation work", Journal of Documentation, Vol. 71 No. 1, pp. 116-142. https://doi.org/10.1108/JD-08-2013-0104
Publisher
Emerald Group Publishing Limited
Copyright © 2015, Emerald Group Publishing Limited