ABSTRACT
To improve the efficiency of data provenance tracking in ETL, this paper analyzes the common transformations and attribute mappings of ETL, focuses on the key attributes in attribute mapping, and summarizes their characteristics. On this basis it puts forward the concept of a minimal attribute set and designs a data provenance method based on it. During reverse tracking, the method constructs a reverse transformation sequence whose input and output patterns change dynamically and whose number of attributes decreases, which reduces the space and time cost and improves provenance efficiency.
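The reverse tracking idea can be illustrated with a minimal sketch. It is not the method of this paper: it assumes that each ETL step is described only by a mapping from every output attribute to the input attributes it is computed from, and that attributes a step does not mention pass through unchanged. The names `Transformation`, `reverse_attribute_sets`, and the sample attributes are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Transformation:
    """One ETL step, reduced to its attribute mapping (hypothetical model)."""
    name: str
    # output attribute -> input attributes it is computed from
    mapping: dict[str, frozenset[str]]


def reverse_attribute_sets(steps, traced):
    """Walk the ETL steps backwards and return, for each step, the input
    attributes needed to trace the requested output attributes."""
    needed = frozenset(traced)
    result = []
    for step in reversed(steps):
        sources = frozenset()
        for attr in needed:
            # Assumption: an attribute the step does not list passes through unchanged.
            sources |= step.mapping.get(attr, frozenset({attr}))
        result.append((step.name, sources))
        needed = sources
    return result


# Illustrative two-step ETL flow over five source attributes.
steps = [
    Transformation("project", {
        "id": frozenset({"id"}),
        "first": frozenset({"first"}),
        "last": frozenset({"last"}),
        "price": frozenset({"price"}),
        "qty": frozenset({"qty"}),
    }),
    Transformation("derive", {
        "id": frozenset({"id"}),
        "full_name": frozenset({"first", "last"}),
        "total": frozenset({"price", "qty"}),
    }),
]

for name, attrs in reverse_attribute_sets(steps, {"id", "total"}):
    print(name, sorted(attrs))
```

In this sketch, tracing `id` and `total` carries only three of the five source attributes through the reverse sequence, which is the kind of saving in attributes that the abstract attributes to the minimal attribute set.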
Index Terms
- A Minimal Attribute Set-oriented Data Provenance Method