Delta encoding represents a target file in terms of a source file by replacing common substrings with pointer references. Two similar, yet different, models are introduced and investigated in this paper: the Compressed Transitive Delta Encoding (CTDE) and the Compressed Source Delta Encoding (CSDE) paradigms. In these models we are given two delta files, and the goal is to construct a third delta file by working directly on the given compressed forms.
Formally, given a source file $S$ and two differencing files $\Delta(S,T)$ and $\Delta(T,R)$, where $\Delta(X,Y)$ denotes the delta file of the target file $Y$ with respect to the source file $X$, the objective of the CTDE problem is to be able to attain $R$. Unlike the traditional way, which uses $S$ to decompress $\Delta(S,T)$ in order to attain $T$ and then applies $\Delta(T,R)$ on $T$ to obtain $R$, CTDE constructs a delta file $\Delta(S,R)$ working directly on the two given delta files $\Delta(S,T)$ and $\Delta(T,R)$, without any decompression or use of the base file $S$, thus avoiding the storage of the redundant intermediate file $T$. An algorithm for solving CTDE is proposed, and its compression performance is compared to the traditional “double delta decompression”. Not only does it use constant space, as opposed to the linear memory used by the traditional method, but experiments also show that the compression efficiency of the constructed delta file is usually better than that of both $\Delta(S,T)$ and $\Delta(T,R)$.
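The idea of composing two delta files directly can be illustrated with a minimal sketch. It is not the algorithm proposed in the paper. It writes S for the source, T for the intermediate file and R for the final target, and it assumes a hypothetical delta format in which a delta is a list of ("copy", offset, length) pointers into the source and ("add", bytes) literals, with no self-references into the file being built. The function names apply_delta and compose are illustrative only.

```python
from typing import List, Tuple, Union

# Hypothetical delta format, assumed for illustration only:
#   ("copy", offset, length) copies bytes from the source file
#   ("add", literal)         inserts literal bytes
Op = Union[Tuple[str, int, int], Tuple[str, bytes]]
Delta = List[Op]


def apply_delta(source: bytes, delta: Delta) -> bytes:
    """Decode a target file from a source file and a delta."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            _, offset, length = op
            out += source[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)


def compose(delta_st: Delta, delta_tr: Delta) -> Delta:
    """Build a delta of R with respect to S from the deltas S->T and T->R.

    Neither S nor T is read or materialised; only the two deltas are used.
    """
    # Record which range of T each operation of the S->T delta produces.
    spans = []
    pos = 0
    for op in delta_st:
        size = op[2] if op[0] == "copy" else len(op[1])
        spans.append((pos, pos + size, op))
        pos += size

    out: Delta = []
    for op in delta_tr:
        if op[0] == "add":
            out.append(op)  # literals do not depend on T
            continue
        _, offset, length = op
        end = offset + length
        # Rewrite a copy from T as the pieces of the S->T delta it overlaps.
        for t_start, t_end, st_op in spans:
            lo, hi = max(offset, t_start), min(end, t_end)
            if lo >= hi:
                continue
            if st_op[0] == "copy":
                out.append(("copy", st_op[1] + (lo - t_start), hi - lo))
            else:
                out.append(("add", st_op[1][lo - t_start:hi - t_start]))
    return out


if __name__ == "__main__":
    S = b"the quick brown fox"
    d_st = [("copy", 0, 4), ("add", b"slow "), ("copy", 10, 9)]
    d_tr = [("copy", 4, 10), ("add", b" dog"), ("copy", 0, 3)]
    d_sr = compose(d_st, d_tr)
    # Decoding the composed delta matches the "double delta decompression".
    assert apply_delta(S, d_sr) == apply_delta(apply_delta(S, d_st), d_tr)
```

This sketch keeps an index whose size is proportional to the number of operations in the first delta and scans it linearly for every copy, so it does not reproduce the constant-space behaviour claimed for the proposed algorithm. It only shows why the intermediate file need not be stored.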
The CSDE problem likewise deals with a source file and two differencing files, and the goal is still to be able to attain the target file. Although it is not always possible to construct the target file by processing only the two input delta files, empirical experiments show that, on typical real-life data, usually about 99% of the file can be constructed using the proposed algorithm for the CSDE problem.