Abstract
In inverted file systems, queries can be written as Boolean expressions of inverted attributes. In response to a query, the system accesses address lists associated with the attributes in the query, merges them, and selects those records that satisfy the search logic. In this paper we consider the minimization of the CPU time needed for the merging operation. The time can possibly be reduced by taking address lists that occur in several product terms as a common factor of these products. This means that the union operation must be performed before the intersection operation. We present formulas which can be used to decide whether the above method is advantageous. The time can also be reduced by choosing the order of intersection operations so that it takes into consideration the occurrences of the address lists in the products and the lengths of the address lists. For choosing the order of intersection operations we give a heuristic algorithm that minimizes the total time needed for intersections.
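The two ideas in the abstract can be illustrated with a small sketch. It is not the paper's algorithm or its cost formulas: it assumes address lists are sorted Python lists of record addresses, uses plain linear merges, and substitutes a simple shortest-list-first ordering for the heuristic the paper describes. The names `intersect`, `union`, `intersect_shortest_first` and the sample lists `A`, `B`, `C` are hypothetical.

```python
from functools import reduce


def intersect(a, b):
    """Intersect two sorted address lists with a linear merge."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def union(a, b):
    """Union two sorted address lists with a linear merge (no duplicates)."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out


def intersect_shortest_first(lists):
    """Intersect several lists, shortest first (an assumed stand-in heuristic)."""
    if not lists:
        raise ValueError("need at least one address list")
    return reduce(intersect, sorted(lists, key=len))


# Hypothetical address lists for three inverted attributes.
A = [1, 3, 5, 7, 9, 11]
B = [3, 4, 5, 20]
C = [5, 9, 30]

# Query (A AND B) OR (A AND C), evaluated product by product:
unfactored = union(intersect(A, B), intersect(A, C))

# Same query with A taken as a common factor: A AND (B OR C).
# The union is performed before the intersection.
factored = intersect(A, union(B, C))
```

Both evaluations select the same records, but the factored form scans `A` once instead of twice. Whether that saves time depends on the list lengths, which is the question the paper's formulas are meant to answer.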
Cite this article
Putkonen, A. The order of merging operations for queries in inverted file systems. International Journal of Computer and Information Sciences 9, 351–369 (1980). https://doi.org/10.1007/BF00978519
DOI: https://doi.org/10.1007/BF00978519