Abstract
We present a new method for a type of processing required in database management systems. The method efficiently determines the relevance of a given query value to each of many (target) sets of data. By using a new type of data structure, the method allows complete parallelism both for operations on different target sets and for those within each target set. The method never generates a false drop (i.e., it never indicates that an irrelevant target set is relevant to the query) and always identifies all relevant target sets. This eliminates the overhead of reading each selected target set to ensure that the selection was not a false drop. A good deterministic bound on the system's performance is established.
With O(ln N_v + ln ln M) processors, the relevance of any target set can be completely determined in O(1) time against a query consisting of a subset of N_v vocabulary items. The space complexity is O(N_i (ln N_v + ln ln N_v)) bits, where N_i is the number of items relevant to target set i. As a concrete example, for a database using 64 byte keys, having a 100,000 word vocabulary (potentially valid keys) and in which a target set can have up to 64 distinct relevant elements, the relevance of a target set can be determined in 2 parallel operations using 6 processors. In other words, with 64K processors a database of one million target sets can be processed in 184 parallel operations. No probability distribution assumptions are necessary.
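The abstract does not spell out how the figure of 184 parallel operations is obtained. One reading that reproduces it, assuming 64K means 65,536 processors, that each target set occupies a fixed group of 6 processors for 2 operations, and that target sets are handled in successive batches, is sketched below. The function name and the batching model are illustrative assumptions, not taken from the paper.

```python
import math


def parallel_operations(total_processors, target_sets,
                        processors_per_set, ops_per_set):
    """Estimate the parallel operations needed to test every target set.

    Assumed model (not stated in the abstract): target sets are processed
    in batches, each set in a batch gets its own group of processors, and
    every batch costs the same fixed number of operations.
    """
    # How many target sets fit on the machine at once.
    sets_per_batch = total_processors // processors_per_set
    # Number of batches needed to cover all target sets.
    batches = math.ceil(target_sets / sets_per_batch)
    return batches * ops_per_set


# Figures from the abstract: 64K processors, one million target sets,
# 6 processors and 2 parallel operations per target set.
print(parallel_operations(64 * 1024, 1_000_000, 6, 2))  # prints 184
```

Under these assumptions 10,922 target sets fit in one batch, so 92 batches of 2 operations each are needed, which matches the figure quoted above.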
Copyright information
© 1992 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Winters, V.G. (1992). Parallelism for high performance query processing. In: Pirotte, A., Delobel, C., Gottlob, G. (eds) Advances in Database Technology — EDBT '92. EDBT 1992. Lecture Notes in Computer Science, vol 580. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0032441
DOI: https://doi.org/10.1007/BFb0032441
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-55270-3
Online ISBN: 978-3-540-47003-8
eBook Packages: Springer Book Archive