Abstract
We consider the problem of evaluating an expression over sets. The sets are preprocessed and are therefore sorted, and the operators can be any of union, intersection, difference, complement, and symmetric difference (exclusive union). Given the expression as a formula and the sizes of the input sets, we are interested in the worst-case complexity of evaluation (in terms of the sizes of the sets). The problem is motivated by document retrieval in search engines, where a user query translates directly to an expression over the sets containing the user-entered words. Special cases of this problem, in which the expression has a restricted form, have been studied [7,6]. In this paper, we present an efficient algorithm to evaluate the most general form of a set expression. We show a lower bound on this problem for expressions of the form E1 or E1 − E2, where E1 and E2 are expressions with union, intersection, and symmetric difference operators. We demonstrate that the algorithm's complexity matches the lower bound in these instances. Moreover, we conjecture that the algorithm works optimally even when we allow difference and complement operations in E1 and E2.
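To make the problem statement concrete, the following is a minimal sketch of a naive baseline evaluator, not the algorithm presented in the paper. It assumes a hypothetical encoding in which a leaf is a sorted, duplicate-free Python list and an internal node is a tuple `(op, left, right)`; the names `merge`, `evaluate`, and `OPS` are illustrative only. Every operator is evaluated by a linear merge, so the cost is proportional to the total size of the operands at each node, which is exactly the kind of cost a size-aware algorithm would try to improve on. Complement is omitted because it would require an explicit universe.

```python
from typing import Callable, List, Tuple, Union

# Hypothetical encoding (an assumption, not from the paper):
# a leaf is a sorted list of ints, an internal node is (op, left, right).
Expr = Union[List[int], Tuple[str, "Expr", "Expr"]]


def merge(a: List[int], b: List[int],
          keep: Callable[[bool, bool], bool]) -> List[int]:
    """Linear merge of two sorted, duplicate-free lists.

    keep(in_a, in_b) decides whether an element belongs to the result.
    """
    out: List[int] = []
    i = j = 0
    while i < len(a) or j < len(b):
        if j == len(b) or (i < len(a) and a[i] < b[j]):
            x, in_a, in_b = a[i], True, False   # element only in a
            i += 1
        elif i == len(a) or b[j] < a[i]:
            x, in_a, in_b = b[j], False, True   # element only in b
            j += 1
        else:
            x, in_a, in_b = a[i], True, True    # element in both
            i += 1
            j += 1
        if keep(in_a, in_b):
            out.append(x)
    return out


# Each binary operator is a membership predicate on (in_a, in_b).
OPS = {
    "union": lambda x, y: x or y,
    "intersection": lambda x, y: x and y,
    "difference": lambda x, y: x and not y,
    "symdiff": lambda x, y: x != y,
}


def evaluate(expr: Expr) -> List[int]:
    """Evaluate an expression tree bottom-up; the result is a sorted list."""
    if isinstance(expr, list):
        return expr
    op, left, right = expr
    return merge(evaluate(left), evaluate(right), OPS[op])


A, B, C = [1, 3, 5, 7], [3, 4, 5], [5, 6]
result = evaluate(("difference", ("union", A, B), C))
```

Here `result` corresponds to an expression of the form E1 − E2 with E1 = A ∪ B and E2 = C. The sketch ignores the relative sizes of the operands, whereas the abstract's complexity measure depends on them.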
References
Barbay, J., Kenyon, C.: Adaptive intersection and t-threshold problems. In: SODA, pp. 390–399 (2002)
Bille, P., Pagh, A., Pagh, R.: Fast evaluation of union-intersection expressions. In: Tokuyama, T. (ed.) ISAAC 2007. LNCS, vol. 4835, pp. 739–750. Springer, Heidelberg (2007)
Brin, S., Page, L.: The anatomy of a large-scale hypertextual Web search engine. Computer Networks and ISDN Systems 30(1–7), 107–117 (1998)
Brown, M.R., Tarjan, R.E.: A fast merging algorithm. J. ACM 26(2), 211–226 (1979)
Brown, M.R., Tarjan, R.E.: Design and analysis of a data structure for representing sorted lists. SIAM Journal on Computing 9(3), 594–614 (1980)
Chiniforooshan, E., Farzan, A., Mirzazadeh, M.: Worst case optimal union-intersection expression evaluation. In: Caires, L., Italiano, G.F., Monteiro, L., Palamidessi, C., Yung, M. (eds.) ICALP 2005. LNCS, vol. 3580, pp. 179–190. Springer, Heidelberg (2005)
Demaine, E.D., López-Ortiz, A., Munro, J.I.: Adaptive set intersections, unions, and differences. In: SODA 2000: Proceedings of the eleventh annual ACM-SIAM symposium on Discrete algorithms, pp. 743–752. Society for Industrial and Applied Mathematics, Philadelphia (2000)
Harel, D., Tarjan, R.E.: Fast algorithms for finding nearest common ancestors. SIAM J. Comput. 13(2), 338–355 (1984)
Hwang, F.K., Lin, S.: A simple algorithm for merging two disjoint linearly ordered sets. SIAM Journal on Computing 1(1), 31–39 (1972)
Lee, G., Park, M., Won, H.: Using syntactic information in handling natural language queries for extended boolean retrieval model (1999)
Mauldin, M.: Lycos: design choices in an internet search service. IEEE Expert 12(1), 8–11 (1997)
Mirzazadeh, M.: Adaptive comparison-based algorithms for evaluating set queries (2004)
Pugh, W.: A skip list cookbook. Tech. rep., University of Maryland at College Park, College Park, MD, USA (1990)
Witten, I.H., Moffat, A., Bell, T.C.: Managing Gigabytes: Compressing and Indexing Documents and Images. Morgan Kaufmann Publishers, San Francisco (1999)
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chiniforooshan, E., Farzan, A., Mirzazadeh, M. (2008). Evaluation of General Set Expressions. In: Hong, SH., Nagamochi, H., Fukunaga, T. (eds) Algorithms and Computation. ISAAC 2008. Lecture Notes in Computer Science, vol 5369. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-92182-0_34
DOI: https://doi.org/10.1007/978-3-540-92182-0_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-92181-3
Online ISBN: 978-3-540-92182-0
eBook Packages: Computer Science, Computer Science (R0)