Skip to main content
Log in

O2iJoin: An Efficient Index-Based Algorithm for Overlap Interval Join

  • Regular Paper
  • Published:
Journal of Computer Science and Technology Aims and scope Submit manuscript

Abstract

Time intervals are often associated with tuples to represent their valid time in temporal relations, where overlap join is crucial for various kinds of queries. Many existing overlap join algorithms use indices based on tree structures such as quad-tree, B+-tree and interval tree. These algorithms usually have high CPU cost since deep path traversals are unavoidable, which makes them not so competitive as data-partition or plane-sweep based algorithms. This paper proposes an efficient overlap join algorithm based on a new two-layer flat index named as Overlap Interval Inverted Index (i.e., O2i Index). It uses an array to record the end points of intervals and approximates the nesting structures of intervals via two functions in the first layer, and the second layer uses inverted lists to trace all intervals satisfying the approximated nesting structures. With the help of the new index, the join algorithm only visits the must-be-scanned lists and skips all others. Analyses and experiments on both real and synthetic datasets show that the proposed algorithm is as competitive as the state-of-the-art algorithms.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Similar content being viewed by others

References

  1. Cafagna F, Böhlen M H. Disjoint interval partitioning. The VLDB Journal, 2017, 26(3): 447-466.

    Article  Google Scholar 

  2. Bouros P, Mamoulis N. A forward scan based plane sweep algorithm for parallel interval joins. Proceedings of the VLDB Endowment, 2017, 10(11): 1346-1357.

    Article  Google Scholar 

  3. Piatov D, Helmer S, Dignös A. An interval join optimized for modern hardware. In Proc. the 32nd IEEE International Conference on Data Engineering, May 2016, pp.1098-1109.

  4. Dignös A, Böhlen M H, Gamper J. Overlap interval partition join. In Proc. the 2014 ACM SIGMOD International Conference on Management of Data, June 2014, pp.1459-1470.

  5. Gao D, Jensen C S, Snodgrass R T, Soo M. Join operations in temporal databases. The VLDB Journal, 2005, 14(1): 2-29.

    Article  Google Scholar 

  6. Elmasri R, Wuu G T J, Kim Y J. The time index: An access structure for temporal data. In Proc. the 16th International Conference on Very Large Databases, Aug. 1990, pp.1-12.

  7. Son D, Elmasri R. Efficient temporal join processing using time index. In Proc. the 8th International Conference on Scientific and Statistical Data Base Management, Jun. 1996, pp.252-261.

  8. Soo M D, Snodgrass R T, Jensen C S. Efficient evaluation of the valid-time natural join. In Proc. the 10th International Conference on Data Engineering, Feb. 1994, pp.282-292.

  9. Sitzmann I, Stuckey P J. Improving temporal joins using histograms. In Proc. the 11th International Conference on Database and Expert Systems Applications, Sept. 2000, pp.488-498.

  10. Gunadhi H, Segev A. Query processing algorithms for temporal intersection joins. In Proc. the 7th International Conference on Data Engineering, April 1991, pp.336-344.

  11. Rana S P, Fotouhi F. Efficient processing of time-joins in temporal data bases. In Proc. the 3rd International Conference on Database Systems for Advanced Applications, April 1993, pp.427-432.

  12. Pfser D, Jensen S J. Incremental join of time-oriented data. In Proc. the 11th International Conference on Scientific and Statistical Data Base Management, Jul. 1999, pp.232-243.

  13. Zhang C, Naughton J, DeWitt D, Luo Q, Lohman G. On supporting containment queries in relational database management systems. In Proc. the 2001 ACM SIGMOD International Conference on Management of Data, May 2001, pp.425-436.

  14. Zhang D, Tsotras V J, Seeger B. Efficient temporal join processing using indices. In Proc. the 18th International Conference on Data Engineering, Feb. 2002, pp.103-113.

  15. Kriegel H P, Pötke M, Seidl T. Managing intervals efficiently in object-relational databases. In Proc. the 26th International Conference on Very Large Data Bases, Sept. 2000, pp.407-418.

  16. Tsotras V J, Kangelaris N. The snapshot index: An I/O-optimal access method for timeslice queries. Information Systems, 1995, 20(3): 237-260.

    Article  Google Scholar 

  17. Beckmann N, Seeger B. A revised R -tree in comparison with related index structures. In Proc. the 2009 ACM SIGMOD International Conference on Management of Data, June 2009, pp.799-812.

  18. Ozsoyoglu G, Snodgrass R T. Temporal and real-time databases: A survey. IEEE Transactions on Knowledge and Data Engineering, 1995, 7(4): 513-532.

    Article  Google Scholar 

  19. Beckmann N, Kriegel H P, Schneider R, Seeger B. The R -tree: An efficient and robust access method for points and rectangles. In Proc. the 1990 ACM SIGMOD International Conference on Management of Data, May 1990, pp.322-331.

  20. Koudas N, Sevcik K C. Size separation spatial join. In Proc. the 1997 ACM SIGMOD International Conference on Management of Data, May 1997, pp.324-335.

  21. Nobari S, Tauheed F, Heinis T, Karras P, Bressan S, Ailamaki A. TOUCH: In-memory spatial join by hierarchical data-oriented partitioning. In Proc. the 2013 ACM SIGMOD International Conference on Management of Data, Jun. 2013, pp.701-712.

  22. Leung T Y C, Muntz R. Stream processing: Temporal query processing and optimization. In Temporal Databases: Theory, Design, and Implementation, Tassel A U, Clifford J, Gadia S et al. (eds.), Benjamin-Cummings, 1993, pp.329–355.

  23. Lu H, Ooi B C, Tan K L. On spatially partitioned temporal join. In Proc. the 20th International Conference on Very Large Data Bases, Sept. 1994, pp.546-557.

  24. Shen H, Ooi B C, Lu H. The TP-index: A dynamic and efficient indexing mechanism for temporal databases. In Proc. the 10th Int. Conf. Data Engineering, Feb. 1994, pp.274-281.

  25. Arge L, Procopiuc O, Ramaswamy S et al. Scalable sweeping-based spatial join. In Proc. the 24th International Conference on Very Large Data Bases, Aug. 1998, pp.570-581.

  26. Lo M L, Ravishankar C V. Spatial hash-joins. In Proc. the 1996 ACM SIGMOD International Conference on Management of Data, June 1996, pp.247-258.

  27. Patel J M, Dewitt D J. Partition based spatial-merge join. In Proc. the 1996 ACM SIGMOD International Conference on Management of Data, Jun. 1996, pp.259-270.

  28. Samet H. Foundation of Multidimensional and Metric Data Structures (1st edition). Morgan Kaufmann, 2006.

  29. Enderle J, Hampel M, Seidl T. Joining interval data in relational databases. In Proc. the 2004 ACM SIGMOD International Conference on Management of Data, Jun. 2004, pp.683-694.

  30. Zhang D, Tsotras V J. Index based processing of semi-restrictive temporal joins. In Proc. the 9th International Symposium on Temporal Representation and Reasoning, July 2002, pp.70-77.

  31. Luo J Z, Shi S F, Wang H Z, Li J Z. FrepJoin: An efficient partition-based algorithm for edit similarity join. Frontiers of Information Technology & Electronic Engineering, 2017, 18(10): 1499-1510.

    Article  Google Scholar 

  32. Wang C P, Wang C K, Wang H, Chen J, Ye X J. Preference join on heterogeneous data. In Proc. the 18th Asia Pacific Web Conference, Sept. 2016, pp.317-329.

  33. Khayyat Z, Lucia W, Singh M, Ouzzani M, Papotti P, Quiané-Ruiz J A, Tang N, Kalnis P. Fast and scalable inequality joins. The VLDB Journal, 2017, 26(1): 125–150.

    Article  Google Scholar 

Download references

Acknowledgment

We would like to appreciate Zeng-Long Wang, a Master student of Ji-Zhou Luo, for his work on early experiments.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Ji-Zhou Luo.

Electronic supplementary material

Below is the link to the electronic supplementary material.

ESM 1

(PDF 784 kb)

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Luo, JZ., Shi, SF., Yang, G. et al. O2iJoin: An Efficient Index-Based Algorithm for Overlap Interval Join. J. Comput. Sci. Technol. 33, 1023–1038 (2018). https://doi.org/10.1007/s11390-018-1872-x

Download citation

  • Received:

  • Revised:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11390-018-1872-x

Keywords

Navigation