Master-detail clustering using merged indexes

Graefe, Goetz

doi:10.1007/s00450-007-0022-4

Master-detail clustering using merged indexes

Themenheft Datenbanksysteme
Published: 08 May 2007

Volume 21, pages 127–145, (2007)
Cite this article

Informatik - Forschung und Entwicklung

Goetz Graefe¹

78 Accesses
7 Citations
Explore all metrics

Abstract

Merged indexes are B-trees that contain multiple traditional indexes and interleave their records based on a common sort order. In relational databases, merged indexes implement ‘‘master-detail clustering’’ of related records, e.g., orders and order details. Thus, merged indexes shift de-normalization from the logical level of tables and rows to the physical level of indexes and records, which is a much more appropriate place for it. For object-oriented applications, clustering can reduce the I/O cost for joining rows in related tables to a fraction compared to traditional indexes, with additional beneficial effects on buffer pool requirements.

Prior research has covered merged indexes without providing much guidance for their implementation. Enabling the design proposed here is a strict separation of B-tree and index into two layers of abstraction. In addition, this paper provides algorithms (i) for data layout including bitmap indexes and column stores, (ii) for concurrency control and recovery including locking of individual keys and of complex objects, (iii) for update operations including bulk insertions, bulk deletions, and deferred index maintenance, (iv) for adding and removing individual indexes within a merged index, (v) for enforcement of relational integrity constraints from uniqueness constraints to foreign key constraints, and (vi) for query processing including caching in query execution plans dominated by nested iteration and index navigation.

In the proposed design for merged indexes, the set of tables, views, and indexes can evolve without restriction. The set of clustering columns can also evolve freely. A relational query processor can search and update index records just as in traditional indexes. With these abilities, the proposed design may finally bring general master-detail clustering to traditional databases together with its advantages in performance and cost.

Zusammenfassung

Merged Indexes (eine Art physischer Sammelindex) sind spezielle B-Bäume, die mehrere herkömmliche Indexe vereinigen und ihre Einträge in einer gemeinsamen Sortierreihenfolge speichern. In relationalen Datenbanken implementieren Merged Indexes eine ,,hierarchische Clusterbildung‘‘ (master-detail clustering) zusammengehöriger Sätze, beispielsweise von Bestellungen und Bestellposten. Damit verlagern Merged Indexes die Denormalisierung von der logischen Ebene der Tabellen und Zeilen auf die physische Ebene der Indexe und Sätze, was eine viel besser geeignete Stelle dafür ist. Im Vergleich zum Einsatz herkömmlicher Indexe können bei objektorientierten Anwendungen durch Clusterbildung die E/A-Kosten für den Verbund in Beziehung stehender Tabellen auf einen Bruchteil gesenkt werden, wobei zusätzlich Spareffekte bei der Puffernutzung erzielt werden.

In früheren Publikationen werden Merged Indexes ohne spezielle Hinweise zu ihrer Implementierung erwähnt. Unsere Vorgehensweise bei ihrer Realisierung sieht eine strikte Trennung von B-Baum und Index in zwei Abstraktionsebenen vor. Zusätzlich liefert unser Beitrag Algorithmen (i) für die Datenabbildung einschließlich der Bitlisten-Indexe und der spaltenorientierten Speicherung, (ii) für Mehrbenutzersynchronisation und Recovery einschließlich spezieller Sperren für einzelne Schlüsselwerte und komplexe Objekte, (iii) für Aktualisierungsoperationen einschließlich Einfügungen und Löschungen großer Datenmengen, (iv) für das Hinzufügen und Entfernen einzelner Indexe innerhalb eines Merged Index, (v) zur Kontrolle relationaler Integritätsbedingungen wie Unique- und Fremdschlüsselbedingungen und (vi) für die Anfrageverarbeitung einschließlich Caching-Vorkehrungen bei Anfrageausführungsplänen mit hohem Anteil an geschachtelter Iteration und Index-Navigation.

In unserem Entwurf für Merged Indexes kann die Menge der beteiligten Tabellen, Sichten und Indexe unbeschränkt erweitert werden. Auch die Menge der Attribute (Spalten) mit Clusterbildung unterliegt keinen Beschränkungen. Ein relationaler Anfrageprozessor kann Indexeinträge genauso wie in herkömmlichen Indexen suchen und modifizieren. Diese Eigenschaften unseres Entwurfs ermöglichen es schließlich, das Konzept der allgemeinen hierarchischen Clusterbildung in herkömmliche Datenbanken zu integrieren und dabei ihre Vorteile hinsichtlich Leistung und Kosten zu erhalten.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

References

Abadi DJ (2007) Column stores for wide and sparse data. CIDR 2007
Agarwal S, Blakeley JA, Casey T, Delaney K, Galindo-Legaria CA, Graefe G, Rys M, Zwilling MJ (2001) Microsoft SQL Server. Chapter 27, p 969–1006. In: Silberschatz A, Korth HF, Sudarshan S (eds) Database System Concepts 4th Edn. McGraw-Hill Higher Education
Bayer R, Unterauer K (1977) Prefix B-Trees. ACM TODS 2(1):11–26
Article Google Scholar
Fernandez PM (1994) Red Brick Warehouse: A Read-Mostly RDBMS for Open SMP Platforms. SIGMOD 1994, p 492
Graefe G (2003) Sorting and Indexing with Partitioned B-Trees. CIDR 2003
Graefe G (2003) Executing Nested Queries. BTW Conf. 2003, pp 58–77
Graefe G (2006) B-tree indexes, interpolation search, and skew. DaMoN 2006, p 5
Graefe G (2007) Hierarchical locking in B-tree Indexes. BTW Conf. 2007
Graefe G (2007) Algorithms for merged indexes. BTW Conf. 2007
Graefe G: Surrogate compression for columnar storage. To appear in ACM SIGMOD Record
Graefe G, Bunker R, Cooper S (1998) Hash joins and hash teams in Microsoft SQL Server. VLDB 1998, pp 86–97
Gray J, Graefe G (1997) The Five-Minute Rule ten years later, and other computer storage rules of thumb. SIGMOD Record 26(4):63–68
Article Google Scholar
Gottemukkala V, Lehman TJ (1992) Locking and Latching in a Memory-Resident Database System. VLDB 1992, pp 533–544
Graefe G, Larson P (2001) B-tree indexes and CPU caches. ICDE 2001, pp 349–358
Gray J, Lorie RA, Putzolu GR, Traiger IL (1975) Granularity of Locks in a Large Shared Data Base. VLDB 1975, pp 428–451
Gassner P, Lohman GM, Schiefer KB, Wang Y (1993) Query Optimization in the IBM DB2 Family. IEEE Data Eng Bull 16(4):4–18
Google Scholar
Gray J, Reuter A (1993) Transaction Processing: Concepts and Techniques. Morgan-Kaufman, San Mateo, CA
MATH Google Scholar
Graefe G, Shapiro LD (1991) Data compression and database performance. ACM/IEEE-CS Symp. on Applied Computing
Galindo-Legaria CA, Stefani S, Waas F (2004) Query Processing for SQL Updates. SIGMOD 2004:844–849
Graefe G, Zwilling MJ (2004) Transaction support for indexed views. SIGMOD 2004:323–334
Härder T (1978) Implementing a Generalized Access Path Structure for a Relational Database System. ACM TODS 3(3):285–298
Article Google Scholar
Hamilton J (2007) An architecture for modular data centers. CIDR 2007
Harizopoulos S, Liang V, Abadi DJ, Madden S (2006) Performance Tradeoffs in Read-Optimized Databases. VLDB 2006, pp 487–498
Härder T, Reuter A (1983) Principles of Transaction-Oriented Database Recovery. ACM Comput Surv 15(4):287–317
Article Google Scholar
Härder T, Reinert J (1996) Access Path Support for Referential Integrity in SQL2. VLDB J 5(3):196–214
Article Google Scholar
Hsu WW, Smith AJ (2004) The performance impact of I/O optimizations and disk improvements. IBM J Res Dev 48(2):255–289
Article Google Scholar
Joshi AM (1991) Adaptive Locking Strategies in a Multi-node Data Sharing Environment. VLDB 1991, pp 181–191
Korth HF (1983) Locking Primitives in a Database System. J ACM 30(1):55–79
Article MATH MathSciNet Google Scholar
Lomet DB (1993) Key Range Locking Strategies for Improved Concurrency. VLDB 1993, pp 655–664
Lomet DB (2001) The Evolution of Effective B-tree: Page Organization and Techniques: A Personal Account. SIGMOD Record 30(3):64–69
Article Google Scholar
Lehman TJ, Carey MJ (1989) A Concurrency Control Algorithm for Memory-Resident Database Systems. FODO 1989, pp 490–504
Leslie H, Jain R, Birdsall D, Yaghmai H (1995) Efficient Search of Multi-Dimensional B-Trees. VLDB 1995, pp 710–719
Lomet DB, Tuttle MR (1995) Redo Recovery after System Crashes. VLDB 1995, pp 457–468
Mohan C, ARIES/KVL: A Key-Value Locking Method for Concurrency Control of Multiaction Transactions Operating on B-Tree Indexes. VLDB 1990, pp 392–405
McJones PR (1997) The 1995 SQL Reunion: People, Project, and Politics, May 29, 1995. Digital System Research Center Report SRC1997-018
Mohan C, Levine F (1992) ARIES/IM: An Efficient and High Concurrency Index Management Method Using Write-Ahead Logging. SIGMOD 1992, pp 371–380
Mohan C, Narang I (1992) Algorithms for Creating Indexes for Very Large Tables without Quiescing Updates. SIGMOD Conf. 1992, pp 361–370
O’Neil PE (1986) The Escrow Transactional Method. ACM TODS 11(4):405–430
Article Google Scholar
O’Neil PE, Graefe G (1995) Multi-table joins through bitmapped join indices. SIGMOD Record 24(3):8–11
Article Google Scholar
Orenstein JA, Merrett TH (1984) A Class of Data Structures for Associative Searching. PODS 1984, pp 181–190
Pöss M, Potapov D (2003) Data Compression in Oracle. VLDB 2003, pp 937–947
Ramsak F, Markl V, Fenk R, Zirkel M, Elhardt K, Bayer R (2000) Integrating the UB-Tree into a Database System Kernel. VLDB 2000, pp 263–272
Short-stroking. Storage magazine. http://storagemagazine.techtarget.com, July 2005
Stonebraker M, Abadi DJ, Batkin A, Chen X, Cherniack M, Ferreira M, Lau E, Lin A, Madden S, O’Neil EJ, O’Neil PE, Rasin A, Tran N, Zdonik SB (2005) C-Store: A Column-oriented DBMS. VLDB 2005, pp 553–564
Severance DG, Lohman GM (1976) Differential Files: Their Application to the Maintenance of Large Databases. ACM TODS 1(3):256–267
Article Google Scholar
Tsatalos OG, Solomon MH, Ioannidis YE (1996) VLDB J 5(2):101–118
Article Google Scholar
Valduriez P (1987) Join Indices. ACM TODS 12(2):218–246
Article Google Scholar
Wu K, Otoo EJ, Shoshani A (2006) Optimizing bitmap indices with efficient compression. ACM TODS 31(1):1–38
Article Google Scholar

Download references

Author information

Authors and Affiliations

HP Labs, 1501 Page Mill Rd, 94304, Palo Alto, CA, USA
Goetz Graefe

Authors

Goetz Graefe
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Goetz Graefe.

Additional information

CR subject classification

H2.2; H2.4; H3.2; H3.3

Rights and permissions

Reprints and permissions

About this article

Cite this article

Graefe, G. Master-detail clustering using merged indexes . Informatik Forsch. Entw. 21, 127–145 (2007). https://doi.org/10.1007/s00450-007-0022-4

Download citation

Received: 10 October 2006
Accepted: 09 March 2007
Published: 08 May 2007
Issue Date: June 2007
DOI: https://doi.org/10.1007/s00450-007-0022-4

Keywords

Schlagworte

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Master-detail clustering using merged indexes

Abstract

Zusammenfassung

Access this article

Similar content being viewed by others

Query Processing of Pre-partitioned Data Using Sandwich Operators

Flower Master Index for Relational Database Selection and Joining

Orthogonal Key-Value Locking

References

Author information

Authors and Affiliations

Corresponding author

Additional information

CR subject classification

Rights and permissions

About this article

Cite this article

Keywords

Schlagworte

Navigation

Master-detail clustering using merged indexes

Abstract

Zusammenfassung

Access this article

Similar content being viewed by others

Query Processing of Pre-partitioned Data Using Sandwich Operators

Flower Master Index for Relational Database Selection and Joining

Orthogonal Key-Value Locking

References

Author information

Authors and Affiliations

Corresponding author

Additional information

CR subject classification

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Schlagworte

Search

Navigation