ABSTRACT
The performance of main-memory index structures is increasingly determined by the number of CPU cache misses incurred when traversing the index. When keys are stored indirectly, as is standard in main-memory databases, the cost of key retrieval in terms of cache misses can dominate the cost of an index traversal. Yet it is inefficient in both time and space to store even moderately sized keys directly in index nodes. In this paper, we investigate the performance of tree structures suitable for OLTP workloads in the face of expensive cache misses and non-trivial key sizes. We propose two index structures, pkT-trees and pkB-trees, which significantly reduce cache misses by storing partial-key information in the index. We show that a small, fixed amount of key information avoids most cache misses, which permits a simple node structure and an efficient implementation. Finally, we study the performance and cache behavior of partial-key trees by comparing them with other main-memory tree structures for a wide variety of key sizes and key value distributions.
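The general idea of keeping a small, fixed amount of key information next to each index entry can be illustrated with a minimal sketch. This is not the pkT-tree or pkB-tree design: the abstract does not describe the partial-key encoding, so the sketch assumes a plain fixed-length key prefix as a stand-in and uses a sorted array with binary search instead of a tree. The names `PARTIAL_LEN`, `Entry`, `PartialKeyIndex`, and the `derefs` counter are hypothetical; `derefs` counts how often the full, indirectly stored key must be fetched, which is the access that would risk an extra cache miss.

```python
PARTIAL_LEN = 4  # assumed size of the partial key, in bytes


class Entry:
    """Index entry: a fixed-size partial key plus a reference to the full record."""

    def __init__(self, key: bytes, record_id: int):
        self.partial = key[:PARTIAL_LEN]  # stand-in for real partial-key information
        self.record_id = record_id        # position of the full key in the record store


class PartialKeyIndex:
    def __init__(self, keys):
        self.records = sorted(keys)  # full keys, stored outside the index entries
        self.entries = [Entry(k, i) for i, k in enumerate(self.records)]
        self.derefs = 0              # number of full-key fetches performed

    def _compare(self, search_key: bytes, entry: Entry) -> int:
        """Return -1, 0, or 1 for search_key relative to the entry's full key."""
        sp = search_key[:PARTIAL_LEN]
        if sp != entry.partial:
            # The partial keys alone decide the comparison: no record access.
            return -1 if sp < entry.partial else 1
        # Partial keys tie, so the full key has to be fetched.
        self.derefs += 1
        full = self.records[entry.record_id]
        return (search_key > full) - (search_key < full)

    def search(self, key: bytes):
        """Binary search; returns the record id of key, or None if absent."""
        lo, hi = 0, len(self.entries) - 1
        while lo <= hi:
            mid = (lo + hi) // 2
            c = self._compare(key, self.entries[mid])
            if c == 0:
                return self.entries[mid].record_id
            if c < 0:
                hi = mid - 1
            else:
                lo = mid + 1
        return None
```

In this simplified form, a lookup touches the full record only when the stored prefix cannot decide a comparison. Keys that share long common prefixes would defeat a plain prefix, which is one reason a real design needs a more careful choice of partial-key information.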
Index Terms
- Main-memory index structures with fixed-size partial keys