research-article

An MPSoC for energy-efficient database query processing

DAC '16: Proceedings of the 53rd Annual Design Automation Conference

Article No.: 112, Pages 1 - 6

https://doi.org/10.1145/2897937.2897986

Published: 05 June 2016

Abstract

This paper presents a heterogeneous database hardware accelerator MPSoC manufactured in 28 nm SLP CMOS. The 18 mm² chip integrates a runtime task scheduling unit for energy-efficient query processing and hierarchical power management supported by ultra-fast dynamic voltage and frequency scaling (DVFS). Four processing elements, connected by a star-mesh network-on-chip, are accelerated by an instruction set extension tailored to fundamental data-intensive applications. We evaluate the MPSoC with typical database benchmarks, focusing on scans and bitmap operations. When the processing elements operate on data stored in local memories, the chip consumes 250 mW and shows a 96x improvement in energy efficiency compared to state-of-the-art platforms.
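For readers unfamiliar with the benchmark classes named in the abstract, the following is a minimal software sketch of a column scan that produces a bitmap, followed by a bitmap AND that combines two predicates. It is an illustration only and does not reflect the chip's instruction set extension or its scheduling; the function names (`scan_to_bitmap`, `bitmap_to_rows`) and the sample columns are assumptions made for this example.

```python
# Illustrative sketch only: plain-Python scan and bitmap operations.
# All names and data below are hypothetical, not taken from the paper.

def scan_to_bitmap(column, predicate):
    """Scan a column; bit i of the result is set when predicate(column[i]) holds."""
    bitmap = 0
    for i, value in enumerate(column):
        if predicate(value):
            bitmap |= 1 << i
    return bitmap


def bitmap_to_rows(bitmap):
    """Return the row indices whose bits are set, in ascending order."""
    rows = []
    i = 0
    while bitmap:
        if bitmap & 1:
            rows.append(i)
        bitmap >>= 1
        i += 1
    return rows


# Two hypothetical columns of the same table.
price = [5, 12, 7, 30, 18]
qty = [1, 4, 9, 2, 6]

cheap = scan_to_bitmap(price, lambda p: p < 20)  # rows 0, 1, 2, 4
bulk = scan_to_bitmap(qty, lambda q: q >= 4)     # rows 1, 2, 4

# A conjunctive predicate becomes a single bitwise AND of the two bitmaps.
both = cheap & bulk
matching_rows = bitmap_to_rows(both)
print(matching_rows)  # [1, 2, 4]
```

The point of the bitmap representation is that combining predicates reduces to word-wide bitwise operations, which is the kind of data-intensive primitive that specialized instructions can target.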





Published In


DAC '16: Proceedings of the 53rd Annual Design Automation Conference

June 2016

1048 pages

ISBN:9781450342360

DOI:10.1145/2897937

Copyright © 2016 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Publisher

Association for Computing Machinery

New York, NY, United States


Conference

DAC '16

DAC '16: The 53rd Annual Design Automation Conference 2016

June 5 - 9, 2016

Austin, Texas

Acceptance Rates

Overall Acceptance Rate 1,770 of 5,499 submissions, 32%


Cited By

Guo B, Wu J, Pu Y, Zhang J, Yu J (2024). Energy consumption estimation and profiling for queries in distributed database systems based on a bottom-up comprehensive energy model. Future Generation Computer Systems, 159:C, pages 379-394. Online publication date: 1-Oct-2024.
https://dl.acm.org/doi/10.1016/j.future.2024.04.059
Guo B, Yu J, Yang D, Leng H, Liao B (2022). Energy-Efficient Database Systems: A Systematic Survey. ACM Computing Surveys, 55:6, pages 1-53. Online publication date: 7-Dec-2022.
https://dl.acm.org/doi/10.1145/3538225
Budhkar P, Absalyamov I, Zois V, Windh S, Najjar W, Tsotras V (2019). Accelerating In-Memory Database Selections Using Latency Masking Hardware Threads. ACM Transactions on Architecture and Code Optimization, 16:2, pages 1-28. Online publication date: 9-Apr-2019.
https://dl.acm.org/doi/10.1145/3310229
Hoppner S, Vogginger B, Yan Y, Dixius A, Scholze S, Partzsch J, Neumarker F, Hartmann S, Schiefer S, Ellguth G, Cederstroem L, Plana L, Garside J, Furber S, Mayr C (2019). Dynamic Power Management for Neuromorphic Many-Core Systems. IEEE Transactions on Circuits and Systems I: Regular Papers, 66:8, pages 2973-2986. Online publication date: Aug-2019.
https://doi.org/10.1109/TCSI.2019.2911898
Yan Y, Kappel D, Neumarker F, Partzsch J, Vogginger B, Hoppner S, Furber S, Maass W, Legenstein R, Mayr C (2019). Efficient Reward-Based Structural Plasticity on a SpiNNaker 2 Prototype. IEEE Transactions on Biomedical Circuits and Systems, 13:3, pages 579-591. Online publication date: Jun-2019.
https://doi.org/10.1109/TBCAS.2019.2906401
Balkesen C, Kunal N, Giannikis G, Fender P, Sundara S, Schmidt F, Wen J, Agrawal S, Raghavan A, Varadarajan V, Viswanathan A, Chandrasekaran B, Idicula S, Agarwal N, Sedlar E, Das G, Jermaine C, Bernstein P (2018). RAPID. Proceedings of the 2018 International Conference on Management of Data, pages 1407-1419. Online publication date: 27-May-2018.
https://dl.acm.org/doi/10.1145/3183713.3190655
Haas S, Seifert T, Nöthen B, Scholze S, Höppner S, Dixius A, Adeva E, Augustin T, Pauls F, Moriam S, Hasler M, Fischer E, Chen Y, Matúš E, Ellguth G, Hartmann S, Schiefer S, Cederström L, Walter D, Henker S, Hänzsche S, Uhlig J, Eisenreich H, Weithoffer S, Wehn N, Schüffny R, Mayr C, Fettweis G (2017). A Heterogeneous SDR MPSoC in 28 nm CMOS for Low-Latency Wireless Applications. Proceedings of the 54th Annual Design Automation Conference 2017, pages 1-6. Online publication date: 18-Jun-2017.
https://dl.acm.org/doi/10.1145/3061639.3062188
Haas S, Arnold O, Scholze S, Höppner S, Ellguth G, Dixius A, Ungethüm A, Mier E, Nöthen B, Matúš E, Schiefer S, Cederstroem L, Pilz F, Mayr C, Schüffny R, Lehner W, Fettweis G (2016). A database accelerator for energy-efficient query processing and optimization. 2016 IEEE Nordic Circuits and Systems Conference (NORCAS), pages 1-5. Online publication date: Nov-2016.
https://doi.org/10.1109/NORCHIP.2016.7792904
