DOI: 10.1145/2933349.2933351

Customized OS support for data-processing

Published: 26 June 2016

Abstract

For decades, database engines have found the generic interfaces offered by operating systems at odds with the need to use hardware resources efficiently. As a result, most engines circumvent the OS and manage hardware directly. With the growing complexity and heterogeneity of modern hardware, database engines now face a steep increase in the complexity they must absorb to achieve good performance. In this paper, we take advantage of recent proposals in operating system design, such as multi-kernels, to explore the development of a lightweight OS kernel tailored for data processing, and we discuss its benefits for simplifying the design and improving the performance of data management systems.

Published In

DaMoN '16: Proceedings of the 12th International Workshop on Data Management on New Hardware
June 2016
89 pages
ISBN:9781450343190
DOI:10.1145/2933349

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Conference

SIGMOD/PODS'16: International Conference on Management of Data
June 26 - July 1, 2016
San Francisco, California

Acceptance Rates

Overall Acceptance Rate 94 of 127 submissions, 74%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 14
  • Downloads (Last 6 weeks): 7
Reflects downloads up to 07 Mar 2025

Citations

Cited By

  • (2022) Tell-Tale Tail Latencies: Pitfalls and Perils in Database Benchmarking. Performance Evaluation and Benchmarking, pages 119-134. DOI: 10.1007/978-3-030-94437-7_8. Online publication date: 1-Jan-2022
  • (2021) Performance Analysis of Array Database Systems in Non-Uniform Memory Architecture. 2021 29th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), pages 169-176. DOI: 10.1109/PDP52278.2021.00034. Online publication date: Mar-2021
  • (2021) NileOS: A Distributed Asymmetric Core-Based Micro-Kernel for Big Data Processing. IEEE Access 9, pages 3696-3711. DOI: 10.1109/ACCESS.2020.3048082. Online publication date: 2021
  • (2020) Understanding the effect of data center resource disaggregation on production DBMSs. Proceedings of the VLDB Endowment 13:9, pages 1568-1581. DOI: 10.14778/3397230.3397249. Online publication date: 26-Jun-2020
  • (2020) Peafowl. Proceedings of the 11th ACM Symposium on Cloud Computing, pages 150-164. DOI: 10.1145/3419111.3421298. Online publication date: 12-Oct-2020
  • (2020) The Art of Efficient In-memory Query Processing on NUMA Systems: a Systematic Approach. 2020 IEEE 36th International Conference on Data Engineering (ICDE), pages 781-792. DOI: 10.1109/ICDE48307.2020.00073. Online publication date: Apr-2020
  • (2020) mxkernel: A Novel System Software Stack for Data Processing on Modern Hardware. Datenbank-Spektrum. DOI: 10.1007/s13222-020-00357-5. Online publication date: 6-Oct-2020
  • (2019) Artificial Intelligence in Operating System. Proceedings of the 2019 3rd International Conference on Computer Science and Artificial Intelligence, pages 313-317. DOI: 10.1145/3374587.3374635. Online publication date: 6-Dec-2019
  • (2019) A Tickless AMP Distributed Core-Based Microkernel for Big Data. Proceedings of the Future Technologies Conference (FTC) 2019, pages 556-577. DOI: 10.1007/978-3-030-32520-6_41. Online publication date: 13-Oct-2019
  • (2018) Solros. Proceedings of the Thirteenth EuroSys Conference, pages 1-15. DOI: 10.1145/3190508.3190523. Online publication date: 23-Apr-2018
