Columnar objects: improving the performance of analytical applications

Published: 21 October 2015

Abstract

Growing volumes of data increase the demand to use that data in analytical applications to make informed decisions. Unfortunately, object-oriented runtimes experience performance problems when dealing with large data volumes. Similar problems have been addressed by column-oriented in-memory databases, whose memory layout is tailored to analytical workloads. As a result, data storage and processing are often delegated to such a database. However, the more domain logic is moved to this separate system, the more benefits of object-orientation are lost. We propose modifications to dynamic object-oriented runtimes that store collections of objects in a column-oriented memory layout and leverage a JIT compiler to take advantage of the adjusted layout by mapping object traversal to array operations. We implemented our concept in PyPy, a Python interpreter equipped with a tracing JIT. Finally, we show that analytical algorithms, expressed through object-oriented code, are up to three times faster due to our optimizations, without substantially impairing the paradigm. We hope that extending these concepts will mitigate some of the problems originating from the paradigm mismatch between object-oriented runtimes and databases.
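The column-oriented layout described in the abstract can be illustrated with a small library-level sketch in plain Python. This is not the authors' PyPy implementation, which changes the runtime itself and relies on the tracing JIT; it only shows the general idea of keeping each attribute of a collection in one contiguous array while still offering object-style access. The names `ColumnarCollection`, `RowProxy`, and `column` are hypothetical and chosen for illustration.

```python
from array import array


class RowProxy:
    """Object-like view of one element; fields are read from the columns."""

    __slots__ = ("_collection", "_index")

    def __init__(self, collection, index):
        self._collection = collection
        self._index = index

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for column-backed fields.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            column = self._collection.column(name)
        except KeyError:
            raise AttributeError(name)
        return column[self._index]


class ColumnarCollection:
    """Hypothetical collection that stores one typed array per attribute."""

    def __init__(self, **typecodes):
        # e.g. price="d" creates array("d") as the "price" column.
        self._columns = {name: array(code) for name, code in typecodes.items()}
        self._length = 0

    def append(self, **values):
        if set(values) != set(self._columns):
            raise ValueError("expected fields: %s" % sorted(self._columns))
        for name, value in values.items():
            self._columns[name].append(value)
        self._length += 1

    def column(self, name):
        return self._columns[name]

    def __len__(self):
        return self._length

    def __getitem__(self, index):
        if not 0 <= index < self._length:
            raise IndexError(index)
        return RowProxy(self, index)

    def __iter__(self):
        return (RowProxy(self, i) for i in range(self._length))


orders = ColumnarCollection(quantity="l", price="d")
orders.append(quantity=2, price=9.5)
orders.append(quantity=1, price=20.0)
orders.append(quantity=4, price=0.25)

# Object-oriented traversal: reads like ordinary attribute access.
object_style_total = sum(o.quantity * o.price for o in orders)

# The same computation expressed directly as operations over two columns.
column_style_total = sum(
    q * p for q, p in zip(orders.column("quantity"), orders.column("price"))
)
```

In this sketch the programmer chooses between the two traversal styles by hand. The approach in the abstract instead keeps the object-oriented code unchanged and lets the runtime and its JIT map the traversal onto the underlying arrays.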

Published In

Onward! 2015: 2015 ACM International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software (Onward!)
October 2015
307 pages
ISBN: 9781450336888
DOI: 10.1145/2814228
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Column-oriented Object Layout
  2. Data Science
  3. Dynamic Languages
  4. Just-in-Time Compilation

Qualifiers

  • Research-article

Conference

SPLASH '15
Acceptance Rates

Overall Acceptance Rate 40 of 105 submissions, 38%
