Elsevier

Data & Knowledge Engineering

Volume 68, Issue 12, December 2009, Pages 1493-1512

Tailor-made data management for embedded systems: A case study on Berkeley DB

https://doi.org/10.1016/j.datak.2009.07.013

Abstract

Applications in the domain of embedded systems are diverse and store an increasing amount of data. In order to satisfy the varying requirements of these applications, data management functionality is needed that can be tailored to the applications’ needs. Furthermore, the resource restrictions of embedded systems imply a need for data management that is customized to the hardware platform. In this paper, we present an approach for decomposing data management software for embedded systems using feature-oriented programming. The result of such a decomposition is a software product line that allows us to generate tailor-made data management systems. While existing approaches for tailoring software have significant drawbacks regarding customizability and performance, a feature-oriented approach overcomes these limitations, as we will demonstrate. In a non-trivial case study on Berkeley DB, we evaluate our approach and compare it to other approaches for tailoring DBMS.

Introduction

Today 98% of all computing systems are embedded systems [1]. Frequently cited examples are sensors, smartcards, and cellphones. Applications for these systems place different requirements on data management, ranging from simple data storage, through stream processing, to complex data management including transactions, recovery, and replication. Data management must be separated from application logic to avoid redeveloping it for each of these systems. This can be achieved with a general data management infrastructure [2].

Data-intensive applications in embedded systems face several challenges, which we illustrate using the example of automotive systems. The amount of data processed in automobiles increases by 7–10% per year [3]. Data is captured by sensors, distributed, and stored in working or persistent memory. Depending on the application scenario, different data management functions are needed, including transaction management, query processing, and security mechanisms. The data stored in such systems ranges from single values and simple structures like arrays, as in sensors [4], to a few tables or complete databases, as in navigation systems. Advanced mechanisms like transaction processing or recovery are also required in certain situations. The functionality actually needed depends on the application scenario, and alternative implementations of the same functionality are required to support diverse hardware, e.g., special algorithms for storing data in EEPROM.

As in automobiles, data management is required in most computing systems. In contrast to contemporary desktop and server systems, the resources of embedded devices are very limited. This applies to memory, computing power, and power consumption. Such constraints have to be taken into account when developing data-intensive applications.

The memory limitations in embedded systems range from a few kilobytes (e.g., in sensors) to moderate restrictions of a few megabytes (e.g., in cellphones). This situation resembles that of desktop computing systems in the early ’80s and of business applications in the ’70s. All of these systems needed data management that could operate under restricted resources. This trend seems to continue with ubiquitous computing [5] and, in the future, with developments such as smart dust [6]. It may continue down to the smallest possible devices limited by the currently known physical laws [7]. We formulate this trend as the law of scale invariance of data management:

There will always be small computing devices that operate with very constrained resources, and independent of the size of these systems, there is a need for dedicated data management.

Given the limited resources and the very special requirements on data management, traditional database management systems (DBMS) are inappropriate for use in embedded environments [8], [9], [10], [11]. There have been approaches that try to scale down data management technology for embedded systems [9], [10]. These approaches concentrate on supporting special hardware or application scenarios by manually creating customized solutions. When developing such customized systems, data management is often reinvented to satisfy computational and memory constraints as well as new kinds of requirements [11]. This practice leads to increased time to market, high development costs, and buggy, low-quality software. We argue that appropriate techniques are needed to develop tailor-made DBMS that attain high customizability and reuse without loss of performance.

Customizable software can be built with a number of different approaches, such as components or preprocessors [12]. All of these approaches have benefits and drawbacks. For example, components allow functionality to be modularized but often degrade performance [11]. Furthermore, the crosscutting structure of some features hinders the use of components [13]. For example, encapsulating a transaction management system and a B-tree in dedicated components is difficult because the transaction management system cuts across many other components of a DBMS, including the B-tree. Preprocessors, e.g., the C preprocessor with its #ifdef statements, do not have these problems but are known to pollute the source code and to complicate maintenance and evolution of software [14], [15]. As a result, neither components nor preprocessors are an optimal solution for building applications for embedded systems, and new programming paradigms might be employed [13], [16].
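To make the crosscutting problem concrete, the following is a minimal C++ sketch of how an optional feature guarded by #ifdef spreads across a storage class. It is not code from Berkeley DB: the class `Index` (a toy stand-in for a B-tree), the switch `FEATURE_TRANSACTIONS`, and the helper `demoIndex` are hypothetical names chosen for illustration, and the "transaction" feature is reduced to a simple operation log.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical feature switch; a build would normally enable it with
// -DFEATURE_TRANSACTIONS instead of editing the source.
// #define FEATURE_TRANSACTIONS

// Toy key-value index standing in for a B-tree (hypothetical).
class Index {
public:
    void put(const std::string& key, const std::string& value) {
#ifdef FEATURE_TRANSACTIONS
        // Feature code is interleaved with the storage logic.
        log_.push_back("put " + key);
#endif
        data_[key] = value;
    }

    bool erase(const std::string& key) {
#ifdef FEATURE_TRANSACTIONS
        // The same concern reappears in every mutating operation.
        log_.push_back("erase " + key);
#endif
        return data_.erase(key) > 0;
    }

    std::size_t size() const { return data_.size(); }

#ifdef FEATURE_TRANSACTIONS
    std::size_t logSize() const { return log_.size(); }
#endif

private:
    std::map<std::string, std::string> data_;
#ifdef FEATURE_TRANSACTIONS
    std::vector<std::string> log_;
#endif
};

// Usage sketch: the same client code compiles in both variants.
inline std::size_t demoIndex() {
    Index idx;
    idx.put("a", "1");
    idx.put("b", "2");
    const bool removed = idx.erase("a");
    assert(removed);  // "a" was present, so erase reports success
    return idx.size();
}
```

Even in this small class the optional feature touches four places, and none of them can be read or maintained separately from the storage code, which is the kind of source code pollution the paragraph above refers to.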

We argue that feature-oriented programming (FOP) [17], [18] is a promising technique for developing highly customizable DBMS for embedded systems while avoiding the problems mentioned above. Using FOP, the functional requirements on a DBMS are represented by features that can be implemented in a modular way. FOP enables us to generate different applications by composing features, which results in a product line of similar applications. We will show that FOP can be used to build a product line of data management software that (i) fulfills the special requirements of a diverse set of applications, (ii) allows different DBMS variants to be generated in a short period of time, and (iii) decreases resource consumption by including only required functionality. We also show that, in contrast to other approaches, this customizability has no negative impact on performance and resource consumption. We evaluate our approach by refactoring and analyzing Oracle’s embedded database system Berkeley DB. Using FOP, we could decrease the binary size of a minimal Berkeley DB variant by about 50% and increase performance by about 16% for a reading benchmark. We achieved these improvements while increasing customizability, i.e., the refactored DBMS also allows small features to be customized.
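The idea of generating variants by composing features can be approximated in standard C++ with mixin layers. The sketch below is an illustration under that assumption and is not FeatureC++ syntax; all names (`Storage`, `Logging`, `Statistics`, `MinimalDb`, `FullDb`, `demoComposition`) are hypothetical. Each optional feature is a class template that refines the layer below it, and a variant is selected at compile time by naming the layers it should contain.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Base feature: plain key-value storage.
class Storage {
public:
    void put(const std::string& key, const std::string& value) {
        data_[key] = value;
    }
    bool get(const std::string& key, std::string& out) const {
        auto it = data_.find(key);
        if (it == data_.end()) return false;
        out = it->second;
        return true;
    }
    std::size_t size() const { return data_.size(); }

private:
    std::map<std::string, std::string> data_;
};

// Optional feature: a simple operation log that refines put().
template <class Base>
class Logging : public Base {
public:
    void put(const std::string& key, const std::string& value) {
        log_.push_back(key);    // feature-specific behavior
        Base::put(key, value);  // delegate to the refined layer
    }
    std::size_t logSize() const { return log_.size(); }

private:
    std::vector<std::string> log_;
};

// Optional feature: operation statistics.
template <class Base>
class Statistics : public Base {
public:
    void put(const std::string& key, const std::string& value) {
        ++puts_;
        Base::put(key, value);
    }
    std::size_t putCount() const { return puts_; }

private:
    std::size_t puts_ = 0;
};

// Variants are chosen statically by composing layers.
using MinimalDb = Storage;
using FullDb = Statistics<Logging<Storage>>;

// Usage sketch: one put() passes through every selected layer.
inline void demoComposition() {
    FullDb db;
    db.put("k", "v");
    assert(db.putCount() == 1 && db.logSize() == 1);
}
```

Because the calls between layers are resolved at compile time, `MinimalDb` carries neither the code nor the data members of the features it omits, which mirrors the claim that only required functionality is included in a variant.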

Section snippets

Tailor-made data management systems

The development of tailor-made data management software has been a focus of research in recent years. Approaches for tailoring DBMS can be separated into solutions that focus on manual tailoring and solutions that achieve customizability of DBMS. Especially for embedded systems there are a number of manually tailored data management systems that address special application scenarios. For example, Bobineau et al. have developed PicoDBMS, a DBMS that supports special algorithms for

Methodology

In this section, we introduce feature-oriented programming (FOP) [18], [17], a programming paradigm for the development of customizable software. With FeatureC++ [36] we developed an FOP language extension for the C++ programming language. This allows us to apply FOP to software systems intended for resource constrained environments. Software development based on features was applied successfully in different domains [35], [37], [38], [39], [40], [31], [27], [41], [42], [43], [30], but there

Berkeley DB: a case study

Berkeley DB is an embedded DBMS for use in server systems as well as in embedded systems. In our case study, we refactored the C version of Berkeley DB into a software product line (SPL). In the following, we give a short overview of Berkeley DB and describe the refactoring process. Since we used the C version of Berkeley DB, we propose a two-step refactoring process: (i) the conversion from C to C++ and (ii) the conversion from C++ to FeatureC++.
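As a rough illustration of what step (i) could involve, the sketch below shows the same small data structure once in C style and once as a C++ class. This is an assumption about the general shape of such a conversion, not the authors' actual transformation, and none of the names (`c_buffer`, `c_buffer_init`, `c_buffer_append`, `Buffer`, `sameCapacity`) come from Berkeley DB.

```cpp
#include <cassert>
#include <cstddef>

// Before: C style, with state in a struct and behavior in free
// functions that receive the handle explicitly.
struct c_buffer {
    unsigned char data[16];
    std::size_t used;
};

inline void c_buffer_init(c_buffer* b) { b->used = 0; }

inline int c_buffer_append(c_buffer* b, unsigned char byte) {
    if (b->used == sizeof b->data) return -1;  // buffer is full
    b->data[b->used++] = byte;
    return 0;
}

// After: C++ class, with the same state and logic expressed as members.
class Buffer {
public:
    bool append(unsigned char byte) {
        if (used_ == sizeof data_) return false;  // buffer is full
        data_[used_++] = byte;
        return true;
    }
    std::size_t used() const { return used_; }

private:
    unsigned char data_[16] = {};
    std::size_t used_ = 0;
};

// Usage sketch: both versions accept the same number of bytes.
inline std::size_t sameCapacity() {
    c_buffer cb;
    c_buffer_init(&cb);
    Buffer b;
    std::size_t c_ok = 0, cpp_ok = 0;
    for (int i = 0; i < 20; ++i) {
        if (c_buffer_append(&cb, 'x') == 0) ++c_ok;
        if (b.append('x')) ++cpp_ok;
    }
    assert(c_ok == cpp_ok);  // behavior is preserved by the conversion
    return c_ok;
}
```

Once state and behavior are grouped into classes like this, the classes provide natural units that a later feature-oriented step can split into a base implementation and optional refinements.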

Evaluation

For the evaluation, we analyze the feature-refactored version of Berkeley DB with respect to customizability and resource consumption. We compare several variants with the original C version of Berkeley DB.

Discussion

Our evaluation has shown that we are able to preserve the performance characteristics of Berkeley DB when transforming it from C into FeatureC++ and even when applying a more fine-grained decomposition. This also shows that C++ and FeatureC++ do not necessarily have a negative impact on performance. In contrast to dynamic configuration, e.g., using if statements and function pointers, a performance improvement is possible by using static composition of features. However, also qualities of the

Conclusion and perspective

We have presented an approach to customize and downsize DBMS in order to tailor data management for embedded systems. We used FOP to generate specialized DBMS based on a common architecture and code base. The fine-grained customizability supported by FOP is the basis for tailoring DBMS to satisfy the resource constraints of embedded systems.

We have evaluated our approach by means of the medium-sized commercial DBMS Berkeley DB. While other approaches for developing customizable data management

Acknowledgements

We thank Christian Kästner, Martin Kuhlemann, and Norbert Siegmund for comments on drafts of this paper. Marko Rosenmüller is funded by the German Ministry of Education and Research (BMBF), project number 01IM08003C. Sven Apel’s work is funded partly by the German Research Foundation (DFG), project AP 206/2-1. The presented work is part of the projects Fame-Dbms, ViERforES, and FeatureFoundation.

References (75)

  • M. Haustein et al., Optimizing lock protocols for native XML processing, Data and Knowledge Engineering (DKE) (2008)
  • D. Tennenhouse, Proactive computing, Communications of the ACM (CACM) (2000)
  • T. Härder, DBMS architecture – still an open problem, in: Datenbanksysteme in Business, Technologie und Web (BTW),...
  • L. Casparsson, A. Rajnak, K. Tindell, P. Malmberg, Volcano – a revolution in on-board communications, in: Volvo...
  • D. Nyström et al., Data management issues in vehicle control systems: a case study
  • M. Weiser, Some computer science issues in ubiquitous computing, Communications of the ACM (CACM) (1993)
  • B. Warneke et al., Smart dust: communicating with a cubic-millimeter computer, Computer (2001)
  • R.P. Feynman, There’s plenty of room at the bottom
  • M. Stonebraker, U. Cetintemel, One size fits all: an idea whose time has come and gone, in: Proceedings of the...
  • C. Bobineau et al., PicoDBMS: scaling down database techniques for the smartcard
  • R. Sen et al., Efficient data management on lightweight computing devices
  • S. Chaudhuri et al., Rethinking database system architecture: towards a self-tuning RISC-style database system
  • K. Czarnecki et al., Generative Programming: Methods, Tools, and Applications (2000)
  • D. Nyström et al., COMET: a component-based real-time database for automotive systems
  • H. Spencer, G. Collyer, #ifdef Considered harmful, or portability experience with C news, in: Proceedings of the USENIX...
  • I.D. Baxter et al., Preprocessor conditional removal by simple partial evaluation
  • A. Tešanović et al., Application-tailored database systems: a case of aspects in an embedded database
  • C. Prehofer, Feature-oriented programming: a fresh look at objects
  • D. Batory et al., Scaling step-wise refinement, IEEE Transactions on Software Engineering (TSE) (2004)
  • S.R. Madden, M.J. Franklin, J.M. Hellerstein, W. Hong, TinyDB: an acquisitional query processing system for sensor...
  • A. Geppert, S. Scherrer, K.R. Dittrich, KIDS: Construction of Database Management Systems based on Reuse, Tech. Rep....
  • K.R. Dittrich et al., Component Database Systems (2001)
  • T.J. Biggerstaff, The library scaling problem and the limits of concrete component reuse
  • G. Kiczales et al., Aspect-oriented programming
  • D. Batory et al., P2: a lightweight DBMS generator, Journal of Intelligent Information Systems (JIIS) (1997)
  • D. Batory et al., GENESIS: an extensible database management system, IEEE Transactions on Software Engineering (TSE) (1988)
  • Y. Coady et al., Using AspectC to improve the modularity of path-specific customization in operating system code
  • Y. Coady et al., Back to the future: a retroactive study of aspect evolution in operating system code
  • D. Lohmann et al., A quantitative analysis of aspects in the eCos kernel
  • C. Zhang et al., Quantifying aspects in middleware platforms
  • C. Zhang et al., Resolving feature convolution in middleware systems
  • A. Colyer et al., Large-scale AOSD for middleware
  • M. Odersky et al., Scalable component abstractions
  • K.J. Lieberherr et al., Aspectual collaborations – combining modules and aspects, The Computer Journal (2003)
  • M. Mezini et al., Variability management with feature-oriented programming and aspects
  • S. Apel et al., Aspectual feature modules, IEEE Transactions on Software Engineering (TSE) (2008)
  • S. Apel et al., FeatureC++: on the symbiosis of feature-oriented and aspect-oriented programming

    Marko Rosenmüller received his Diploma in Computer Science from the University of Magdeburg, Germany in 2005. From 2000 to 2006 he was a software developer at icubic AG in Magdeburg. Since 2006 he has been a Ph.D. student at the University of Magdeburg. His research interests include software product lines, tailor-made data management, and programming languages for product line development.

    Sven Apel is a post-doctoral associate at the Chair of Programming at the University of Passau, Germany. He received a Ph.D. in Computer Science from the University of Magdeburg, Germany in 2007. His research interests include advanced programming paradigms, software product lines, and algebra for software construction.

    Thomas Leich is currently working toward the Ph.D. degree in Computer Science at the University of Magdeburg, Germany. He is the head of the Department of Applied Informatics at the Metop Research Institute, Magdeburg, Germany. His research interests are tailor-made and embedded data management and software product lines.

    Gunter Saake is a full professor of Computer Science. He is the head of the Database and Information Systems Group at the University of Magdeburg, Germany. His research interests include database integration, tailor-made data management, object-oriented information systems, and information fusion. He is a member of the IEEE Computer Society.
