Tailor-made data management for embedded systems: A case study on Berkeley DB
Introduction
Today, 98% of all computing systems are embedded systems [1]. Frequently cited examples are sensors, smartcards, and cellphones. Applications for these systems place different requirements on data management, ranging from simple data storage, through stream processing, to complex data management including transactions, recovery, and replication. Separating data management from application logic is needed to avoid redeveloping data management for each of these systems. This can be achieved with a general data management infrastructure [2].
There are several challenges for data-intensive applications in embedded systems, which we illustrate using automotive systems as an example. The amount of data processed in automobiles increases by 7–10% per year [3]. Data is captured by sensors, then distributed and stored in working or persistent memory. Depending on the application scenario, different data management functions are needed, including transaction management, query processing, and security mechanisms. The data stored in such systems ranges from single values and simple structures like arrays, as in sensors [4], to a few tables or complete databases, as in navigation systems. Advanced mechanisms like transaction processing or recovery are also required in certain situations. The functionality actually needed depends on the application scenario, and alternative implementations of functionality are required to support diverse hardware, e.g., special algorithms for storing data in EEPROM.
As in automobiles, data management is required in most computing systems. In contrast to contemporary desktop and server systems, the resources of embedded devices are very limited. This concerns memory, computing power, and power consumption. When developing data-intensive applications, such constraints have to be taken into account.
The memory limitations in embedded systems range from a few kilobytes (e.g., in sensors) to moderate restrictions of a few megabytes of memory (e.g., in cellphones). This situation is similar to that of desktop computing systems in the early ’80s and of business applications in the ’70s. All of these systems needed data management that could operate under restricted resources. This trend seems to continue with ubiquitous computing [5] and, in the future, with developments such as smart dust [6]. It may continue down to the smallest possible devices permitted by the currently known physical laws [7]. We formulate this trend as the law of scale invariance of data management:
There will always be small computing devices that operate with very constrained resources, and independent of the size of these systems, there is a need for dedicated data management.
Considering the limited resources and very special requirements on data management, traditional database management systems (DBMS) are inappropriate for use in embedded environments [8], [9], [10], [11]. There have been approaches that try to scale down data management technology for embedded systems [9], [10]. These approaches concentrate on supporting special hardware or application scenarios by manually creating customized solutions. When developing such customized systems, data management is often reinvented to satisfy computational and memory constraints as well as new kinds of requirements [11]. This practice leads to an increased time to market, high development costs, poor quality of software, and bugs. We argue that appropriate techniques are needed to develop tailor-made DBMS that attain high customizability and reuse without loss of performance.
Customizable software can be built with a number of different approaches such as components or preprocessors [12]. All these approaches have benefits and drawbacks. For example, components allow functionality to be modularized but often degrade performance [11]. Furthermore, the crosscutting structure of some features hinders the use of components [13]. For example, encapsulating a transaction management system as well as a B-tree into dedicated components is difficult because the transaction management system cuts across many other components of a DBMS, including the B-tree. Preprocessors, e.g., the C preprocessor using #ifdef statements, do not have these problems but are known to pollute the source code and complicate maintenance and evolution of software [14], [15]. As a result, neither components nor preprocessors are an optimal solution for building applications for embedded systems, and new programming paradigms might be employed [13], [16].
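To make the maintenance problem concrete, the following minimal sketch shows how an optional transaction feature, when implemented with the C preprocessor, ends up scattered through an otherwise unrelated index class. All names (`BTree`, `TxnLog`, `WITH_TRANSACTIONS`) are hypothetical and are not taken from Berkeley DB; a `std::map` merely stands in for a real B-tree.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical configuration flag; a build would normally set it
// on the compiler command line (e.g. -DWITH_TRANSACTIONS=0).
#ifndef WITH_TRANSACTIONS
#define WITH_TRANSACTIONS 1
#endif

#if WITH_TRANSACTIONS
// Toy log standing in for a transaction manager.
struct TxnLog {
    std::vector<std::string> records;
    void append(const std::string& r) { records.push_back(r); }
};
#endif

// Toy index structure (assumption: std::map replaces the real B-tree).
class BTree {
public:
    void put(const std::string& key, const std::string& value) {
#if WITH_TRANSACTIONS
        log_.append("put " + key);   // transaction code inside index code
#endif
        data_[key] = value;
    }

    bool remove(const std::string& key) {
#if WITH_TRANSACTIONS
        log_.append("del " + key);   // the same concern appears again
#endif
        return data_.erase(key) > 0;
    }

    std::size_t size() const { return data_.size(); }

#if WITH_TRANSACTIONS
    std::size_t logged() const { return log_.records.size(); }
#endif

private:
    std::map<std::string, std::string> data_;
#if WITH_TRANSACTIONS
    TxnLog log_;
#endif
};
```

Even in this toy class, one optional feature requires five conditional regions. Each further optional feature that touches `BTree` would add its own set of regions to the same methods.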
We argue that feature-oriented programming (FOP) [17], [18] is a promising technique to develop highly customizable DBMS for embedded systems and to avoid the problems mentioned above. Using FOP, the functional requirements on a DBMS are represented by features that can be implemented in a modular way. FOP enables us to generate different applications by composing features, which results in a product line of similar applications. We will show that FOP can be used to build a product line of data management software that (i) fulfills special requirements of a diverse set of applications, (ii) allows for generating different DBMS variants in a short period of time, and (iii) decreases resource consumption by including only required functionality. We also show that, in contrast to other approaches, the customizability has no negative impact on performance and resource consumption. We evaluate our approach by refactoring and analyzing Oracle’s embedded database system Berkeley DB. Using FOP, we could decrease the binary size of a minimal Berkeley DB variant by about 50% and increase performance by about 16% for a reading benchmark. We could achieve these improvements while increasing customizability, i.e., the refactored DBMS also allows small features to be customized.
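FeatureC++ has its own refinement syntax, which is not reproduced here. As an approximation in plain C++, the following sketch uses template mixin layers to illustrate the idea of composing features statically: each feature is a layer that refines the layer below it, and a variant is generated by selecting layers at compile time. The feature names (`Storage`, `Logging`, `Statistics`) and both variants are assumptions made for illustration only.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Base feature: plain key/value storage.
class Storage {
public:
    void put(const std::string& key, const std::string& value) {
        data_[key] = value;
    }
    bool get(const std::string& key, std::string& out) const {
        auto it = data_.find(key);
        if (it == data_.end()) return false;
        out = it->second;
        return true;
    }
    std::size_t size() const { return data_.size(); }
private:
    std::map<std::string, std::string> data_;
};

// Optional feature: refine put() so that every write is logged first.
template <class Base>
class Logging : public Base {
public:
    void put(const std::string& key, const std::string& value) {
        log_.push_back("put " + key);
        Base::put(key, value);          // delegate to the refined layer
    }
    const std::vector<std::string>& log() const { return log_; }
private:
    std::vector<std::string> log_;
};

// Optional feature: refine get() so that read operations are counted.
template <class Base>
class Statistics : public Base {
public:
    bool get(const std::string& key, std::string& out) {
        ++reads_;
        return Base::get(key, out);
    }
    std::size_t reads() const { return reads_; }
private:
    std::size_t reads_ = 0;
};

// Two variants of the hypothetical product line, fixed at compile time.
using MinimalDb = Storage;
using FullDb = Statistics<Logging<Storage>>;
```

`MinimalDb` contains no logging or statistics code at all, so an unselected feature adds nothing to the variant. This only illustrates the composition principle; it says nothing about the measured binary sizes or benchmark results reported for Berkeley DB.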
Section snippets
Tailor-made data management systems
The development of tailor-made data management software has been in the focus of research in recent years. Approaches for tailoring DBMS can be separated into solutions that focus on manual tailoring and solutions that achieve customizability of DBMS. Especially for embedded systems there are a number of manually tailored data management systems that address special application scenarios. For example, Bobineau et al. have developed PicoDBMS, a DBMS that supports special algorithms for
Methodology
In this section, we introduce feature-oriented programming (FOP) [18], [17], a programming paradigm for the development of customizable software. With FeatureC++ [36] we developed an FOP language extension for the C++ programming language. This allows us to apply FOP to software systems intended for resource constrained environments. Software development based on features was applied successfully in different domains [35], [37], [38], [39], [40], [31], [27], [41], [42], [43], [30], but there
Berkeley DB: a case study
Berkeley DB is an embedded DBMS for use in server systems but also in embedded systems. In our case study, we refactored the C version of Berkeley DB into an SPL. In the following, we give a short overview of Berkeley DB and describe the refactoring process. Since we used the C version of Berkeley DB, we propose a two-step refactoring process: (i) the conversion from C to C++ and (ii) the conversion from C++ to FeatureC++.
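The transformation rules used in the case study are not part of this excerpt. As a rough illustration of what step (i) can involve, the sketch below assumes a common C idiom, a handle struct whose operations are bound through function pointers, and shows an equivalent C++ class. The types `c_db` and `Db` are hypothetical and far simpler than any real database handle.

```cpp
// Before: C style. The handle emulates an object; its "method" is a
// function pointer that is bound when the handle is initialized.
struct c_db {
    int nrecords;
    int (*put)(c_db* self, int key);
};

int c_db_put(c_db* self, int key) {
    (void)key;                  // a real store would insert the key here
    return ++self->nrecords;    // returns the new record count
}

void c_db_init(c_db* self) {
    self->nrecords = 0;
    self->put = c_db_put;       // call target chosen at run time
}

// After step (i): a C++ class with the same behavior. The call target
// of put() is now known to the compiler.
class Db {
public:
    int put(int key) {
        (void)key;              // a real store would insert the key here
        return ++nrecords_;
    }
    int nrecords() const { return nrecords_; }
private:
    int nrecords_ = 0;
};
```

Once the code has class structure, step (ii) can move optional parts of a class such as `Db` into separate feature modules.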
Evaluation
For evaluation, we analyze the feature-refactored version of Berkeley DB with respect to customizability and resource consumption. We compare several variants with the original C version of Berkeley DB.
Discussion
Our evaluation has shown that we are able to preserve the performance characteristics of Berkeley DB when transforming it from C into FeatureC++ and even when applying a more fine-grained decomposition. This also shows that C++ and FeatureC++ do not necessarily have a negative impact on performance. In contrast to dynamic configuration, e.g., using if statements and function pointers, a performance improvement is possible by using static composition of features. However, also qualities of the
Conclusion and perspective
We have presented an approach to customize and downsize DBMS in order to tailor data management for embedded systems. We used FOP to generate specialized DBMS based on a common architecture and code base. The fine-grained customizability supported by FOP is the basis for tailoring DBMS to satisfy the resource constraints of embedded systems.
We have evaluated our approach by means of the medium-sized commercial DBMS Berkeley DB. While other approaches for developing customizable data management
Acknowledgements
We thank Christian Kästner, Martin Kuhlemann, and Norbert Siegmund for comments on drafts of this paper. Marko Rosenmüller is funded by the German Ministry of Education and Research (BMBF), project number 01IM08003C. Sven Apel’s work is funded partly by the German Research Foundation (DFG), project AP 206/2-1. The presented work is part of the projects Fame-Dbms, ViERforES, and FeatureFoundation.
References (75)
- et al., Optimizing lock protocols for native XML processing, Data and Knowledge Engineering (DKE) (2008)
- Proactive computing, Communications of the ACM (CACM) (2000)
- T. Härder, DBMS architecture – still an open problem, in: Datenbanksysteme in Business, Technologie und Web (BTW),...
- L. Casparsson, A. Rajnak, K. Tindell, P. Malmberg, Volcano – a revolution in on-board communications, in: Volvo...
- et al., Data management issues in vehicle control systems: a case study
- Some computer science issues in ubiquitous computing, Communications of the ACM (CACM) (1993)
- et al., Smart dust: communicating with a cubic-millimeter computer, Computer (2001)
- There’s plenty of room at the bottom
- M. Stonebraker, U. Cetintemel, One size fits all: an idea whose time has come and gone, in: Proceedings of the...
- et al., PicoDBMS: scaling down database techniques for the smartcard
- Efficient data management on lightweight computing devices
- Rethinking database system architecture: towards a self-tuning RISC-style database system
- Generative Programming: Methods, Tools, and Applications
- COMET: a component-based real-time database for automotive systems
- Preprocessor conditional removal by simple partial evaluation
- Application-tailored database systems: a case of aspects in an embedded database
- Feature-oriented programming: a fresh look at objects
- Scaling step-wise refinement, IEEE Transactions on Software Engineering (TSE)
- Component Database Systems
- The library scaling problem and the limits of concrete component reuse
- Aspect-oriented programming
- P2: a lightweight DBMS generator, Journal of Intelligent Information Systems (JIIS)
- GENESIS: an extensible database management system, IEEE Transactions on Software Engineering (TSE)
- Using AspectC to improve the modularity of path-specific customization in operating system code
- Back to the future: a retroactive study of aspect evolution in operating system code
- A quantitative analysis of aspects in the eCos kernel
- Quantifying aspects in middleware platforms
- Resolving feature convolution in middleware systems
- Large-scale AOSD for middleware
- Scalable component abstractions
- Aspectual collaborations – combining modules and aspects, The Computer Journal
- Variability management with feature-oriented programming and aspects
- Aspectual feature modules, IEEE Transactions on Software Engineering (TSE)
- FeatureC++: on the symbiosis of feature-oriented and aspect-oriented programming
Marko Rosenmüller received his Diploma in Computer Science from the University of Magdeburg, Germany in 2005. From 2000 to 2006 he was a software developer at icubic AG in Magdeburg. Since 2006 he has been a Ph.D. student at the University of Magdeburg. His research interests include software product lines, tailor-made data management, and programming languages for product line development.
Sven Apel is a post-doctoral associate at the Chair of Programming at the University of Passau, Germany. He received a Ph.D. in Computer Science from the University of Magdeburg, Germany in 2007. His research interests include advanced programming paradigms, software product lines, and algebra for software construction.
Thomas Leich is currently working toward the Ph.D. degree in Computer Science at the University of Magdeburg, Germany. He is the head of the Department of Applied Informatics at the Metop Research Institute, Magdeburg, Germany. His research interests are tailor-made and embedded data management and software product lines.
Gunter Saake is a full professor of Computer Science. He is the head of the Database and Information Systems Group at the University of Magdeburg, Germany. His research interests include database integration, tailor-made data management, object-oriented information systems, and information fusion. He is a member of the IEEE Computer Society.