Selecting highly optimal architectural feature sets with Filtered Cartesian Flattening

https://doi.org/10.1016/j.jss.2009.02.011

Feature modeling is a common method used to capture the variability in a configurable application. A key challenge developers face when using a feature model is determining how to select a set of features for a variant that simultaneously satisfy a series of resource constraints. This paper presents an approximation technique for selecting highly optimal feature sets while adhering to resource limits. The paper provides the following contributions to configuring application variants from feature models: (1) we provide a polynomial time approximation algorithm for selecting a highly optimal set of features that adheres to a set of resource constraints, (2) we show how this algorithm can incorporate complex configuration constraints; and (3) we present empirical results showing that the approximation algorithm can be used to derive feature sets that are more than 90%+ optimal.

Introduction

Feature models are a modeling technique originally developed to capture the decision space of configuring a customizable software application (Kang et al., 1998). Since their original development, feature modeling formalisms have been used to capture other types of software configuration concerns, such as low-level variations in implementation (Batory et al., 2003, Metzger et al., 2007, Benavides et al., 2005). Feature models have become widely used in numerous domains, such as the automotive industry, and are an important mechanism for managing application configuration.

Feature models describe a configuration/decision space using a tree structure (e.g., the model shown in Fig. 1) where each node in the tree represents a point of variation or an increment of functionality. The feature model for an application provides a compact representation of all possible variants of the application. Each unique configuration of the application—called a variant—is described as a set of selected features. Any given feature selection can be validated against its underlying feature model to check whether it represents a valid configuration of the application.
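To make this concrete, the sketch below shows one minimal way a tree-structured feature model and a recursive selection validity check might look in Python. It is our own illustration, not code from the paper: the Feature class, the MANDATORY/OPTIONAL/XOR tags, and the algorithm names in the example are all assumptions, and real feature-modeling tools support further constraint types (optional groups, cardinalities, cross-tree constraints).

```python
from dataclasses import dataclass, field
from typing import List

# Illustrative relationship tags (hypothetical names).
MANDATORY, OPTIONAL, XOR = "mandatory", "optional", "xor"

@dataclass
class Feature:
    name: str
    kind: str = OPTIONAL               # relationship to the parent feature
    group: str = ""                    # "xor" if children are alternatives
    children: List["Feature"] = field(default_factory=list)

def subtree_selected(node: Feature, selected: set) -> bool:
    """True if this node or any descendant is in the selection."""
    return node.name in selected or any(
        subtree_selected(c, selected) for c in node.children)

def is_valid(node: Feature, selected: set) -> bool:
    """Recursively check a selection against the tree rules: an
    unselected feature has no selected descendants, mandatory children
    of a selected feature are selected, and an XOR group contributes
    exactly one selected child."""
    if node.name not in selected:
        return not any(subtree_selected(c, selected) for c in node.children)
    chosen = [c for c in node.children if c.name in selected]
    if node.group == XOR and len(chosen) != 1:
        return False
    if any(c.kind == MANDATORY and c.name not in selected
           for c in node.children):
        return False
    return all(is_valid(c, selected) for c in node.children)

# Example: an XOR group of two (hypothetical) recognition algorithms.
pca = Feature("PCA")
lda = Feature("LDA")
algo = Feature("Algorithm", kind=MANDATORY, group=XOR, children=[pca, lda])
root = Feature("FaceRecognition", children=[algo])
assert is_valid(root, {"FaceRecognition", "Algorithm", "PCA"})
assert not is_valid(root, {"FaceRecognition", "Algorithm", "PCA", "LDA"})
```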

Configurable applications allow software to be reused for different requirement and constraint sets. For example, a face recognition system to identify known cheaters in a casino can provide a number of different face recognition algorithms that can be configured depending on the desired accuracy, system cost, and processing power of the hosting infrastructure. Customers with smaller budgets can choose cheaper variants of the system that employ less accurate algorithms capable of running effectively on commodity hardware. For more expensive variants, algorithms with greater accuracy and correspondingly increased resource consumption can be paired with more expensive custom hardware.

A core part of building a configurable application is documenting the rules governing the configuration of the application’s constituent components. For example, although running two face recognition algorithms in parallel might produce the highest accuracy, a system may not be capable of simultaneously employing two different face recognition algorithms. It is therefore crucial to capture these constraints that guide the configuration of the architecture. Feature modeling (Kang et al., 1998), although not originally designed for this purpose, has become a commonly used technique to capture these configuration rules.

Choosing the correct set of features for an application is hard because even small numbers of design variables (i.e., small feature sets) can produce an exponential number of design permutations. For example, 300 different application configurations can be derived from the relatively simple feature model shown in Fig. 2.

Resource constraints, such as the maximum available memory or total budget for a system, also add significant complexity to the design process. As shown in Section 4, finding an optimal variant that adheres to both the feature model constraints and a system’s resource constraints is an NP-hard problem (Cormen et al., 1990). The manual processes commonly used to select architectural feature sets scale poorly for NP-hard problems.
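The intuition behind this hardness result is that the classic 0/1 knapsack problem embeds directly into feature selection with resource constraints: each knapsack item becomes an optional feature whose resource consumption is the item's weight, and the knapsack capacity becomes the resource limit. A minimal sketch of this encoding, reusing the illustrative Feature class from the sketch above (the item names and numbers are arbitrary):

```python
# Hypothetical encoding of a 0/1 knapsack instance as a feature
# selection problem, reusing the illustrative Feature class above.
items = [("A", 4, 10), ("B", 3, 7), ("C", 2, 5)]   # (name, weight, value)
capacity = 6

root = Feature("System",
               children=[Feature(name) for name, _, _ in items])
weight = {name: w for name, w, _ in items}
value = {name: v for name, _, v in items}

# Every subset of {A, B, C}, together with "System", is a valid
# selection of this model, so a selector that maximizes total value
# subject to total weight <= capacity solves the knapsack instance.
```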

For large-scale systems—or in domains where optimization is critical—algorithmic techniques are needed to help software engineers make informed feature selections. For example, developers can choose the features that are deemed critical for the system or driven by physical concerns that are hard to quantify (such as camera types and their arrangement). An algorithmic technique can then be used to make the remaining feature selections that maximize accuracy while not exceeding the remaining budgetary allocation. Moreover, developers may want to evaluate tradeoffs in architectures, e.g., use a specific camera setup that minimizes memory consumption as opposed to maximizing accuracy.

Existing algorithmic techniques for aiding developers in the selection of variants rely on exact methods, such as integer programming, that exhibit exponential time complexity and poor scalability. In Section 6, we present experiments on deriving feature selections from models of varying sizes with an exponential technique. Deriving a feature selection for a model with 70 features requires less than 1000 ms; doubling the model size to 140 features increases the time to roughly 200,000 ms, and a further doubling to 280 features would not be tractable in a realistic time frame. Since industrial-size feature models can contain thousands of features, these exact techniques are impractical for providing algorithmic design guidance, such as automated feature selection optimization. For large problem sizes, this slow solving time makes it hard for developers to rapidly evaluate highly optimized design variations.

This paper presents a polynomial time approximation algorithm, called Filtered Cartesian Flattening, that can be used to derive highly optimal application variants subject to resource constraints. Using Filtered Cartesian Flattening, developers can quickly derive and evaluate different architectural variants that both optimize varying system capabilities and honor resource limitations. Moreover, each architectural variant can be derived in seconds, as opposed to the hours, days, or longer that would be required with an exact technique, thereby allowing the evaluation of more design variations in a shorter time frame.

This paper provides the following contributions to the study of applying the Filtered Cartesian Flattening algorithm to assist developers in selecting application variants:

  • (1) We prove that optimally selecting feature sets that adhere to resource constraints is an NP-hard problem.

  • (2) We present a polynomial time approximation algorithm for optimizing the selection of application variants subject to resource constraints.

  • (3) We show how any arbitrary Multi-dimensional Multiple-choice Knapsack Problem (MMKP) algorithm (Moser et al., 1997, Pisinger, 1995, Sinha and Zoltners, 1979) can be used as the final step in Filtered Cartesian Flattening, which allows fine-grained control of the tradeoff between solution optimality and solving speed (see the greedy sketch following this list).

  • (4) We present empirical results from experiments performed on over 500,000 feature model instances, which show that Filtered Cartesian Flattening averages 92.56% optimality on feature models with 1000 to 10,000 features.

  • (5) We provide metrics that can be used to examine a feature selection problem instance and determine whether Filtered Cartesian Flattening should be applied.
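As a concrete (and deliberately simple) stand-in for the MMKP algorithms referenced in contribution (3), the sketch below implements a common value-density greedy heuristic in Python. It is our illustration of the kind of solver that can plug into Filtered Cartesian Flattening's final step, not one of the cited algorithms:

```python
from typing import List, Tuple

Item = Tuple[float, List[float]]   # (value, resource consumption vector)

def greedy_mmkp(sets: List[List[Item]], limits: List[float]) -> List[int]:
    """Pick one item per choice set, preferring high value density
    (value per normalized resource use), never exceeding any limit."""
    remaining = list(limits)
    choice = []
    for group in sets:
        def density(item: Item) -> float:
            value, costs = item
            used = sum(c / l for c, l in zip(costs, limits) if l > 0)
            return value / (used + 1e-9)
        # Try items in order of decreasing value density; take the
        # first one that still fits within every resource dimension.
        for idx in sorted(range(len(group)),
                          key=lambda i: density(group[i]), reverse=True):
            _, costs = group[idx]
            if all(c <= r for c, r in zip(costs, remaining)):
                remaining = [r - c for r, c in zip(remaining, costs)]
                choice.append(idx)
                break
        else:
            raise ValueError("no feasible item for a choice set")
    return choice

# Two choice sets, one resource dimension (a budget of 10).
sets = [[(5.0, [4.0]), (8.0, [9.0])],
        [(3.0, [2.0]), (6.0, [7.0])]]
print(greedy_mmkp(sets, [10.0]))   # -> [0, 0]
```

Swapping this greedy heuristic for a more expensive MMKP algorithm trades solving speed for solution optimality, which is exactly the knob contribution (3) describes.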

The remainder of this paper is organized as follows: Section 2 provides a brief overview of feature modeling; Section 3 presents a motivating example used throughout the paper; Section 4 describes the challenges of optimally selecting a set of features subject to a set of resource constraints; Section 5 presents the Filtered Cartesian Flattening approximation algorithm for optimally selecting feature sets; Section 6 presents empirical results showing that our algorithm averages more than 90% optimality on feature models ranging from 1000 to 10,000 features; Section 7 compares our work to related research; and Section 8 presents concluding remarks.

Section snippets

Overview of feature modeling

Feature modeling (Kang et al., 1998) is a modeling technique that can be used to describe the variability in a configurable application with a set of features arranged in a tree structure. Each feature represents an increment in functionality or variation in the application’s configuration. For example, Fig. 1 shows a feature model describing the algorithmic variability in a system for identifying faces (Phillips et al., 2000) in images. Each box represents a feature. For example, Linear

Motivating example

A key need with feature modeling is determining how to select a good set of features for a requirement set. For example, given a face recognition system that includes a variety of potential camera types, face recognition algorithms, image formats, and camera zoom capabilities, what is the most accurate possible system that can be constructed with a given budget? The challenge is that with hundreds or thousands of features—and a vastly larger number of feature selection permutations—it is hard
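Stated as an optimization problem (our own formulation of the budget scenario above, not notation from the paper, and assuming accuracy contributions compose additively), the selection task is:

$$\max_{x \in \{0,1\}^n} \sum_{i=1}^{n} a_i\, x_i \quad \text{subject to} \quad \sum_{i=1}^{n} c_i\, x_i \le B, \qquad x \in \mathcal{F},$$

where $x$ encodes which features are selected, $a_i$ and $c_i$ are the accuracy contribution and cost of feature $i$, $B$ is the budget, and $\mathcal{F}$ is the set of selections permitted by the feature model's constraints.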

Challenges of feature selection problems with resource constraints

To make well-informed configuration decisions, developers need the ability to easily create and evaluate different architecture variations tuned to maximize or minimize specific system capabilities, such as minimizing total cost or required memory. Generating and evaluating a range of configurations allows developers to gain insights into not only what variants optimize a particular system concern, but also other design aspects, such as patterns that tend to lead to more or less optimal

Filtered Cartesian Flattening

This section presents the Filtered Cartesian Flattening (FCF) approximation technique for optimal feature selection subject to resource constraints. Filtered Cartesian Flattening transforms an optimal feature selection problem with resource constraints into an approximately equivalent MMKP problem, which is then solved using an MMKP approximation algorithm. The MMKP problem is designed such that any correct answer to the MMKP problem is also a correct solution to the feature selection problem
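As one way to picture the transformation, the hedged sketch below flattens a feature subtree into MMKP choice-set candidates by Cartesian-combining child configurations and pruning all but the K most valuable after each step (the "filter"). The dict-based node shape, the function names, and the value-based filter are our illustration of the idea, not the paper's implementation; optional features are omitted for brevity.

```python
from itertools import product

K = 20  # filter bound: keep at most K partial configurations per step

# Illustrative node shape: {"name", "value", "costs" (resource vector),
# "group" ("xor" or ""), "children"}; non-XOR children are mandatory.

def flatten(node):
    """Return up to K partial configurations (features, value, costs)
    for the subtree rooted at node."""
    base = ({node["name"]}, node["value"], list(node["costs"]))
    kids = node["children"]
    if not kids:
        return [base]
    if node["group"] == "xor":
        # An alternative group: each child contributes its own configurations.
        configs = [combine(base, c) for child in kids for c in flatten(child)]
    else:
        # Mandatory children: Cartesian-combine one configuration per child,
        # pruning to the K most valuable after each combination step.
        configs = [base]
        for child in kids:
            configs = [combine(a, b)
                       for a, b in product(configs, flatten(child))]
            configs = prune(configs)
    return prune(configs)

def combine(a, b):
    return (a[0] | b[0], a[1] + b[1],
            [x + y for x, y in zip(a[2], b[2])])

def prune(configs):
    return sorted(configs, key=lambda c: c[1], reverse=True)[:K]

leafA = {"name": "A", "value": 5, "costs": [4], "group": "", "children": []}
leafB = {"name": "B", "value": 8, "costs": [9], "group": "", "children": []}
alt = {"name": "Alg", "value": 0, "costs": [0], "group": "xor",
       "children": [leafA, leafB]}
print(flatten(alt))   # two candidate configurations, one per alternative
```

Each batch of configurations produced for a top-level choice point becomes one MMKP choice set; an MMKP algorithm (exact or approximate, such as the greedy sketch in the introduction) then picks one configuration per set without exceeding the resource limits.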

Results

This section presents empirical results from experiments we performed to evaluate the types of architectural feature selection problem instances on which Filtered Cartesian Flattening performs well and those for which it does not. When using an approximation algorithm that does not guarantee an optimal answer, such as Filtered Cartesian Flattening, a key question is how close the algorithm can get to the optimal answer. The solutions produced by the approximation algorithm are valid and

Related work

This section describes related work on algorithmic techniques for feature selection and resource allocation and compares it with our Filtered Cartesian Flattening algorithm.

Concluding remarks

To make sound architectural feature selection decisions, developers need the ability to algorithmically generate architectural variants that optimize desired system properties. A key challenge, however, is that selecting a set of architectural features that maximizes a system capability while adhering to resource constraints is an NP-hard problem. Although there are numerous approximation algorithms for other NP-hard problems, they do not directly support optimal feature selection subject

References (38)

  • Benavides, D., Trinidad, P., Ruiz-Cortés, A., 2005. Automated reasoning on feature models. In: 17th Conference on...
  • Capilla, R., et al., 2002. Modelling variability with features in distributed architectures. Lecture Notes in Computer Science.
  • Coffman, E., Garey, M., Johnson, D., 1996. Approximation algorithms for bin packing: a survey. Approximation algorithms...
  • Coplien, J., et al., 1998. Commonality and variability in software engineering. IEEE Software.
  • Cormen, T.H., et al., 1990. Introduction to Algorithms.
  • Dudley, G., Joshi, N., Ogle, D., Subramanian, B., Topol, B., 2004. Autonomic self-healing systems in a cross-product IT...
  • Etxeberria, L., Sagardui, G., 2008. Variability driven quality evaluation in software product lines. In: Software...
  • Ibarra, O., et al., 1975. Fast approximation algorithms for the Knapsack and sum of subset problems. Journal of the ACM (JACM).
  • Immonen, A., 2005. A method for predicting reliability and availability at the architectural level. In: Käkölä, T.,...

    Dr. Jules White is a Research Assistant Professor in the Electrical Engineering and Computer Science Department at Vanderbilt University. Dr. White’s research focuses on applying a combination of modeling and constraint-based optimization techniques to the deployment and configuration of complex software systems. Dr. White is the head of the Eclipse Foundation’s Generic Eclipse Modeling System Project. URL: www.dre.vanderbilt.edu/~jules.

    Brian Dougherty is a Ph.D. candidate in the Electrical Engineering and Computer Science Department at Vanderbilt University. His research work investigates new techniques for using heuristic algorithms to automate the deployment of real-time software systems.

    Dr. Douglas C. Schmidt is a Full Professor in the Electrical Engineering and Computer Science Department and Associate Chair of the Computer Science and Engineering program at Vanderbilt University, Nashville, TN. During the past two decades he has led pioneering research on patterns, optimization techniques, and empirical analyses of object-oriented and component-based frameworks and model-driven development tools that facilitate the development of middleware and applications for distributed real-time and embedded (DRE) systems. Dr. Schmidt is an expert on DRE computing patterns and middleware frameworks and has published over 400 technical papers and 9 books that cover a range of topics including high-performance communication software systems, parallel processing for high-speed networking protocols, quality-of-service (QoS)-enabled distributed object computing, object-oriented patterns for concurrent and distributed systems, and model-driven development tools. Dr. Schmidt received his Ph.D. in Computer Science from the University of California, Irvine in 1994. URL: www.dre.vanderbilt.edu/~schmidt.
