Measuring the health of open source software ecosystems: Beyond the scope of project health

https://doi.org/10.1016/j.infsof.2014.04.006

Abstract

Background

The livelihood of an open source ecosystem is important to different ecosystem participants: software developers, end-users, investors, and other participants want to know whether their ecosystem is healthy and performing well. Currently, no working operationalization is available that can be used to determine the health of open source ecosystems. Health is typically considered at the project scope, not at the ecosystem scope.

Objectives

With such an operationalization, stakeholders can make better decisions on whether to invest in an ecosystem: developers can select the healthiest ecosystem to join, keystone organizers can establish which governance techniques are effective, and end-users can select ecosystems that are robust, will live long, and prosper.

Method

Design research is used to create the health operationalization. The evaluation step is done using four ecosystem health projects from literature.

Results

The Open Source Ecosystem Health Operationalization is provided, which establishes the health of a complete software ecosystem, using the data from collections of open source projects that belong to the ecosystem.

Conclusion

By providing a summary of research challenges, the groundwork is laid for more research into ecosystem health. With the operationalization in hand, researchers no longer need to start from scratch when researching open source ecosystems’ health.

Introduction

“Ruby or Python?” “SugarCRM or a closed-source competitor?” “Drupal or Joomla?” “RedHat or Ubuntu?” These are questions often asked by developers, professionals, entrepreneurs, architects, and stakeholders related to software producing organizations. Choosing between different ecosystems is a complex task, and such a decision will determine many of the future developments within an organization. At present the only way to answer such a question is by doing sufficient reading, asking around, and finding out what the risks are of choosing to enter an ecosystem. One indicator of whether an ecosystem is alive is the health of the keystone project, which can be gauged, for instance, by looking at the activity surrounding the Ubuntu project. Such activity consists of commits, recent releases, fixes, number of downloads, response times in forums and bug trackers, activity on e-mail lists, and contributions from non-developers. However, project health ≠ ecosystem health.

Ecosystem health is operationalized in this work by taking a combined view of a keystone project and its surrounding projects. This work stands on the shoulders of two relevant contributions in the field of ecosystem health measurement. First, the work by Crowston et al. [3], who have provided a first operationalization of open source software project health, is used to establish health factors on the project level. Their work is also fundamental to OSSMole, a collection of meta-data about projects in some of the main repositories, like GitHub and SourceForge. Secondly, the work of den Hartigh et al. [6], where an operationalization of health measurement of a commercial ecosystem is provided, is followed as closely as possible.
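To make the idea of a combined view concrete, the following is a minimal sketch that places the activity of a keystone project next to the activity of its surrounding projects. It is not the operationalization from this article: the record type `ProjectActivity`, its fields, the function `combined_view`, the `min_commits` threshold, and all numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ProjectActivity:
    """Hypothetical per-project record; field names are illustrative only."""
    name: str
    is_keystone: bool
    recent_commits: int      # commits in some recent window (assumed, e.g. 90 days)
    active_developers: int   # distinct committers in the same window


def combined_view(projects, min_commits=1):
    """Summarize keystone activity next to activity in the surrounding projects."""
    keystone = [p for p in projects if p.is_keystone]
    surrounding = [p for p in projects if not p.is_keystone]
    # A surrounding project counts as active if it meets the assumed threshold.
    active = [p for p in surrounding if p.recent_commits >= min_commits]
    return {
        "keystone_commits": sum(p.recent_commits for p in keystone),
        "surrounding_projects": len(surrounding),
        "active_surrounding_projects": len(active),
        # Share of surrounding projects that show recent activity.
        "active_share": len(active) / len(surrounding) if surrounding else 0.0,
        # Naive sum: developers active in several projects are counted repeatedly.
        "active_developers": sum(p.active_developers for p in projects),
    }


# Made-up data: a busy keystone project with partly dormant surroundings.
example = [
    ProjectActivity("keystone", True, 420, 35),
    ProjectActivity("plugin-a", False, 12, 3),
    ProjectActivity("plugin-b", False, 0, 0),
    ProjectActivity("plugin-c", False, 0, 0),
    ProjectActivity("plugin-d", False, 5, 1),
]
summary = combined_view(example)
print(summary)
```

In this made-up example the keystone project looks lively on its own, while only half of the surrounding projects show any recent commits, which is the kind of difference a project-only view would miss.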

Software ecosystems are sets of actors functioning as a unit and interacting with a shared market for software and services, together with the relationships among them [15]. A healthy unit should thus express qualities typically associated with health: liveliness, activity, longevity, etc. For this work, we take a simple definition for software ecosystem health: longevity and a propensity for growth [19]. The definition is only the first step, as both longevity and propensity for growth can be operationalized in different ways with a plethora of different metrics.
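As one illustration of how these two notions could be turned into numbers, the sketch below measures longevity as the time span between the first and last recorded activity, and propensity for growth as the compound yearly growth rate of the number of active projects. Both choices are assumptions made here for the sake of the example, not metrics prescribed by this article; the function names and the sample figures are hypothetical.

```python
from datetime import date


def longevity_years(first_activity: date, last_activity: date) -> float:
    """Time span between first and last recorded activity, in years."""
    return (last_activity - first_activity).days / 365.25


def yearly_growth_rate(counts_by_year: dict) -> float:
    """Compound yearly growth rate between the earliest and latest year.

    counts_by_year maps a calendar year to a count, for example the
    number of active projects in the ecosystem during that year.
    """
    years = sorted(counts_by_year)
    if len(years) < 2:
        raise ValueError("need counts for at least two different years")
    first, last = counts_by_year[years[0]], counts_by_year[years[-1]]
    if first <= 0:
        raise ValueError("the earliest count must be positive")
    periods = years[-1] - years[0]
    return (last / first) ** (1 / periods) - 1


# Made-up figures for a hypothetical ecosystem.
age = longevity_years(date(2004, 1, 1), date(2014, 1, 1))
growth = yearly_growth_rate({2010: 100, 2011: 150, 2012: 225})
print(f"longevity: {age:.1f} years, yearly growth: {growth:.0%}")
```

Other operationalizations are equally defensible, for instance counting active developers instead of projects; the point is only that each term in the definition needs an explicit, measurable counterpart.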

There is a distinct need for an Open Source Ecosystem Health Operationalization (OSEHO). Manikas and Hansen [23] recently published a call to action for the creation of such an operationalization, and laid the groundwork for it. Also, in our research agenda for software ecosystems [17], we call for more research into ecosystem health. Others have attempted to create their own operationalization, but these typically get stuck in the concept phase [31], [3], [30]. In this article, an OSEHO is provided and evaluated using four research projects into open source ecosystem health.

We continue this work with a description of the literature on health measurement in ecosystems and open source projects. Section 3 discusses the creation of the OSEHO and its evaluation challenges. In Section 4, the OSEHO that provides methods for measuring health of open source software ecosystems is presented, consisting of a generic ecosystem health model and a set of methods for analyzing open source ecosystem health. In Section 5 four research projects are presented that apply parts of the model in practice. Furthermore, an analysis of the research projects and their aims (provide insight mostly), the indicators most frequently used (active developers, projects), and the research methods applied (mining repositories, web scraping) are presented. Section 6 presents a set of challenges that are met when applying the model and that were found in the four research projects, mostly having to do with data selection, preparation, and analysis. The article ends with a discussion on the applicability of an OSEHO and a summary of the conclusions and future research challenges.

Section snippets

Literature about ecosystem health

There is surprisingly little literature available about open source ecosystem health. Different perceptions exist and frequently ecosystem and project health are used interchangeably, such as in the work of Lundell et al. [20], who discuss open source ecosystems as being equal to one project. In the continued work of Gamalielsson et al. [9], [8], the responsiveness of developers on the mailing list of the Nagios community is measured as an indicator for open source community health, but does

Research approach

The goal of this research is to provide a comprehensive overview of the health metrics that can be used to determine the health of an open source ecosystem. It does so by creating an inventory of all metrics mentioned in literature that could potentially indicate ecosystem health and then placing these metrics in a framework. The framework can be applied by researchers who aim to reach a goal associated with ecosystem health, such as improve activity in an ecosystem, evaluate the health of one

Open Source Ecosystem Health Operationalization (OSEHO)

Fig. 1 represents the OSEHO. The framework is built up out of three pillars, being the productivity, robustness, and niche creation pillars, which are addressed in the discussion of the literature in Section 2. The pillars are separated into three layers, being the theory level, the network level, and the project level. At the top level is displayed what the theoretical model of Den Hartigh prescribes to use as guidelines for operationalizing the health concept, which in turn is inspired by

Analysis of the research projects

Four research projects have been selected to illustrate the use of the OSEHO. The selection criteria have been listed in Section 3.

The first project applies ecosystem health metrics to determine how healthy the ecosystems surrounding commercial Platform as a Service providers are [19]. The goal was to provide stakeholders in these ecosystems with insight into their ecosystem development and the most important metrics that indicate success in these ecosystems. The data source was GitHub and

Repository mining research challenges

The research challenges from the projects are listed in Table 1; they are collected and summarized into common research challenges that together form a research agenda. Each of the terms in bold can be considered a challenge for any new ecosystem (health) study that involves repository mining. The challenges are split into data selection challenges and data preparation and analysis challenges.

Discussion

The framework is evaluated using the research projects described in the previous sections. There are currently few works on ecosystem health available and the selection of just four research projects is somewhat meager. As these research projects do not fully cover the metrics in the framework, the work cannot be considered completely evaluated. The OSEHO can be further evaluated in the future with more projects that study ecosystem health. The framework, however, is the most complete framework

References (31)

  • S. Jansen et al., Shades of gray: opening up a software producing organization with the open software enterprise model, J. Syst. Software (2012)
  • A. Baars et al., A framework for software ecosystem governance
  • O. Barbosa et al., A systematic mapping study on software ecosystems through a three-dimensional perspective
  • K. Crowston et al., Information systems success in free and open source software development: theory and measures, Software Process: Improvement and Practice (2006)
  • K. Crowston et al., Open source software projects as virtual organisations: competency rallying for software development, IEEE Software (2002)
  • M. Cusumano, Staying Power: Six Enduring Principles for Managing Strategy and Innovation in an Uncertain World (Lessons from Microsoft, Apple, Intel, Google, Toyota and More) (2012)
  • E. den Hartigh, M. Tol, W. Visscher, The health measurement of a business ecosystem, in: S. Jansen, M. Cusumano, S....
  • D. Dhungana et al., Guiding principles of natural ecosystems and their applicability to software ecosystems
  • J. Gamalielsson et al., The Nagios community: an extended quantitative analysis
  • J. Gamalielsson, B. Lundell, B. Lings, Responsiveness as a measure for assessing the health of OSS ecosystems, in:...
  • G. Gousios, The GHTorrent dataset and tool suite
  • N. Haenni, M. Lungu, N. Schwarz, O. Nierstrasz, Categorizing developer information needs in software ecosystems, in:...
  • R. Hoving et al., Python: characteristics identification of a free open source software ecosystem
  • M. Iansiti et al., The Keystone Advantage: What the New Dynamics of Business Ecosystems Mean for Strategy, Innovation, and Sustainability (2004)
  • M. Iansiti et al., Strategy as ecology, Harvard Business Review (2004)