An Open Source Eco-System?

To summarise the prior writings from my doctoral thesis: I've shown that there are both notable differences and similarities between certain measures of FLOSS repositories (such as number of contributors attracted, rate of contributions, complexity control work, etc.). The pattern of similarities and differences that emerged clearly differentiated one group (containing Debian, GNOME, and KDE) from the other (Savannah and SourceForge), which I tentatively named "controlled" and "open" repositories respectively. Furthermore, the results demonstrated that Debian deserved to be differentiated somewhat from GNOME and KDE, although to a lesser extent.

What is also interesting is that when we compare repositories that are grouped together (GNOME to KDE, or Savannah to SourceForge) we observe very strong similarities in these same measures. This raises intriguing possibility number one: that one can establish "types" by which repositories can be classified. A type would be defined not only by the sorts of measures we can expect, but also by the way the repository functions and organises itself. For example, Savannah and SourceForge often incubate newer software of any kind and are open to any contributors. Conversely, GNOME and KDE deliver more well-defined programs and erect meritocratic barriers to entry for contributors and new software.

Now, when you consider that individual projects in these repositories can transit between them, this raises the second intriguing possibility: these types could be arranged into an evolutionary framework, an eco-system of repositories, if you like. This diagram is an attempt to visualise it.

Figure: Framework of progression for FLOSS projects through various types of repositories

Definitions

  • Open Repository: A repository with a low barrier to entry. That is, the process of joining and adding software is essentially trivial and uncontrolled. Projects tend to be independent and no guiding policies cover the development or organisation process (aside from any terms of service agreements). Examples: Savannah, SourceForge.
  • Controlled Repository: A repository with a higher barrier to entry. That is, joining as a developer and adding new projects are both subject to the control of existing members. There is likely a set of guiding policies and development standards enforced throughout the community, as well as goals/roadmaps. Within this group, one can further differentiate:
    • Distributions: E.g. Debian. The projects hosted here are part of the larger GNU/Linux operating system. From the point of view of the process, Debian developers are not typically programming for other Debian projects.
    • Meta-Projects: E.g. GNOME and KDE. The projects contained in a meta-project are each part of a wider system (in these examples, each is a complete desktop environment), and there are typically several glue projects also.
  • Transition: Moving a project from one repository to another. This could be a migration (whereby the storage location of a project is changed), or an inclusion (whereby another repository distributes the project, but its location remains the same).
    • Bold arrows: These are the pathways between repository types that FLOSS projects are typically observed to take.
    • Dashed arrows: These represent atypical transitions, observed much less frequently. No empirical study has been performed on these transitions.
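
As a rough illustration, the definitions above could be written down as a small data model. This is only a sketch: the class and function names (RepositoryType, TransitionKind, Transition, is_controlled) are invented here for illustration, and the example transition is hypothetical rather than an observed one.

```python
from dataclasses import dataclass
from enum import Enum


class RepositoryType(Enum):
    OPEN = "open"                  # low barrier to entry
    DISTRIBUTION = "distribution"  # controlled, e.g. Debian
    META_PROJECT = "meta-project"  # controlled, e.g. GNOME and KDE


class TransitionKind(Enum):
    MIGRATION = "migration"  # the project's storage location changes
    INCLUSION = "inclusion"  # another repository distributes it; location unchanged


# Classification of the repositories named in the definitions.
REPOSITORY_TYPES = {
    "Savannah": RepositoryType.OPEN,
    "SourceForge": RepositoryType.OPEN,
    "Debian": RepositoryType.DISTRIBUTION,
    "GNOME": RepositoryType.META_PROJECT,
    "KDE": RepositoryType.META_PROJECT,
}


def is_controlled(repo_type):
    """Distributions and meta-projects are the two kinds of controlled repository."""
    return repo_type in (RepositoryType.DISTRIBUTION, RepositoryType.META_PROJECT)


@dataclass(frozen=True)
class Transition:
    project: str
    source: str
    target: str
    kind: TransitionKind

    def crosses_types(self):
        """True when the project moves between two different repository types."""
        return REPOSITORY_TYPES[self.source] is not REPOSITORY_TYPES[self.target]


# Hypothetical example, not an observed transition.
example = Transition("example-project", "SourceForge", "GNOME", TransitionKind.MIGRATION)
```

The model deliberately does not encode which transitions are typical (bold arrows) or atypical (dashed arrows), since that information lives in the diagram.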

Consequences

The evidence I have gathered points to this being an evolutionary framework, because the differences observed are mostly differences in rates of activity rather than absolute differences in quality. Note that these statements pertain to the average project observed in each repository; the process and product attributes overlap between repositories. It should therefore not be expected that every project in a repository will perform at the levels established as average for that repository.

What might reasonably be expected of an arbitrary project's measured attributes no doubt depends on a number of other factors that influence a free software project's success, such as the understandability of the initial version or the existence of alternative projects with similar or identical functionality. While this framework holds that the repository is an important driver of project evolution (especially at the macro level), for the individual project it is one of a number of factors. When considering the consequences of a repository’s effects, this contextual detail determines how the framework may be useful.

Understandably, a software developer can be expected to be primarily interested in how their own individual project may be influenced, yet this framework is informed by measurements of a collection of projects. The range of values observed for a metric acts as an indicator of the expected value for a repository type, or even a specific repository. It is feasible that any desired quantifiable attribute of a project can be measured and built up into a set of values, similarly to the work shown in previous posts. Critically, this gives a developer a quantifiable basis for a comparative judgement of where a project is best placed for its particular evolutionary needs, as well as for judging how and why a transition should be brought about in future. This planning will always need to be balanced against the typical participatory requirements of each repository type.
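
To make this comparative judgement a little more concrete, here is a minimal sketch of one way it might be done. It assumes the "range of values" is summarised by the interquartile range, which is a choice made for this illustration and not necessarily the summary used in the thesis. The metric and the numbers in OBSERVED are made-up placeholders, not measured data, and the function names are hypothetical.

```python
from statistics import quantiles

# Made-up placeholder values for one metric (say, contributions per month),
# grouped by repository type. These are NOT data from the thesis.
OBSERVED = {
    "open": [1, 2, 2, 3, 4, 5, 8],
    "meta-project": [10, 14, 15, 18, 22, 25, 40],
}


def typical_range(values):
    """Return the (first quartile, third quartile) pair for a set of observations."""
    q1, _median, q3 = quantiles(values, n=4)
    return q1, q3


def matching_types(project_value, observed):
    """List the repository types whose typical range contains the project's value."""
    matches = []
    for repo_type, values in observed.items():
        low, high = typical_range(values)
        if low <= project_value <= high:
            matches.append(repo_type)
    return matches
```

A project whose value falls inside no type's typical range, or inside several, is a reminder that the ranges overlap and that the repository is only one of a number of factors.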

For researchers it is useful to understand a repository's influence at both the individual and collective project levels. When embarking upon an empirical study of free software projects, knowledge of the evolutionary effects expected of a particular repository, or of its repository type, is useful because it may have a bearing on how results are interpreted.

 

karl

 

4 thoughts on “An Open Source Eco-System?”

  1. > For researchers it is useful to understand a
    > repository’s influence at both the individual and
    > collective project levels.
    Here it could be interesting to look into different types of repositories: there is no black/white (open/controlled) divide.
    Whereas on SF.net it is quite easy to start new projects, GNA and seul.org have higher barriers. Does this higher barrier make those sites in-between/hybrid repositories?

    I think it is worth keeping in mind the difference between starting a (sub)project and contributing to a project. On SF.net, it is easy to start a new one, but it may be difficult to contribute to one if there is low activity and no-one reacts to your bug reports or messages. Meta-projects (having KDE in mind) are more careful in accepting new (sub)projects, as those have to fulfil some (informal) quality standards (a history proving active maintainership, not watering down the brand, etc.). Contributions are usually processed more systematically and accepted if deemed worthwhile, as there is a well-established and actively used/monitored system of bug trackers, review boards, IRC etc.

  2. @Thomas: Thanks for the comments. In my thesis I deal with both the issue of “hybrid” repos and a deeper discussion of exactly what you refer to as the way meta-projects work vs. the forges.

    Sadly, I have to limit my discussion on the blog; I can’t reproduce my 150-page thesis here verbatim, although it is available via Uni. of Lincoln or the British Library.

    And nice choice of design on your blog by the way :)
