Overview
Note: The term “sustainability index” is tentative and might change.
Goal
The goal of this project is to develop a set of metrics or indicators to evaluate the “sustainability” of open source projects. We want to answer the question: “What are the quantitative metrics that indicate an open source project is stable and sustainable?”
The sustainability index (SI) is meant to be an assessment tool, not a judgment or a ranking. If we are successful, the SI and associated metrics or model will help identify:
- Which OSS projects are most in need of support and, to some extent, the scale of support needed.
- How the sustainability of OSS projects changes over time.
How this supports the NumFOCUS Sustainability Program
Such information directly supports the work of the NumFOCUS (NF) Sustainability Program because it will:
- Help me prioritize the work of the Sustainability Program.
- Provide an objective/quantitative assessment of the program’s work.
- Give an objective/quantitative way to communicate about the sustainability needs of current NF projects and potentially evaluate projects applying to join NF.
- Over time, provide a method by which to measure the impact of NF’s overall sustainability efforts.
How this supports the larger OSS community
While the SI will initially be applied to NF projects, the methodology and index should also be applicable to many other OSS projects.
What do we mean by sustainability?
Broadly speaking, by “sustainability” we mean that projects have sufficient and appropriate resources and knowledge to:
- Maintain software such that it continues to meet the needs of its users and contributors, for example by fixing bugs, patching security vulnerabilities, and adding new features.
- Respond to new challenges and computer architectures.
Scope of applicable projects
During the initial phases of the SI project, we will likely limit our analysis and model development to OSS projects of a certain maturity, size, or adoption threshold. The exact parameters for this are to be determined, but the idea is to focus on OSS generally considered to be “well-established.”
Project details
Multiple phases
Determining the metrics and algorithms that will be used to develop the indicators and subsequently the SI is a major aspect of this project. Most likely it will include multiple phases, each one building upon the previous.
The first phase of this project will be to identify a first relevant indicator that we can, with reasonable effort, collect, calculate, compare, and report across all NumFOCUS projects, and then to build a preliminary mechanism for doing so on an ongoing basis. No such infrastructure currently exists within NumFOCUS, and building it will not be a trivial task. Thus, focusing at first on a single, carefully chosen metric is the most efficient way to build a proof of concept, or a “minimum viable product.”
Once we have built a centralized reporting mechanism, we can take what we learned doing that for one indicator and improve the process for additional indicators in subsequent phases. Once we have identified sufficient indicators and have a working way to collect, analyze, and report on them, we can work towards an aggregated (computed) sustainability index or model.
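To make the idea of an aggregated index concrete, here is a minimal sketch in Python of one possible aggregation: each indicator is min-max normalized across projects and the results are combined as a weighted average. This is only an illustration, not the method the project has chosen. The function names, the indicator names, the project names, the weights, and the assumption that a higher raw value is always better are all hypothetical.

```python
def min_max_normalize(values):
    """Rescale a {project: raw_value} mapping to the range [0, 1]."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        # All projects have the same value; give them a neutral score.
        return {project: 0.5 for project in values}
    return {project: (v - lo) / (hi - lo) for project, v in values.items()}


def composite_index(indicators, weights):
    """Weighted average of normalized indicators, per project.

    indicators: {indicator_name: {project: raw_value}}
    weights:    {indicator_name: non-negative weight}
    Assumption: a higher raw value means "more sustainable" for every
    indicator, and every indicator covers the same set of projects.
    """
    total_weight = sum(weights[name] for name in indicators)
    scores = {}
    for name, raw in indicators.items():
        for project, value in min_max_normalize(raw).items():
            scores[project] = scores.get(project, 0.0) + weights[name] * value
    return {project: s / total_weight for project, s in scores.items()}


# Hypothetical data for three made-up projects.
example_indicators = {
    "bus_factor": {"proj_a": 1, "proj_b": 3, "proj_c": 5},
    "merged_prs_per_month": {"proj_a": 10, "proj_b": 20, "proj_c": 30},
}
example_weights = {"bus_factor": 0.5, "merged_prs_per_month": 0.5}
example_scores = composite_index(example_indicators, example_weights)
```

A weighted average is the simplest choice and is easy to explain, but it hides trade-offs between indicators. Choosing the weights, and deciding whether a linear combination is appropriate at all, would be part of the later phases.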
Ultimately, creating a robust sustainability index or model will require analyzing many OSS “reference” projects in addition to NumFOCUS sponsored ones.
Potential initial indicators
Some indicators we are considering for phase one:
- “Bus factor”: the number of “key persons” whose departure from the project would cause activity to halt.
- Ratio of paid to unpaid contributors or contributions.
- Distribution of organizational affiliation or similar network analysis.
- “Velocity” of work, e.g., the volume and rate of commits over time; the volume and rate of pull requests or patches submitted and accepted or rejected; and the volume and activity of issues and their resolution states.
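As an illustration of the first indicator in the list above, the following Python sketch estimates a bus factor from a list of commit authors. It uses one common heuristic: the smallest number of authors who together account for at least half of all commits. That heuristic, the 50% threshold, the function name, and the sample author names are assumptions made for this example; the project has not settled on a definition.

```python
from collections import Counter


def bus_factor(commit_authors, threshold=0.5):
    """Smallest number of authors whose commits together make up at least
    `threshold` of all commits (a heuristic, assumed for illustration).

    commit_authors: iterable with one author identifier per commit.
    """
    counts = Counter(commit_authors)
    total = sum(counts.values())
    if total == 0:
        return 0  # no commits, so no key persons can be identified

    covered = 0
    # Walk through authors from most to least active.
    for rank, (_author, n_commits) in enumerate(counts.most_common(), start=1):
        covered += n_commits
        if covered / total >= threshold:
            return rank
    return len(counts)


# Hypothetical commit logs: one entry per commit.
concentrated = ["alice"] * 6 + ["bob"] * 3 + ["carol"] * 1
spread_out = ["alice"] * 4 + ["bob"] * 3 + ["carol"] * 3

concentrated_bf = bus_factor(concentrated)  # alice alone covers 60%
spread_out_bf = bus_factor(spread_out)      # alice and bob together cover 70%
```

Counting commits is a rough proxy. It ignores commit size, code ownership, and the non-code work discussed below, so a real indicator would likely need a richer input than a flat author list.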
Of course, code contribution activity is not representative of all the work that goes into producing open source software. As such, we also want to find a way to track and meaningfully analyze non-code contributions such as mailing list posts, chat activity, documentation, training and outreach, product management, UX, design, release management, issue triage, and more. Because this work is carried out and coordinated in a less centralized way, collecting information about it presents a greater challenge than with code contributions, which happen almost entirely on GitHub. We plan to consider these indicators in subsequent phases of this project.
Working in the open
As much as possible, work in the SI project will be done “in the open,” with public documentation, open meetings, and calls for participation. I think this is critical not only for producing a high-quality outcome but also for ensuring that our project community is engaged and “bought in” to the work.
Next Steps
This project brief serves as the kick-off for this project. As lead, my next steps are:
- Announce the project and ask for participation. The brief will serve as the basis for the announcement and call for participation and be distributed to: SAB, NumFOCUS project leads, NumFOCUS board, and Josh Greenberg at Sloan. After a short time to collect initial feedback, I would also like to post on the NF blog as well as my personal blog.
- Create project management infrastructure: GitHub repo with roadmap, issues, and milestones; Google group, etc. I will likely be working on this while collecting feedback from the groups mentioned above and have it ready before announcing the project more widely.
Once the project is announced, the first milestone to work towards will be identifying a key metric we’re able to collect with a reasonable amount of effort. Subsequent milestones will be to figure out the best way to collect, analyze and report on that metric across NF projects.
How to get involved
The SI project represents a significant undertaking, and anyone who is interested is welcome to contribute. We will have regular open meetings and a number of ways to follow along and contribute asynchronously.
Specifically, I would like to form a working group composed of folks with experience in the following:
- Data visualization and analysis.
- Developing indicators and indexes (altmetrics).
- Maths and computation, statistical modeling, etc.
- Knowledge of OSS community management, software production methodologies, and business models to help identify meaningful metrics and indicators and also to help with messaging.
- Programmers (including but not limited to a few already familiar with NumFOCUS projects) who can help to implement data collection, analysis, and reporting.
- A co- or assistant organizer who could back me up on community organizing tasks.
- Other areas of expertise I haven’t yet considered. Suggestions welcome!
To get involved, drop me an email (sustainability@numfocus.org) and let me know what you’d like to help with and what your availability is like.