The Open Source Software Health Index Project
As the website for the Open Source Software Health Index Project (OSS Health) explains,
“The Harvard Institute for Quantitative Social Science is developing a framework for evaluating open source software applications and their communities and will propose approaches for systematically building scales to quantify the health and sustainability of academic open source projects.”
I joined the team a little late, but in early 2019 I took the survey about what I think is important to look at when evaluating the health of an open source project. I later attended the June 2019 workshop, where the survey results were discussed with a panel of open source experts. Slides from that workshop are available, including a presentation of the survey results.
All this is summarized toward the end of a blog post on the project website, but personally I was very happy to meet several people who were already famous to me in open source or academia:
- Carol Willing, leader in Python and Jupyter circles
- Nadia Eghbal, author of Roads and Bridges and serial podcaster on open source topics
- Arfon Smith, creator of the Journal of Open Source Software (JOSS)
- Matt Turk, creator of the yt project
- Dan Katz, leader in software citation
The workshop also gave me the opportunity to meet some new (to me) people in open source:
- Anna Filippova, data scientist at GitHub
- Karthik Ram, creator of rOpenSci
- Sean Goggins, founding member of CHAOSS
- Matt Germonprez, founding member of CHAOSS
CHAOSS, which stands for “Community Health Analytics Open Source Software,” turns out to have been a key organization for the project. Sean and Matt gave a talk entitled “CHAOSS community metrics and tooling,” and we decided that their Augur software was especially interesting.
Fast forward to September 2019, when the OSS Health team at Harvard met with Sean and Matt from CHAOSS for an entire day. We spent most of the time looking at how the factors in our survey results could map to metrics that have been defined by CHAOSS.
We also talked about which 20 open source projects to study. These projects all needed to have something to do with research or academic libraries. Here's the list, as mentioned in the blog post:
- Archi
- Archivematica
- Bioconductor
- Blacklight
- CORAL
- Dataverse
- Districtbuilder
- DSpace
- Fedora Commons
- JabRef
- Jupyter Notebook
- LOCKSS
- Mirador
- Omeka
- Open Journal Systems
- Parsl
- R Markdown
- Scikit-learn
- Stencila
- Zotero
Fast forward to late 2019 and today: we've met with Sean a couple of times to talk more about the installation of Augur he set up for us and the database that underlies it. He's providing us with database queries, and we're having fun exploring the data.