Open Source at Harvard
The other day I mentioned in passing that Open Source at Harvard has a new website, and when I wrote about GitHub Actions I was taking a look at how that website is deployed.
But is Open Source at Harvard really a thing? If so, what is it? Let’s dig in.
(Today I threw together the logo above in 60 seconds. Patches welcome!)
I’ve been fascinated with open source since I first discovered it in the late 90s.
HarvardWIT+ Mentoring
HarvardWIT+ asked for mentors to participate in their mentoring program and I signed up. My mentee is very nice and we met today.
She has a variety of ideas about where she’d like to take her career and has expressed an interest in learning Python or R. I asked her to think of something from work that she could use as a first project.
She explained that when looking at a record in ALMA with a number like 99153754014903941, what she really wants is the older HOLLIS number (better for searching) that’s embedded within it. She has figured out that you can remove the first two digits (99) and then use only the next 9 digits to get the HOLLIS number, like this:
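Here is a minimal Python sketch of that rule, since Python is one of the languages she wants to learn. The function name `hollis_from_alma` and the input checks are my own assumptions for illustration. The only thing taken from her description is the rule itself: drop the leading 99, then keep the next 9 digits. The number is handled as a string so that any leading zeros survive.

```python
def hollis_from_alma(alma_id: str) -> str:
    """Return the 9-digit HOLLIS number embedded in an ALMA record number.

    Hypothetical helper: the rule (strip the leading "99", keep the next
    9 digits) comes from the description above; the validation is assumed.
    """
    prefix = "99"
    hollis_length = 9

    # Assumed sanity checks; real ALMA numbers may need looser or stricter rules.
    if not alma_id.isdigit():
        raise ValueError(f"expected only digits, got {alma_id!r}")
    if not alma_id.startswith(prefix):
        raise ValueError(f"expected a number starting with {prefix}, got {alma_id!r}")
    if len(alma_id) < len(prefix) + hollis_length:
        raise ValueError(f"number is too short: {alma_id!r}")

    # Slice out the 9 digits that follow the prefix.
    start = len(prefix)
    return alma_id[start:start + hollis_length]


print(hollis_from_alma("99153754014903941"))  # 153754014
```

A string slice keeps the example short and readable for a first project. A regular expression would work too, but it would be more to explain.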
Quarkus
Today Daniel Oh from Red Hat came to IQSS to tell us about Quarkus.
Backstory
There’s some backstory to this. Last summer I was really fired up about Quarkus after hearing about it on various podcasts and then hearing Daniel give a talk about it at DevConf. (That talk is on YouTube and his slides are available.) I went home and tried Quarkus (and GraalVM) myself. It really worked. Awesome.
Shortly after, I sent the following message (slightly edited below) to the Dataverse internal mailing list. In short, I was suggesting that we could rewrite a Django app in Quarkus.
The Turing Way
The other day a contributor to The Turing Way said she enjoyed the videos I made some noise about last week. She even asked if she could add them to their newsletter.
“Of course!” I said, and today that newsletter was published.
You can read it on tinyletter but here’s the part I’m talking about:
The Whole Tale open source platform for reproducible research
The Whole Tale project is an NSF-funded initiative building a scalable, open source platform for reproducible research. Whole Tale supports the creation, publication, and execution of tales – executable research artifacts that capture data, code, and the complete software environment required to reproduce computational results. They, along with the Harvard Dataverse team, have set up a platform to demonstrate tales using the Whole Tale platform as a dataset-level external tool on the demo instance at https://demo.dataverse.org.
Fernando Pérez
Fernando Pérez will be our keynote speaker at DataFest 2020 next Wednesday and I’m looking forward to introducing him to various groups that are eager to meet with him.
I’ve been looking through his recent tweets and I like what I see:
- Guix-Jupyter: Towards self-contained, reproducible notebooks
- How GESIS joined the Binder federation
- The lifecycle of open source scientific software
I love that he retweeted Péter Király and Carol Willing, whom I consider friends, as well as Dario Taraborelli, whom I met at an open science event a while back.
GitHub Actions
When the blog post went out about GitHub Actions I didn’t notice it but today I got a nice brain dump about them.
My understanding is that GitHub Actions are all about Docker. You specify which Docker images you want to run and which commands to run within them. The example we were looking at builds a site using Jekyll and then pushes the static files to the master branch:
name: Build and deploy
on:
  push:
    branches: develop
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v1
      - name: Build
        uses: docker://jekyll/jekyll
        with:
          entrypoint: bash
          args: -c "/usr/local/bin/bundle install && /usr/local/bin/bundle exec jekyll build"
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
      - name: Deploy
        run: |
          sudo chown -R $(whoami):$(whoami) .
          git config --global user.email "$GITHUB_ACTOR@users.noreply.github.com"
          git config --global user.name "$GITHUB_ACTOR"
          cp -r _site /tmp
          git checkout master
          rm -rf *
          mv /tmp/_site/* .
          git add --all --force
          git commit --allow-empty-message --message "$(git log $(git rev-parse origin/develop) --oneline --format=%B -n1 | head -n1)"
          git remote set-url origin "https://$GITHUB_ACTOR:${{ secrets.GITHUB_TOKEN }}@github.com/$GITHUB_REPOSITORY"
          git push --force origin master
Helper Bees
This afternoon the Helper Bees team is having its biggest meeting since our kickoff.

With Helper Bees we are matching student volunteers to neighborhood donors. At least that’s the description I put for the repo on GitHub, which has the following sequence diagram that a friend said is worth a thousand words:

Here’s a rough timeline of Helper Bees from my perspective:
- October 6, 2019: Initial idea, helperbees.org registered
- October 21, 2019: Onboarding product owner
- October 25, 2019: Prototype at Halloween Bash
- November 5, 2019: Signing up kids to volunteer at the Flatbreads Pizza and Brighton Bowl party
- November 20, 2019: Launch
- November 27, 2019: Talking it up with the principal and others at the DFL Superbowl
- December 10, 2019: Helping parents use the live site at Craft Fair
- January 12, 2020 (today): Passing the baton for next year.
See, there are four of us on the core Helper Bees team but I’m the only one who has a younger child in the school. The other three will understandably be moving on.
Fennel
I really enjoyed episode 30 of Libre Lounge where Serge Wroclawski interviewed Phil Hagelberg about his language Fennel, which is a Lisp that runs on Lua runtimes.
I’m not very familiar with Lua so it was interesting to hear that the reference implementation is only 200 kilobytes. I also like the idea that Lua is “relentlessly simple” with only a table as a data structure. The fact that you have to invent your own object system (if you want one) reminds me of Perl.
CS50 and Kubernetes
Today I went to the 411th consecutive ABCD main meeting to hear David Malan and Kareem Zidane talk about CS50’s use of Kubernetes. The talk was fantastic and the slides are already available. It looks like a previous similar talk is on YouTube.
As luck would have it, I crossed paths with David on the way to the venue so we got a chance to chat a bit. I took CS75 in 2009 with him in person and really enjoyed it. It was taught in PHP and my final project is still on GitHub. Fast forward to late 2019, about a month ago, and I reminded him who I am and explained an Open Source at Harvard dataset I’ve been curating for a few years. Within a week he had created a site based on that data. It sounds like Kareem played an important role so I was glad to meet him and plan to pick his brain about all this soon.
Graphic design
Today was the 15th and 20th work anniversaries of two of my longtime colleagues. Speeches were made. Cake was eaten. I heard about catacombs I didn’t know existed under a nearby building. It was a great time.
I sat next to our resident logo designer (one of many hats he wears) and explained that I had told my daughter Erika about him the other day when we were driving home from cello practice: if she’s having fun designing logos, that’s a real job.