January 28, 2020

PIDapalooza 2020 day 0

I mentioned my upcoming trip for PIDapalooza and now it’s here! I flew out of Boston yesterday evening, couldn’t sleep on the red-eye, stood in an excruciatingly long line at customs in the Lisbon airport, and had a variety of tribulations including nearly getting pickpocketed, but I made it!

After a power nap I woke up with enough time to make it to the Research Organization Registry (ROR) event, and I took that as a sign that I should go, even though it was sold out and I’d be a few minutes late. It was gorgeous and sunny and I found everyone in a packed basement conference room. I haven’t found a good crowd shot on Twitter but I’d say there were a good 40 people at least. Here’s a pic of the organizers. This community is absolutely passionate about PIDs, as if the fingernail polish weren’t a clue. I crowded into one of the few open seats left.

January 27, 2020

LibraryCloud

CURIOSity Digital Collections and Harvard Digital Collections are both built upon APIs available from LibraryCloud but this is only mentioned in passing in some docs. Visually beautiful things can be built on top of APIs.

Back on January 6th I went to a talk on LibraryCloud and met the team. They’re doing great work and it’s open source!

January 26, 2020

The Open Source Software Health Index Project

As the website for the Open Source Software Health Index Project (OSS Health) explains,

“The Harvard Institute for Quantitative Social Science is developing a framework for evaluating open source software applications and their communities and will propose approaches for systematically building scales to quantify the health and sustainability of academic open source projects.”

I joined the team a little late but in early 2019 I did take the survey about what I think is important to look at when evaluating the health of an open source project and later attended the June 2019 workshop where the survey results were discussed with a panel of open source experts. Slides are available from that workshop, including a presentation of the results of the survey.

January 25, 2020

Searchable Linkable Open Public Indexed (SLOPI) Communication or Why open source projects should avoid Slack

Free and open source software projects have a long history of transparent communication. GNU and Linux were announced in publicly archived USENET posts in 1983 and 1991, respectively, that we can still reflect on today. The Debian Social Contract says, “We will keep our entire bug report database open for public view at all times.” Apache “requires all communications related to code and decision-making to be publicly accessible.”

Today many open source projects are turning away from transparency, adopting tools such as Slack for the bulk of their communication. Slack is a great tool for telling co-workers on your floor that you brought in donuts. Slack may be a fine place to discuss security vulnerabilities or code of conduct violations. Open source projects can make good use of Slack. However, if nearly all communication takes place on Slack, transparency suffers.

January 24, 2020

European Dataverse Workshop 2020 day 2

I’m still not in Tromsø, as I mentioned in my post about day 1, but it’s fun to follow along with European Dataverse Workshop 2020 via #dataverse2020.

I was delighted to see poikilotherm looking like such a badass. I guess he was wearing all black when we met in Berlin as well but this is a man who is serious about Dataverse and earns tweets like “A look into the future of DevOps.” You want this guy in your corner.

January 23, 2020

European Dataverse Workshop 2020 day 1

Ok, I’m not actually at European Dataverse Workshop 2020 but I’m enjoying the tweets from afar. I love how #dataverse2020 is trending in Norway right now. I love how Oliver Bertuch describes Dataverse as a vibrant community. Shout outs are always appreciated, of course.

I think it’s cool that the list of participants is online. Without it, I wouldn’t have been able to throw the data in Dataverse and Data Explorer to quickly get a histogram of the 19 countries that are participating. Very impressive!
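The same kind of per-country count can also be sketched in a few lines of Python. This is not the Dataverse and Data Explorer workflow described above, just an illustration of the idea. The rows are made-up placeholders (not the real participant list), and country_histogram and render are hypothetical helper names.

```python
from collections import Counter

# Placeholder rows (name, country). These are invented for illustration
# and do not come from the actual workshop participant list.
participants = [
    ("Participant A", "Norway"),
    ("Participant B", "Germany"),
    ("Participant C", "Norway"),
    ("Participant D", "Netherlands"),
    ("Participant E", "Germany"),
    ("Participant F", "Norway"),
]


def country_histogram(rows):
    """Count participants per country, most common first."""
    counts = Counter(country for _name, country in rows)
    return counts.most_common()


def render(hist):
    """Draw a plain-text bar chart, one line per country."""
    return "\n".join(f"{country:<12} {'#' * n} {n}" for country, n in hist)


histogram = country_histogram(participants)
print(render(histogram))
```

With a real list you would read the rows from the downloaded file instead of hard-coding them; the counting step stays the same.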

January 22, 2020

Harvard DataFest 2020 Day 2

Day 1 of #HarvardDataFest was fantastic and the schedule for day 2 is looking good.

Fernando Pérez

Page of Galileo’s notebook, a precursor to Jupyter notebooks.

Jupyter Meets the Earth: An Open, Collaborative Approach for Earth Data Science by Fernando Pérez

Yes, the logo is what happens when you hand the original drawings from Galileo to a designer.

Founder of Berkeley Institute for Data Science.

The world is literally on fire.

In geoscience, physical models and noisy real world data. Every discipline is flooded with data.

January 21, 2020

Harvard DataFest 2020 Day 1

The schedule for #HarvardDataFest is jam packed with awesome talks. I had to make a lot of tough choices today.

Unfortunately, I had to run out of Data Visualization with Tableau with Jess Cohen-Tanugi to take a call but she did a fabulous job showing how to visualize storm data (hurricanes, tropical storms, etc.), creating maps and “blending” data sources. I plopped myself down next to Alyssa Goodman and I’m so glad I did. She had downloaded the storm data and was playing with it in Glue. After the talk, she gave a 15-minute demo to me and Jess and I was amazed. I had seen Glue years ago (actually, I spun up the initial VMs when working at FAS Research Computing back in 2011 or so) and hadn’t realized how much it had matured, how it’s no longer a tool only for astrophysics. Glue is a visualization tool for all of science, and it’s open source!

January 20, 2020

PIDapalooza

In about a week I’m flying to Lisbon for PIDapalooza 2020.

Looking through the schedule and list of speakers it should be a good conference. I added myself to the list of attendees.

As I mentioned in a thread on the Dataverse mailing list, my talk about Dataverse wasn’t accepted. Oh well, at the conference itself you can propose lightning talks so we’ll see. At least I learned that the creator of OAI-PMH, Herbert van de Sompel, will be there. Awesome.

January 19, 2020

Life 3.0

My digital copy of Life 3.0 by Max Tegmark was just auto returned to the library again. I think I was somewhere in chapter 4.

I’m enjoying it, but when my friend Tony was over the other day I couldn’t even articulate the three stages, so here’s my attempt:

Life 1.0, biological stage: life’s hardware and software are fixed, except through evolution
Life 2.0, cultural stage: life’s hardware is fixed but software is not fixed because learning is common
Life 3.0, technological stage: life designs its own hardware as well as its software

These bullets seem to be fairly similar to a post on Science Friday. Also, I see the author and book appeared on the MIT Artificial Intelligence Podcast, which I’m a fan of, so maybe I’ll give it a listen.