Advancing Science with Dataverse
As planned, on Sunday I gave a talk at FOSDEM called “Advancing science with Dataverse: Publication, discovery, citation, and exploration of research data.”
I was eager to introduce a bunch of open source hackers like myself to the related concepts of open science and open access. Afterward, Adina Wagner told me I did a good job of connecting these worlds. Phew.
I’m also happy, of course, to get the word out about Dataverse.
Immediately after my talk a few people hung around to chat. I can’t remember what the first guy said but I remember giving him a card. Ed Thomson from GitHub, whom I remembered from a GitHub issue, wanted to pick my brain about reproducibility.
Guix-Jupyter
The one talk I attended in person in the HPC devroom on Sunday at FOSDEM was Interactive HPC, but now I’m working my way through the recordings.
“Towards reproducible Jupyter notebooks” caught my eye, and it was really interesting. The software is called Guix-Jupyter, and I already mentioned it in a previous blog post where I pointed out a tweet that links to a post on the Jupyter forum.
To motivate the discussion, we first see a tweet by Daniel Katz:
Interactive HPC
Today I told a couple members of the Sid team about a talk at FOSDEM I attended on Sunday called “Interactive applications on HPC systems Jupyterhub, Galaxy, RStudio, XPRA.”
I pointed out that the slides show that they’ve looked at solutions like OnDemand but built their own thing.
I’m tempted to put this new project on my list of things that look like Sid but I don’t think it has a name and it isn’t open source.
Google Summer of Code
Today it was announced that Dataverse will apply to participate in Google Summer of Code 2020. Awesome.
Update: I suggested a data-driven approach of using Druthers to figure out what users want.
SMACKI
Today my friend Slava tweeted the following at me:
“I’m thinking for a long time how to turn all SLOPI communication channels into #knowledgebase with some NLP pipelines and archive it in @dataverseorg. Probably another crazy idea for Hackathon in June during community meeting.”
In my reply, I tried to explain that yes, I definitely want a knowledge base, but after a recent conversation with a friend, I’m wondering if it should be built on Blacklight.
FOSDEM day 2
I mentioned that my FOSDEM talk was accepted a while back. Awesome.
Here are some devrooms I’m planning on checking out:
FOSDEM day 1
I mentioned that my FOSDEM talk was accepted a while back. Awesome.
Here are some devrooms I’m planning on checking out:
Dataverse at the State Archives of Belgium
Quick news:
- Dataverse Community Meeting Trello board:
- DataverseTV
- Dataverse Meetups at dataverse.org/events
- GDCC Developer Jim Myers
- Curation services from Harvard Dataverse
New community stuff:
- Dataverse map (dataverse-installations)
- Druthers
- Installation personas
- dataverse-sample-data
New features:
- File previewers
- Data Curation Tool
- OIDC
- Move dataset
- BagIt export
- Schema.org JSON-LD (Google Dataset Search)
- OpenAIRE export
- Make Data Count
- Help update list of features?
Code in Dataverse:
- AJPS
- software citation
- (Dorthea) software metadata block
- 5 Dockerfiles
Reproducibility:
PIDapalooza 2020 day 2
I gave a 7-minute lightning talk with the title “Dataverse: A community dedicated to publishing research data.” My slides.
In-Text Reference Pointer Identifier (InTRePID)
New RA. Something to do with supply chain.
Update: Other people are blogging about PIDapalooza 2020:
PIDapalooza 2020 day 1
The opening keynote was "Towards the Circular Science: PIDs for a new generation of knowledge creation and management paradigm in Portugal: from vision to reality" (Maria Fernanda Rollo, NOVA FCSH). Notes: challenges and trends, including the new paradigm of data-driven science. Drawing. Reducing bureaucracy in science. Student IDs allow aggregation of information. Science ID and Ciência Vitae.
George Duimovich: media concentration.
"I shall be released or how to stop worrying about new versions." Maria Helena Vieira Room. Martin Fenner. Bob Dylan, 1968. Dilemma. Federation or specificity/verifiability. Credit vs. verifiability.