Date: 26 Apr 2019 Title: Frictionless Data Tool Fund
This is the original idea I submited to the Frictionless Data Tool Fund.
What are you proposing to build?
A tool that automatically converts Darwin Core Archive into Tabular Data Package. Darwin Core is a list of standardized terms, widely used by biodiversity research community, enabling open data, interoperability and re-use. Darwin Core Archive is a standardized container for DawrinCore data + metadata. The suggested tool will :
- convert DwC meta.xml into datapackage.json
- convert DwC metadata (eml.xml) into README.md
- if neccessary, adapt DwC core and extension(s) data files into csv files
The tool, to be written in Go language, will depend on datapackage-go.
Why should we choose to work with you?
I’ve been involved in GBIF and Biodiversity Informatics for more 10 years, working with Darwin Core and Open Biodiversity data on daily basis. Eight months ago, I discovered Frictionless Data and published a first Tabular Data Package. It showcases the historical data assembled for my website on Napoleonic Campaign of Belgium, in June 1815. Now, I’d like to build a bridge between these two open data ecosystems.
What do you think are the advantages of the Frictionless Data approach towards removing the friction in working with data?
Frictionless specifications are light weighted, still they offer essential metadata description such as data types, primary or foreign keys, integrity constraints… making data packages self-explanatory and re-usable.
When are you able to start and complete the work?
Between May and December 2019, I will dedicate up to one day per week to this project.
Is there anything else you would like to let us know?
Additionally, I will investigate possible synergies and incompatibilities between Darwin Core and Data Package standards that could benefit to the GBIF community.