A CKAN extension for assigning a digital object identifier (DOI) to datasets, using the DataCite DOI service.
When a new dataset is created it is assigned a new DOI. This DOI will be in the format:
https://doi.org/[prefix]/[random 7 digit integer]
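As a rough illustration of that format, the sketch below builds such a URL. The function name `build_doi_url` is hypothetical, and reading "7 digit" as the range 1000000 to 9999999 is an assumption; the extension's own generator may differ (for example by zero-padding smaller numbers or checking for collisions).

```python
import random


def build_doi_url(prefix, rng=random):
    """Return a DOI URL of the form https://doi.org/[prefix]/[7 digit integer]."""
    # Assumption: a "7 digit integer" has no leading zeros.
    suffix = rng.randint(1000000, 9999999)
    return 'https://doi.org/{}/{}'.format(prefix, suffix)


# 10.5072 is the DataCite test prefix mentioned in the configuration notes.
print(build_doi_url('10.5072'))
```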
If the new dataset is active and public, the DOI and metadata will be registered with DataCite.
If the dataset is draft or private, the DOI will not be registered with DataCite. When the dataset is later made active and public, the DOI will be submitted. This allows datasets to be embargoed while still providing a DOI that can be referenced in publications.
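The registration rule above can be sketched as a simple predicate. This is a minimal sketch, assuming the dataset dict carries CKAN's usual `state` and `private` keys; the helper name `should_register` is hypothetical and is not taken from the extension.

```python
def should_register(pkg_dict):
    """Return True only when the dataset is both active and public."""
    # Assumption: 'state' is a string such as 'active' or 'draft',
    # and 'private' is a boolean that defaults to False when absent.
    is_active = pkg_dict.get('state') == 'active'
    is_public = not pkg_dict.get('private', False)
    return is_active and is_public


print(should_register({'state': 'active', 'private': False}))  # True
print(should_register({'state': 'draft', 'private': False}))   # False
print(should_register({'state': 'active', 'private': True}))   # False
```

An embargoed dataset stays private (or draft) until release, so the predicate only becomes true at the point the DOI should be submitted.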
You will need an account with a DataCite DOI service provider to use this extension.
This extension uses DataCite Metadata Schema v3.1: https://schema.datacite.org/meta/kernel-3.1/index.html
Dataset package fields and CKAN config settings are mapped to the DataCite Schema as follows:
CKAN Dataset Field | DataCite Schema |
---|---|
dataset:title | title |
dataset:creator | author |
config:ckanext.doi.publisher | publisher |
dataset:notes | description |
resource formats | format |
dataset:tags | subject |
dataset:licence (title) | rights |
dataset:version | version |
dataset:extras spatial | geo_box |
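To make the mapping concrete, here is a minimal sketch that builds a metadata dict from a dataset dict and a config dict. The function name `build_metadata_dict` is hypothetical, and the dictionary keys simply follow the table as written; the keys `creator`, `license_title`, `tags[].name` and `resources[].format` are assumptions about where your schema stores those values. The `geo_box` mapping is left out because its extras format is not described here.

```python
def build_metadata_dict(pkg_dict, config):
    """Map a CKAN dataset dict onto DataCite-style fields, following the table."""
    # Collect the distinct, non-empty formats of the dataset's resources.
    formats = sorted({
        res['format']
        for res in pkg_dict.get('resources', [])
        if res.get('format')
    })
    return {
        'title': pkg_dict.get('title'),
        # Assumption: the creator is stored under 'creator'; adjust as needed.
        'author': pkg_dict.get('creator'),
        'publisher': config.get('ckanext.doi.publisher'),
        'description': pkg_dict.get('notes'),
        'format': formats,
        'subject': [tag['name'] for tag in pkg_dict.get('tags', [])],
        'rights': pkg_dict.get('license_title'),
        'version': pkg_dict.get('version'),
    }


example = build_metadata_dict(
    {'title': 'Example', 'creator': 'A. Author', 'notes': 'A description',
     'tags': [{'name': 'botany'}], 'resources': [{'format': 'CSV'}]},
    {'ckanext.doi.publisher': 'Natural History Museum'},
)
print(example['publisher'])  # Natural History Museum
```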
DataCite title and author are mandatory metadata fields, so dataset title and creator fields are made required fields. This has been implemented in the theme layer, with another check in IPackageController.after_update, which raises a DOIMetadataException if the title or author fields do not exist.
It is recommended that plugins implementing DOIs add further validation checks to their own schema.
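A check of this kind might look like the sketch below. It is an illustration only: the `DOIMetadataException` class defined here is a local stand-in for the extension's exception of the same name, the helper `validate_mandatory_fields` is hypothetical, and treating empty values as missing is an assumption that goes slightly beyond "do not exist".

```python
class DOIMetadataException(Exception):
    """Local stand-in for the extension's exception of the same name."""


def validate_mandatory_fields(metadata, required=('title', 'author')):
    """Raise DOIMetadataException if a mandatory DataCite field is missing."""
    # Assumption: an empty string or None counts as missing too.
    missing = [field for field in required if not metadata.get(field)]
    if missing:
        raise DOIMetadataException(
            'Missing DataCite mandatory field(s): ' + ', '.join(missing)
        )


validate_mandatory_fields({'title': 'Example', 'author': 'A. Author'})  # passes silently
```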
Path variables used below:
- $INSTALL_FOLDER (i.e. where CKAN is installed), e.g. /usr/lib/ckan/default
- $CONFIG_FILE, e.g. /etc/ckan/default/development.ini
- Clone the repository into the src folder:
cd $INSTALL_FOLDER/src
git clone https://github.com/NaturalHistoryMuseum/ckanext-doi.git
- Activate the virtual env:
. $INSTALL_FOLDER/bin/activate
- Install the requirements from requirements.txt:
cd $INSTALL_FOLDER/src/ckanext-doi
pip install -r requirements.txt
- Run setup.py:
cd $INSTALL_FOLDER/src/ckanext-doi
python setup.py develop
- Add 'doi' to the list of plugins in your $CONFIG_FILE:
ckan.plugins = ... doi
There are a number of options that can be specified in your .ini config file.
The account name, password and prefix will be given to you by your DataCite provider.
ckanext.doi.account_name = DATACITE-ACCOUNT-NAME
ckanext.doi.account_password = DATACITE-ACCOUNT-PASSWORD
ckanext.doi.prefix = DATACITE-PREFIX
You also need to provide the name of the institution publishing the DOIs (e.g. Natural History Museum).
ckanext.doi.publisher = PUBLISHING-INSTITUTION
If test mode is set to true, the DOIs will use the DataCite test prefix 10.5072.
ckanext.doi.test_mode = True or False
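The sketch below shows one way these settings could be read and how test mode changes the prefix. It works on a plain dict rather than CKAN's config object, and the names `as_bool`, `effective_prefix` and `REQUIRED` are hypothetical. Whether the extension itself rejects a missing option in this way is an assumption.

```python
TEST_PREFIX = '10.5072'  # DataCite test prefix, as stated above

REQUIRED = (
    'ckanext.doi.account_name',
    'ckanext.doi.account_password',
    'ckanext.doi.prefix',
    'ckanext.doi.publisher',
)


def as_bool(value):
    """Interpret .ini-style strings such as 'True' or 'false' as booleans."""
    return str(value).strip().lower() in ('true', 'yes', 'on', '1')


def effective_prefix(config):
    """Return the DOI prefix to use, switching to the test prefix in test mode."""
    missing = [key for key in REQUIRED if not config.get(key)]
    if missing:
        raise ValueError('Missing config option(s): ' + ', '.join(missing))
    if as_bool(config.get('ckanext.doi.test_mode', False)):
        return TEST_PREFIX
    return config['ckanext.doi.prefix']
```

Values read from an .ini file arrive as strings, which is why the test mode flag is parsed rather than compared to `True` directly.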
Name | Description | Default |
---|---|---|
ckanext.doi.site_url | Used to build the link back to the dataset | ckan.site_url |
ckanext.doi.site_title | Site title to use in the citation | |
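The default for ckanext.doi.site_url amounts to a fallback lookup, sketched below on a plain dict. The helper name `doi_site_url` is hypothetical, and the example URLs are placeholders.

```python
def doi_site_url(config):
    """Return ckanext.doi.site_url if set, otherwise fall back to ckan.site_url."""
    return config.get('ckanext.doi.site_url') or config.get('ckan.site_url')


print(doi_site_url({'ckan.site_url': 'http://localhost:5000'}))
# http://localhost:5000
```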
This extension will only work if you have signed up for an account with DataCite.
You will need a development/test account to use this plugin in test mode, and a live account to mint active DOIs.
delete-tests: delete all test DOIs.
paster --plugin=ckanext-doi doi delete-tests -c $CONFIG_FILE
This plugin implements a build_metadata interface, so the metadata can be customised. See ckanext-nhm for an implementation of this interface.
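The interface's exact class name and method signature are not given here, so the following is only a shape sketch under stated assumptions: it supposes that build_metadata receives the dataset dict and the metadata dict built so far and returns the adjusted metadata. The class `ExampleDoiCustomiser` and the `contributors` field are hypothetical; check the extension's interface definition and ckanext-nhm for the real contract before copying this.

```python
class ExampleDoiCustomiser(object):
    """Hypothetical customiser; not the extension's actual interface class."""

    def build_metadata(self, pkg_dict, metadata_dict):
        # Assumed signature: (dataset dict, metadata dict) -> metadata dict.
        # Work on a copy so the caller's dict is left untouched.
        metadata = dict(metadata_dict)
        # Hypothetical custom field: copy contributors across when present.
        contributors = pkg_dict.get('contributors')
        if contributors:
            metadata['contributor'] = contributors
        return metadata
```

In a real plugin this method would live on a class that declares the extension's interface through CKAN's plugin system.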
{% snippet "doi/snippets/package_citation.html", pkg_dict=g.pkg_dict %}
{% snippet "doi/snippets/resource_citation.html", pkg_dict=g.pkg_dict, res=res %}
Test coverage is currently extremely limited.
To run the tests, use nosetests inside your virtualenv. The --nocapture flag will allow you to see the debug statements.
nosetests --ckan --with-pylons=$TEST_CONFIG_FILE --where=$INSTALL_FOLDER/src/ckanext-doi --nologcapture --nocapture