ESS-DIVE Dataset API

The ESS-DIVE Dataset API is a service that enables projects to programmatically submit and manage datasets with ESS-DIVE. Scripts that use the Dataset API can be written in R, Python, or Java.

Dataset API Capabilities

  • Search & Download one specific dataset

  • Search & Download many public datasets

  • Search & Download many private datasets

  • Submit datasets (<10GB)

  • Submit datasets (>10GB)

  • Stream dataset submissions (Python only)

  • Collaborate on datasets (Python only)

  • Add External Links to datasets (Python only)

  • Update datasets

For additional information about the API, see the technical documentation at https://api-sandbox.ess-dive.lbl.gov.

Getting started with ESS-DIVE Dataset API (Package Service API version 1)

The ESS-DIVE Dataset API is a service that enables projects to programmatically submit and manage datasets with ESS-DIVE, as an alternative to using the ESS-DIVE online form for data uploads. The service encodes metadata using the JSON-LD specification. JSON-LD is a format for encoding linked data in JSON, and in the future it will be used by Google to index metadata for searches. Using the standardized JSON-LD schema will increase the visibility of datasets, and it also lets projects write code once and reuse it for periodic uploads of datasets to ESS-DIVE.

Current maximum upload limit: 500 GB per upload attempt per dataset.* Please contact ess-dive-support@lbl.gov to submit datasets above this limit.

* That is, if you upload multiple files at once to one dataset, their cumulative size can be up to 500 GB. If you upload one file to one dataset, that file can be up to 500 GB.
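Because the limit applies to the cumulative size of one upload attempt, it can help to check the total before submitting. The following is a minimal sketch only: the function names are hypothetical, and it assumes "500 GB" means decimal gigabytes (10**9 bytes each), which the text above does not specify.

```python
from pathlib import Path

# Assumption: "500 GB" is read as decimal gigabytes (10**9 bytes each).
MAX_UPLOAD_BYTES = 500 * 10**9


def total_upload_bytes(paths):
    """Return the combined size in bytes of all files in one upload attempt."""
    return sum(Path(p).stat().st_size for p in paths)


def within_upload_limit(paths, limit_bytes=MAX_UPLOAD_BYTES):
    """Return True if the cumulative file size does not exceed the limit."""
    return total_upload_bytes(paths) <= limit_bytes
```

Call within_upload_limit with the list of files intended for a single dataset upload; if it returns False, contact ess-dive-support@lbl.gov before submitting.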

The ESS-DIVE Dataset API allows you to test JSON-LD dataset submissions against ESS-DIVE’s sandbox instance and check whether metadata are mapped correctly onto ESS-DIVE’s dataset metadata schema. The Dataset API checks the mapping by validating the JSON-LD submission; if the JSON-LD is invalid, it returns details about the errors.

Dataset metadata refers to the top-level metadata that makes a dataset “discoverable” in search results. Examples of top-level metadata include the title, abstract, authors, variables, and keywords. File-level metadata, such as metadata that describe the data file structure or the variables within a file, are not included in this service.
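To make the idea of top-level metadata concrete, here is a minimal sketch of a JSON-LD document built in Python. It is an illustration, not the authoritative schema: the property names come from the schema.org Dataset vocabulary and are assumed to correspond to the fields listed above, the function name is hypothetical, and all values are placeholders. The Metadata Submission Guide remains the reference for which fields are required and how they are spelled.

```python
import json


def build_dataset_jsonld(title, abstract, authors, variables, keywords):
    """Assemble top-level dataset metadata as a JSON-LD dictionary.

    Assumption: schema.org property names (name, description, creator,
    variableMeasured, keywords) stand in for title, abstract, authors,
    variables, and keywords.
    """
    return {
        "@context": "http://schema.org/",
        "@type": "Dataset",
        "name": title,
        "description": abstract,
        "creator": [
            {"@type": "Person", "givenName": given, "familyName": family}
            for given, family in authors
        ],
        "variableMeasured": list(variables),
        "keywords": list(keywords),
    }


# Placeholder values for illustration only.
example = build_dataset_jsonld(
    title="Example soil moisture dataset",
    abstract="Placeholder abstract describing the dataset.",
    authors=[("Jane", "Doe")],
    variables=["soil moisture"],
    keywords=["soil", "hydrology"],
)
print(json.dumps(example, indent=2))
```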

See the Metadata Submission Guide for descriptions, expectations, & JSON-LD equivalents for each metadata field.

Working code examples are available in ESS-DIVE's Dataset API repository (essdive-package-service-examples).

Provide feedback on this service to ess-dive-support@lbl.gov.

Get Authentication Token

Anonymous Search & Download

  1. Sign in with ORCID.

  2. Click your name in the upper right-hand corner and select My Profile (Figure 1).

  3. Click Settings > Authentication Token (Figure 2).

  4. Scroll down and click Copy on the “Token” tab to copy your authentication token (Figure 2).

Submit & Manage Datasets

  1. If you are not already registered to submit data with ESS-DIVE, follow the steps on the New Contributor Registration guide.

  2. Sign in with ORCID.

  3. Click your name in the upper right-hand corner and select My Profile (Figure 1).

  4. Click Settings > Authentication Token (Figure 2).

  5. Scroll down and click Copy on the “Token” tab to copy your authentication token (Figure 2).
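Once the token is copied, a script needs to present it with each request. The sketch below shows one way to do that without hard-coding the token in source files. It rests on two assumptions that should be checked against the technical documentation: that the token is sent in an Authorization header using the "bearer" scheme, and that the token is stored in an environment variable, here given the hypothetical name ESSDIVE_TOKEN.

```python
import os


def auth_header(token=None, env_var="ESSDIVE_TOKEN"):
    """Build an Authorization header from a token or an environment variable.

    Assumptions: the "bearer" scheme, and the hypothetical variable name
    ESSDIVE_TOKEN; neither is confirmed by the text above.
    """
    token = (token or os.environ.get(env_var, "")).strip()
    if not token:
        raise ValueError(f"No token given and {env_var} is not set")
    return {"Authorization": f"bearer {token}"}
```

Keeping the token in the environment rather than in the script makes it easier to share code without sharing credentials.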

Prepare and Test

The ESS-DIVE Dataset API is a REST API, which allows you to interact with it through the tools of your choice. You can start learning the expected metadata schema at https://api-sandbox.ess-dive.lbl.gov. That documentation helps you get familiar with the interface and understand the available operations, the possible errors, and the structure of the JSON that the API produces and expects.

You can also review the expected JSON-LD structure in the example JSON-LD files in the ESS-DIVE coding examples GitHub repository.

To review the expectations for submitted metadata, refer to our Package Level Metadata Guide.

After you have familiarized yourself with the dataset REST API, you can use the programming language of your choice to submit JSON-LD metadata to be validated. Two of the most common programming languages in the ESS space are Python and R. Either of these languages can be used to write scripts for submitting dataset metadata to be validated.

  • Python: dataset JSON-LD metadata can be submitted using the requests module

  • R: dataset metadata can be submitted using the httr and jsonlite packages

  • Java: dataset JSON-LD metadata can be submitted using Apache HttpComponents
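For the Python route, a submission to the sandbox for validation might look roughly like the sketch below. Treat it as an illustration under stated assumptions rather than the documented client: the "packages" endpoint path, the "bearer" Authorization scheme, and both function names are assumptions not given in this guide, so confirm them against the technical documentation. Building the request is kept separate from sending it so the request can be inspected without any network call.

```python
import json

SANDBOX_BASE = "https://api-sandbox.ess-dive.lbl.gov"


def build_submission(jsonld, token, base_url=SANDBOX_BASE, endpoint="packages"):
    """Return (url, headers, body) for a JSON-LD metadata submission.

    Assumptions: the "packages" endpoint and the "bearer" scheme.
    """
    url = f"{base_url.rstrip('/')}/{endpoint}"
    headers = {
        "Authorization": f"bearer {token}",
        "Content-Type": "application/json",
    }
    body = json.dumps(jsonld)
    return url, headers, body


def submit(jsonld, token, **kwargs):
    """Send the submission with the requests module and return the result."""
    import requests  # third-party; install with `pip install requests`

    url, headers, body = build_submission(jsonld, token, **kwargs)
    response = requests.post(url, headers=headers, data=body, timeout=60)
    # If the JSON-LD is invalid, the response text should carry the error details.
    return response.status_code, response.text
```

Calling submit(my_jsonld, my_token) sends the metadata to the sandbox by default; inspect the returned status code and text to see whether validation succeeded.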

Choose your preferred coding language:

  • R Example

  • Python Example

  • Java Example

Ready to Publish

Once you've familiarized yourself with ESS-DIVE's metadata and dataset API schema, use our production domain https://api.ess-dive.lbl.gov/ to submit datasets to ESS-DIVE for publishing and review.

Note that the https://data.ess-dive.lbl.gov/ domain is for data submission via the web user interface only.

Be sure to use ESS-DIVE's test instance, Sandbox (https://api-sandbox.ess-dive.lbl.gov), while building scripts and making corrections to JSON-LD dataset submissions. Only switch to ESS-DIVE's production instance (https://api.ess-dive.lbl.gov/) after your scripts are complete.
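One way to make the switch deliberate is to select the base URL through a single setting that defaults to the sandbox. This is a minimal sketch; the function and dictionary names are hypothetical, and only the two URLs come from this guide.

```python
API_BASE_URLS = {
    "sandbox": "https://api-sandbox.ess-dive.lbl.gov",
    "production": "https://api.ess-dive.lbl.gov",
}


def api_base_url(instance="sandbox"):
    """Return the base URL for the named instance, defaulting to the sandbox."""
    try:
        return API_BASE_URLS[instance]
    except KeyError:
        valid = ", ".join(sorted(API_BASE_URLS))
        raise ValueError(f"Unknown instance {instance!r}; expected one of: {valid}") from None
```

With this arrangement a script targets production only when "production" is passed explicitly, which reduces the chance of publishing a test submission by accident.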
