Code Examples

This page provides quick coding examples for searching for datasets on ESS-DIVE using Python, R, and Java.

Setup

The following code examples require installing certain packages and authenticating with your ESS-DIVE account. Before trying the search code examples, follow the Dataset API setup instructions for your preferred coding language.


Search for Datasets

Anyone can search for public datasets on ESS-DIVE using the Dataset API. If you are registered to submit data, you can also search for your private datasets. Narrow a search by setting query parameters.

Limited dataset metadata are returned in the response of this call. Additionally, this call cannot be used to download data files. To look up all dataset metadata and download data files, use the API to Download Datasets.

👉 Review parameters and responses for this call in the Dataset API technical documentation.

The code below returns the 25 most recent records. If a search matches more than 25 datasets, use the row_start and page_size query parameters to page through the results.

Python Search Example

Make sure you have followed the Setup instructions before running this code.

Search for datasets using any of the available query parameters. The code example below shows how to format a search query that uses all of the parameters. It also shows how to search on a single parameter, in this case providerName (the publishing project).

import requests

# base, endpoint, and header_authorization are defined in the Setup instructions

# Enter parameters
creator = "creator/submitter of datasets"
providerName = "\"Next-Generation Ecosystem Experiments (NGEE) Arctic\""  # Use exact match formatting to return datasets from only one project
text = "any text"
datePublished = "[YYYY-MM-DD]"
keywords = "Partial Match Keywords"  # OR "\"Exact Match Keywords\""

# Construct query URL
get_packages_url = "{}/{}?creator=\"{}\"&providerName=\"{}\"&text=\"{}\"&datePublished=\"{}\"&keywords=\"{}\"&isPublic=true".format(base,endpoint,creator,providerName,text,datePublished,keywords)

## Not interested in using all the parameters? Just remove them from the query like so
## This example only uses the "providerName" parameter:
# get_packages_url = "{}/{}?providerName=\"{}\"&isPublic=true".format(base,endpoint,providerName)

# Call GET /packages
get_packages_response = requests.get(
    get_packages_url,
    headers={"Authorization": header_authorization})

if get_packages_response.status_code == 200:
    # Success
    print(get_packages_response.json())
else:
    # There was an error
    print(get_packages_response.text)
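
The paging described above can be sketched as follows. This is a minimal illustration, not part of the ESS-DIVE client code: build_search_url and page_urls are hypothetical helpers, the base URL is a placeholder, and the sketch assumes that row_start is 1-based and that you already know roughly how many records match. It builds the query string with urlencode from the standard library, which percent-encodes spaces and quotation marks in parameter values.

```python
from urllib.parse import urlencode


def build_search_url(base, endpoint, row_start=1, page_size=25, **filters):
    """Return a search URL for one page of results (hypothetical helper)."""
    # Drop empty filters so callers pass only the parameters they need
    query = {name: value for name, value in filters.items() if value}
    query["row_start"] = row_start
    query["page_size"] = page_size
    return "{}/{}?{}".format(base.rstrip("/"), endpoint.strip("/"), urlencode(query))


def page_urls(base, endpoint, total_records, page_size=25, **filters):
    """Yield one URL per page needed to cover total_records results."""
    # Assumption: row_start counts from 1, so pages start at 1, 26, 51, ...
    for row_start in range(1, total_records + 1, page_size):
        yield build_search_url(base, endpoint, row_start, page_size, **filters)


# Placeholder base URL; use the value from the Setup instructions instead
for url in page_urls("https://api.example.org", "packages", 60, text="permafrost"):
    print(url)
```

Each generated URL can be passed to requests.get with the same Authorization header used in the search example.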

Download Dataset

Anyone can look up individual public datasets on ESS-DIVE using the Dataset API. If you are registered to submit data, you can also download metadata for your private datasets.

This call will look up one dataset and return complete dataset metadata and file details. Metadata and data files can then be downloaded using standard request packages.

If you'd like to look up the dataset upload date, last modified date, or dataset access status, use the API to Search for Datasets.

👉 Review responses for this call in the Dataset API technical documentation.

Python Download Example

Make sure you have followed the Setup instructions before running this code.

# ESS-DIVE Identifiers are in the format of: ess-dive-0f0348396e46261-20181022T131245032205
dataset_id = "<Enter an ESS-DIVE Identifier here>"

Download Metadata

# Send request (base, endpoint, and header_authorization come from Setup)
get_package_url = "{}/{}/{}?isPublic=true".format(base, endpoint, dataset_id)

get_package_response = requests.get(
    get_package_url,
    headers={"Authorization": header_authorization})

if get_package_response.status_code == 200:
    # Success
    print(get_package_response.json())
else:
    # There was an error
    print(get_package_response.text)
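
If you want to keep the metadata rather than only print it, one option is to write the JSON response to a local file. The sketch below is an illustration only: save_metadata is a hypothetical helper, and the file naming scheme (one .json file per dataset identifier) is an assumption, not an ESS-DIVE convention.

```python
import json
from pathlib import Path


def save_metadata(metadata, directory, dataset_id):
    """Write a metadata dictionary to <directory>/<dataset_id>.json and return the path."""
    target_dir = Path(directory)
    # Create the output folder if it does not exist yet
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / "{}.json".format(dataset_id)
    with target.open("w", encoding="utf-8") as handle:
        json.dump(metadata, handle, indent=2, ensure_ascii=False)
    return target
```

After a successful request, a call such as save_metadata(get_package_response.json(), "metadata", dataset_id) would store the record next to your code.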

Download Data Files

from pathlib import Path
from urllib.request import urlretrieve

# Use the JSON message from get_package_response to grab dataset details
dataset_detail = get_package_response.json()
dist = dataset_detail.get('distribution')
file_url = dist.get('contentUrl')
fn = dist.get('name')

# Define where you want to download the file locally
local_dir = Path("local_dir")
local_dir.mkdir(parents=True, exist_ok=True)  # Create the folder if it does not exist
file_path = local_dir / fn

# Download the dataset locally
urlretrieve(file_url, file_path)

# Get the dataset citation
citation = dataset_detail.get('citation')
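
A dataset may contain more than one file. The sketch below shows one defensive way to read the distribution entry before downloading. It is an illustration under stated assumptions: list_files and local_path are hypothetical helpers, and the code assumes distribution may be either a single object or a list of objects, each with the name and contentUrl fields used above. Check the Dataset API technical documentation for the actual response shape.

```python
from pathlib import Path


def list_files(dataset_detail):
    """Return (name, url) pairs from the 'distribution' entry of a dataset record."""
    dist = dataset_detail.get("distribution") or []
    # Assumption: 'distribution' may be a single dict or a list of dicts
    if isinstance(dist, dict):
        dist = [dist]
    return [
        (item.get("name"), item.get("contentUrl"))
        for item in dist
        if item.get("name") and item.get("contentUrl")
    ]


def local_path(local_dir, file_name):
    """Build a download path inside local_dir, ignoring any folders in file_name."""
    return Path(local_dir) / Path(file_name).name
```

Each pair can then be downloaded with urlretrieve(url, local_path(local_dir, name)), as in the example above.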
