Code Examples

This page provides quick coding examples for searching for and downloading datasets on ESS-DIVE using Python, R, and Java.

Setup

The following code examples require installation of certain packages and authentication from your ESS-DIVE account. Follow the instructions for setting up the Dataset API for your preferred coding language before trying out the code examples.

Search for Datasets

Anyone can search for public datasets on ESS-DIVE using the Dataset API. If you are registered to submit data, you can also search for your private datasets. Narrow your search by defining query parameters.

Limited dataset metadata are returned in the response of this call. Additionally, this call cannot be used to download data files. To look up all dataset metadata and download data files, use the API to Download Datasets.

👉 Review parameters and responses for this call in the Dataset API technical documentation.

The following lines of code will return the most recent 25 records. If the results contain more than 25 packages, use the row_start and page_size query parameters to page through the results.

For data users: Pass isPublic=true to search for public datasets. This call searches for public datasets by default, so the parameter can be omitted.

For data contributors: Pass isPublic=false to search for your private datasets.
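The paging described above can be sketched in Python as follows. This is a minimal sketch rather than an official client: the helper name paged_search_urls, the placeholder base URL, and the assumption that rows are numbered starting at 1 are illustrative only. Check the Dataset API technical documentation for the actual numbering before relying on it.

```python
from urllib.parse import urlencode


def paged_search_urls(base, endpoint, total, page_size=25, first_row=1, **filters):
    """Yield one search URL per page of results.

    first_row is an ASSUMPTION about how the API numbers rows; adjust it
    if the technical documentation says otherwise.
    """
    for row_start in range(first_row, first_row + total, page_size):
        # Merge any search filters with the two paging parameters
        params = dict(filters, row_start=row_start, page_size=page_size)
        yield "{}/{}?{}".format(base, endpoint, urlencode(params))


# Hypothetical base URL and result count, for illustration only
for url in paged_search_urls("https://api.example.org", "packages", total=60, isPublic="true"):
    print(url)
```

Each generated URL can be passed to requests.get with the same Authorization header used in the search examples on this page.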

Python Search Example

⭐ Make sure you have followed the Setup instructions before running this code

Search for datasets using any of the available query parameters. The code example below demonstrates how to format the search query with all parameters. It also demonstrates how to search for one parameter, in this case providerName or the publishing project.

# Enter parameters
creator = "creator/submitter of datasets"
providerName = "\"Next-Generation Ecosystem Experiments (NGEE) Arctic\""  # Use exact match formatting to return datasets from only one project
text = "any text"
datePublished = "[YYYY-MM-DD]"
keywords = "Partial Match Keywords"  # OR "\"Exact Match Keywords\""

# Construct query URL
get_packages_url = "{}/{}?creator=\"{}\"&providerName=\"{}\"&text=\"{}\"&datePublished=\"{}\"&keywords=\"{}\"&isPublic=true".format(base,endpoint,creator,providerName,text,datePublished,keywords)

## Not interested in using all the parameters? Just remove them from the query like so
## This example only uses the "providerName" parameter:
# get_packages_url = "{}/{}?providerName=\"{}\"&isPublic=true".format(base,endpoint,providerName)

# Call GET /packages
get_packages_response = requests.get(get_packages_url, 
    headers={"Authorization":header_authorization})

if get_packages_response.status_code == 200:
   #Success
   print(get_packages_response.json())
else:
   # There was an error
   print(get_packages_response.text)
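Once the search succeeds, the JSON can be reduced to the fields you care about. The sketch below assumes the response carries a "result" list whose entries have an "id" and a "dataset" object with a "name", as the R example on this page suggests. The helper name summarize_results and the sample data are made up for illustration.

```python
def summarize_results(search_json):
    """Return (id, name) pairs from a search response dictionary."""
    return [
        (record.get("id"), record.get("dataset", {}).get("name"))
        for record in search_json.get("result", [])
    ]


# Made-up response, used only to illustrate the ASSUMED shape
sample = {
    "result": [
        {
            "id": "ess-dive-0f0348396e46261-20181022T131245032205",
            "dataset": {"name": "Example dataset"},
        }
    ]
}

for dataset_id, name in summarize_results(sample):
    print(dataset_id, name)
```

In practice, pass get_packages_response.json() instead of the sample dictionary.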

R Search Example

⭐ Make sure you have followed the Setup instructions before running this code

Search for datasets using any of the available query parameters. At this time, this code example does not demonstrate how to format the query parameters. Instead, it searches all public datasets on ESS-DIVE with no filters applied.

# Construct query URL
call_get_packages <- paste(base,endpoint, sep="/")

# Call GET /packages
get_packages = GET(call_get_packages,
       add_headers(Authorization=header_authorization))

# Transform the result into a data frame. (Ignore the warning)
get_packages_text <- content(get_packages, "text")
get_packages_json <- fromJSON(get_packages_text)
get_packages_df <- as.data.frame(get_packages_json)

# Check for errors and view the data frame on success
if(!http_error(get_packages) ){
  # Print the returned columns
  print(colnames(get_packages_df))

  # Print the ESS-DIVE IDs
  print(get_packages_df['result.id'])

  # Iterate over the dataset column and print the data package name
  for ( d in get_packages_df['result.dataset']) { print(d['name']) }
}else {
  http_status(get_packages)
}

Java Search Example

⭐ Make sure you have followed the Setup instructions before running this code

  try{
    String url = base + endpoint;
    HttpGet request = new HttpGet(url); // A GET request carries no body, so no JSON-LD entity is needed
    request.addHeader("content-type", "application/json");
    request.addHeader("Authorization", header_authorization);
    
    HttpResponse response;
    response = httpClient.execute(request);
    
    HttpEntity entity = response.getEntity();
    String responseString = EntityUtils.toString(entity, "UTF-8");
    
    if(response.getStatusLine().getStatusCode() == 200){
      System.out.println(response.toString());
      System.out.println(responseString);
    } else {
      System.out.println(response.getStatusLine().getReasonPhrase());
      System.out.println(response.toString());
      System.out.println(responseString);
    }
  } catch (Exception ex) {
    System.out.println(ex.getMessage());
  }

Download Dataset

Anyone can look up and download individual public datasets on ESS-DIVE using the Dataset API. If you are registered to submit data, you can also download your private dataset metadata.

This call will look up one dataset and return complete dataset metadata and file details. Metadata and data files can then be downloaded using standard request packages.

If you'd like to look up the dataset upload date, last modified date, or dataset access status, use the API to Search for Datasets.

👉 Review responses for this call in the Dataset API technical documentation.

Python Download Example

⭐ Make sure you have followed the Setup instructions before running this code

# ESS-DIVE Identifiers are in the format of: ess-dive-0f0348396e46261-20181022T131245032205
dataset_id = "<Enter an ESS-DIVE Identifier here>"

Download Metadata

# Send request (URL built the same way as in the search example: base/endpoint/id)
get_package_url = "{}/{}/{}?isPublic=true".format(base,endpoint, dataset_id)

get_package_response = requests.get(get_package_url, 
    headers={"Authorization":header_authorization})

if get_package_response.status_code == 200:
   #Success
   print(get_package_response.json())
else:
   # There was an error
   print(get_package_response.text)

Download Data Files

from pathlib import Path
from urllib.request import urlretrieve

# Use the JSON message from get_package_response to grab dataset details
dataset_detail = get_package_response.json()
# Per the sample JSON on this page, "distribution" is a list nested under "dataset"
distribution = dataset_detail["dataset"]["distribution"]

# Define where you want to download the files locally
local_dir = Path("local_dir")
local_dir.mkdir(parents=True, exist_ok=True)

# Download each file in the dataset locally
for dist in distribution:
    urlretrieve(dist["contentUrl"], local_dir / dist["name"])

# Get the dataset citation
citation = dataset_detail.get('citation')

R Download Example

⭐ Make sure you have followed the Setup instructions before running this code

# ESS-DIVE Identifiers are in the format of: ess-dive-0f0348396e46261-20181022T131245032205
id <- "<Place an ESS-DIVE identifier here>"

Download Metadata

# Send request
call_get_package <- paste(base,endpoint,id, sep="/")
get_package = GET(call_get_package,
    add_headers(Authorization=header_authorization))

# Parse the JSON result. (Ignore the warning message)
get_package_text <- content(get_package, "text")
get_package_json <- fromJSON(get_package_text)

# Check for errors and view the parsed result on success
if(!http_error(get_package) ){
  print(get_package_json)
}else {
  http_status(get_package)
}

Download Data Files

The Dataset API response provides direct URLs that point to the data files. To download a data file, grab its contentUrl and download it locally using a standard request or urlretrieve package. The data file URLs are stored in the JSON as such:

{
  "dataset": {
    "distribution": [
      {
        "contentSize": 8.958984375,
        "contentUrl": "https://data.ess-dive.lbl.gov/catalog/d1/mn/v2/object/ess-dive-b8f2258b6f49e86-20210428T053005824019",
        "encodingFormat": "eml://ecoinformatics.org/eml-2.1.1",
        "identifier": "ess-dive-b8f2258b6f49e86-20210428T053005824019",
        "name": "SPRUCE_S1_Bog_Environmental_Monitoring_Data_2010_2016.xml"
      }
    ]
  }
}
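The file URLs can be pulled out of that structure with a few lines of code. The sketch below is written in Python and assumes the response matches the sample JSON above, with "distribution" as a list nested under "dataset". The helper name list_data_files is hypothetical.

```python
def list_data_files(package_json):
    """Return (name, contentUrl) pairs for every file in a dataset response."""
    distribution = package_json.get("dataset", {}).get("distribution", [])
    return [(entry["name"], entry["contentUrl"]) for entry in distribution]


# Trimmed copy of the sample JSON shown above
sample = {
    "dataset": {
        "distribution": [
            {
                "contentUrl": "https://data.ess-dive.lbl.gov/catalog/d1/mn/v2/object/ess-dive-b8f2258b6f49e86-20210428T053005824019",
                "name": "SPRUCE_S1_Bog_Environmental_Monitoring_Data_2010_2016.xml",
            }
        ]
    }
}

for name, url in list_data_files(sample):
    print(name, url)
```

Each URL can then be handed to a download function such as urllib.request.urlretrieve, using the file name as the local target.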

Java Download Example

⭐ Make sure you have followed the Setup instructions before running this code

Download Metadata

  // ESS-DIVE Identifiers are in the format of: ess-dive-0f0348396e46261-20181022T131245032205
  
  try{
    // Replace `String id` with your dataset's ESS-DIVE Identifier
    String id = "<Enter an ESS-DIVE Identifier here>";
    
    String url = base + endpoint + "/" + id; // URLs always use "/", while File.separator is "\" on Windows
    HttpGet request = new HttpGet(url); // A GET request carries no body, so no JSON-LD entity is needed
    request.addHeader("content-type", "application/json");
    request.addHeader("Authorization", header_authorization);

    HttpResponse response;
    response = httpClient.execute(request);

    HttpEntity entity = response.getEntity();
    String responseString = EntityUtils.toString(entity, "UTF-8");

    if(response.getStatusLine().getStatusCode() == 200){
      System.out.println(response.toString());
      System.out.println(responseString);
    } else {
      System.out.println(response.getStatusLine().getReasonPhrase());
      System.out.println(response.toString());
      System.out.println(responseString);
    }
  } catch (Exception ex) {
    System.out.println(ex.getMessage());
  }

Download Data Files

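The Dataset API response provides direct URLs that point to the data files. To download a data file, grab its contentUrl and download it locally using a standard request package. The data file URLs are stored in the JSON as such: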

{
  "dataset": {
    "distribution": [
      {
        "contentSize": 8.958984375,
        "contentUrl": "https://data.ess-dive.lbl.gov/catalog/d1/mn/v2/object/ess-dive-b8f2258b6f49e86-20210428T053005824019",
        "encodingFormat": "eml://ecoinformatics.org/eml-2.1.1",
        "identifier": "ess-dive-b8f2258b6f49e86-20210428T053005824019",
        "name": "SPRUCE_S1_Bog_Environmental_Monitoring_Data_2010_2016.xml"
      }
    ]
  }
}

