Globus Data Transfer Service

Globus is an offline service used by ESS-DIVE to upload large data files to the ESS-DIVE repository.
ESS-DIVE can upload a wide variety of data files to dataset publications via our web interface or Dataset API. For large data files or for anyone experiencing issues uploading small data files, the Globus data transfer service can be used as an alternative upload method.
The Globus transfer service is set up offline with close assistance from the ESS-DIVE Team. Data contributors should get in touch with ESS-DIVE Support at [email protected] to begin the process of publishing data with Globus.
Globus should be used only when data cannot be uploaded with the web interface or the Dataset API. Contact the ESS-DIVE Support Team to learn if your data is suitable for Globus upload.

What is Globus

Globus is a free, cloud-based data transfer service designed to move significant amounts of data. ESS-DIVE uses this service to move data from your local desktop or existing Globus endpoint to ESS-DIVE's storage services.

Why use Globus

ESS-DIVE began supporting Globus uploads to address two common use cases that our data contributors encounter: uploading data files greater than 500GB and resolving upload errors for smaller files. Neither use case is readily supported by ESS-DIVE's other upload methods.

Upload data greater than 500GB

Data volumes that exceed the maximum file upload limits of the web submission form (<10GB) and the Dataset API (<500GB) can be uploaded to ESS-DIVE using Globus. Certain data types, such as sensor data, model data, and remote sensing data, commonly produce large volumes of data and require Globus support to archive. Learn more about the file upload limits of ESS-DIVE's submission tools on the Get Started page.

Resolve upload errors for small files

Additionally, Globus can be used to resolve common upload errors with relatively small data volumes (10-500GB). If your data is less than 500GB and you've encountered any of the following issues, you may be interested in using Globus:
  • An unstable internet connection causes the upload to time out, error, or otherwise fail
  • You have many files that are too large to upload all at once, but uploading them in tens of batches is too tedious
  • The data is too large to download from its storage location onto the data manager's local system for upload via the web interface
  • Available local memory for upload is smaller than the data volume stored locally
  • You are unfamiliar with scripting against application programming interfaces (APIs)
  • Your cloud storage service provider's API and tools aren't accessible to help you move data
Of the above reasons, an unstable internet connection is the most likely cause of large data upload failures.
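
The size thresholds and error cases described above can be summarized in a short Python sketch. This is only an illustration, not an ESS-DIVE tool: the function name `suggest_upload_method`, the `had_upload_errors` flag, the reading of "GB" as 10^9 bytes, and the treatment of exactly 10GB or 500GB as belonging to the next method up are all assumptions made for the example.

```python
# Illustrative sketch only; not part of any ESS-DIVE tool or API.
# Limits come from the text above: web form (<10GB), Dataset API (<500GB).
GB = 10**9  # assumption: decimal gigabytes
WEB_FORM_LIMIT_BYTES = 10 * GB
DATASET_API_LIMIT_BYTES = 500 * GB


def suggest_upload_method(total_bytes: int, had_upload_errors: bool = False) -> str:
    """Suggest an upload method for a dataset of the given total size.

    `had_upload_errors` is a hypothetical flag meaning that earlier attempts
    with the web interface or Dataset API failed (for example, timeouts on
    an unstable connection).
    """
    if total_bytes < 0:
        raise ValueError("total_bytes must be non-negative")
    # At or above the Dataset API limit, Globus is the remaining option.
    if total_bytes >= DATASET_API_LIMIT_BYTES:
        return "Globus"
    # Smaller data that keeps failing to upload may also qualify for Globus.
    if had_upload_errors:
        return "Globus"
    if total_bytes < WEB_FORM_LIMIT_BYTES:
        return "web interface"
    return "Dataset API"


if __name__ == "__main__":
    for size_gb, errors in [(2, False), (50, False), (50, True), (800, False)]:
        method = suggest_upload_method(size_gb * GB, errors)
        print(f"{size_gb}GB, prior errors={errors}: {method}")
```

A result of "Globus" here only signals that it is worth contacting the ESS-DIVE Support Team, who confirm whether your data is suitable for Globus upload.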

How to use Globus

In order to upload data using Globus, you will need to first:
  1. Create a dataset on ESS-DIVE and provide required dataset metadata
  2. Create a Globus account using your ORCID or institutional login (if applicable)
Detailed information on these steps is provided in ESS-DIVE's Globus instructions. Head to our documentation page on setting up Globus to get started.

Where is data stored after Globus?

All data uploaded with Globus will be published on ESS-DIVE. Data can be stored for public access on either ESS-DIVE's Tier 1 or Tier 2 storage services. Tier 1 is the primary, underlying storage capability at ESS-DIVE where data and metadata can be accessed via dataset landing pages on https://data.ess-dive.lbl.gov/. Tier 2 is ESS-DIVE's extended storage service for large (>500GB) and/or hierarchical data files and is not used for storing dataset metadata.
Before the Globus upload process begins, the ESS-DIVE Team will help determine whether your data should be published on ESS-DIVE's Tier 1 or Tier 2 data storage. To learn more about ESS-DIVE's tiers of storage, how they differ, and how ESS-DIVE stores large data, see our Large Data Support documentation.