ESS-DIVE Documentation

Get Started

Use this page to help decide how you will upload your dataset to ESS-DIVE by considering file upload limits and dataset organization.



A condensed summary of our Data Submission Guidelines is available on our website. You can find more details throughout this Guide to Using ESS-DIVE.

Terms of Use and Licensing

By becoming an ESS-DIVE data contributor and submitting datasets, you agree to ESS-DIVE's Terms of Use.

Important items to review:

  1. Visit ESS-DIVE's Terms of Use and complete the ESS-DIVE Data Contributor Checklist (Part A).
  2. Review and agree to the data contributor license (Part B), and specify one of the standard ESS-DIVE data usage policies for serving data to the public.
  3. Review the Confidentiality and Ethics section (Part C); ensure that your dataset, including all metadata and data files, does not contain any controlled and prohibited information categories, and that you uphold ethical norms to safeguard sensitive information.

File Upload Limits

Use Table 1 to decide which submission tool is best suited for creating your dataset. The next section summarizes the tools in more detail.

ESS-DIVE has three tools for uploading data, and each tool limits the amount of data it can upload at one time. Consider uploading data files in batches if the total size of the data files in your dataset exceeds the upload limits listed in Table 1.

Submission Tool              | Total Upload Size Limit | Why Use This Tool?
Data Submission Form         | < 5 GB                  | Self-managed process using the ESS-DIVE data portal; easiest for managing small numbers of files and datasets.
Dataset API                  | < 5 GB                  | Self-managed process; the most time-efficient way to upload many data files at once (to one or multiple datasets) using programmatic tools.
Globus Data Transfer Service | > 5 GB                  | User-friendly web service for automated, high-performance transfers, including support for hierarchical folders and very large datasets.

Table 1: ESS-DIVE's submission tools and their upload limits.

Large Data Uploads

Dataset submissions with large file volumes can take additional time to upload and publish. Additionally, submission attempts will fail if your data files exceed ESS-DIVE's upload limits (Table 1) or your available computational resources. It is therefore important to consider these factors when choosing a submission tool.

ESS-DIVE has developed its Tier 2 data storage service to support very large, hierarchical datasets that can be accessed directly from the file system layer. ESS-DIVE uses Globus, a large data transfer service, to make it easier to upload and publish large data on ESS-DIVE. The Tier 2 and Globus services are set up offline with close assistance from the ESS-DIVE Team; data contributors should contact ESS-DIVE Support at ess-dive-support@lbl.gov to begin the process of publishing data with either service.

Visit our Large Data Support documentation for more information on ESS-DIVE's Tier 2 data storage service.

ESS-DIVE can store datasets with data volumes of more than a terabyte. If your project needs to upload more than a terabyte of data, ESS-DIVE will need to prepare to accommodate this. Contact ess-dive-support@lbl.gov to inform us of your needs.

Submission Tools

This section gives a brief overview of the submission tools available for creating and editing datasets on ESS-DIVE. For help troubleshooting common submission issues, visit ESS-DIVE's Frequently Asked Questions page.

Data Submission Form

The ESS-DIVE data submission web form is the easiest way to submit small datasets. For step-by-step instructions on how to create a dataset, refer to the Submit Data with Online Form guide and ESS-DIVE's Tutorial Videos. When completing the required metadata fields, use our Dataset Requirements guide, which was created using the NCEAS FAIR data standards to ensure our repository contains high-quality and useful data.

Dataset API

ESS-DIVE's Dataset API allows you to programmatically submit many datasets at once. Detailed tutorials and example code for dataset submissions with the API are provided in both the ESS-DIVE Dataset API guide and ESS-DIVE's API Examples GitHub repository. Example code is available in Python, Java, and R. Examples of the expected metadata schema are available at https://api.ess-dive.lbl.gov/.

Globus Data Transfer Service

Globus is a cloud-based data transfer service designed to move significant amounts of data, and it does not require writing code. Globus can help you upload data to ESS-DIVE, but it cannot be used to curate metadata; you will still need to create your dataset and curate metadata via the Data Submission Form or the Dataset API. When using Globus, you must work with the ESS-DIVE Team to complete data file uploads.

If you have immediate questions about uploading data using Globus, or you are encountering upload issues with data volumes less than 500 GB, contact the ESS-DIVE Support Team (ess-dive-support@lbl.gov) to discuss your upload options, including whether Globus is appropriate.

Organizing Your Dataset

Datasets on ESS-DIVE contain related data and metadata files. Each dataset should contain all the data and metadata a general user would need to understand and reuse the data.

Additionally, we ask that data contributors adhere to existing data standards or data reporting formats when applicable, to make the data stored in ESS-DIVE as useful as possible. To learn more, visit the Data Reporting Formats page in our guide.

All data generated in the scientific process may be worth preserving, including raw data, processed data that has gone through extensive QA/QC and transformations, and results of analyses. DataONE has a good summary of best practices for deciding what data to preserve (https://www.dataone.org/best-practices/decide-what-data-preserve).

Our general recommendation is to publish the data with the greatest potential to be scientifically useful to others, and to deep-archive the rest for reproducibility of the results. Consider that your data may be reused in other studies, and include enough descriptive information that others could understand your data in the future. The Digital Curation Centre (DCC) has a list of potential future purposes for data (http://www.dcc.ac.uk/resources/how-guides/five-steps-decide-what-data-keep#4), which may also help you determine what data to publish.

Publication Process

The ESS-DIVE publication process begins with a user gathering the files to include in a dataset and uploading them via the Data Submission Form or Dataset API. The user specifies metadata associated with the dataset, including author and citation information as well as related references. The user then submits the dataset, which saves the metadata and accompanying data files to ESS-DIVE as a private dataset. The dataset can be revised as frequently as needed. We recommend using the Automated Quality Reports to verify that your dataset will pass ESS-DIVE's criteria for publication.

When all revisions are complete, you can publish the dataset. Published data are made available through the ESS-DIVE and DataONE search catalogs, can be downloaded by the public, and can be modified after publication. Request publication by selecting the "Publish" button on the dataset landing page. After a publication request is received, the ESS-DIVE team reviews the dataset and assigns a unique Digital Object Identifier (DOI) if an existing DOI is not provided. The ESS-DIVE team will email you about any changes that need to be made before publication. Publication times range from a few days to a few weeks, depending on the complexity and quality of the dataset; incomplete submissions and slow response times will delay publication.
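As a minimal sketch of what a programmatic submission with the Dataset API can look like, the snippet below builds schema.org JSON-LD metadata and POSTs it to a `/packages` endpoint with a bearer token. This is an illustration, not the authoritative schema: the sandbox URL, endpoint path, and the exact required metadata fields are assumptions here; consult the ESS-DIVE Dataset API guide, the API Examples GitHub repository, and https://api.ess-dive.lbl.gov/ for the definitive schema and authentication steps.

```python
import json
import urllib.request

# Assumed sandbox endpoint; see the ESS-DIVE Dataset API docs for the
# authoritative URLs and how to obtain an authentication token.
BASE_URL = "https://api-sandbox.ess-dive.lbl.gov"


def build_metadata(name, description, authors):
    """Assemble schema.org JSON-LD dataset metadata.

    Minimal sketch only -- the full set of required fields is documented
    at https://api.ess-dive.lbl.gov/.
    """
    return {
        "@context": "http://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "creator": [
            {"@type": "Person", "givenName": given, "familyName": family}
            for given, family in authors
        ],
    }


def submit_dataset(metadata, token):
    """POST the metadata to create a private (unpublished) dataset."""
    request = urllib.request.Request(
        f"{BASE_URL}/packages",
        data=json.dumps(metadata).encode("utf-8"),
        headers={
            "Authorization": f"bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Build metadata locally; submit_dataset() would then send it to ESS-DIVE.
meta = build_metadata(
    "Example soil respiration dataset",
    "CO2 flux measurements from the 2023 field season.",
    [("Jane", "Doe")],
)
print(meta["creator"][0]["familyName"])  # -> Doe
```

After a successful submission, revising the dataset or uploading the data files themselves uses further API calls (or Globus for large volumes), as described in the Dataset API guide.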