Metadata Requirements
This page provides an overview of the minimum metadata required for publication on ESS-DIVE. These requirements are used by the ESS-DIVE team to review datasets before approval for publication.
ESS-DIVE’s dataset metadata requirements allow you to fully describe your dataset so that others can more easily find and use relevant data from dataset searches. Metadata for each dataset submitted should meet the guidelines in the description of each field listed below, and metadata completeness will be assessed during the dataset publication process using both automated and manual review workflows. Ensuring that your dataset has complete metadata before requesting publication will expedite the publication process.
Dataset metadata fields marked with a red asterisk (*) are required to submit your dataset. All dataset metadata fields are reviewed using the requirements outlined in the Description of each field.
Controlled and Prohibited Information
ESS-DIVE will not accept any datasets that include controlled and prohibited information categories (section D.2), such as Protected Information (i.e., Personally Identifiable Information [PII] and Protected Health Information [PHI]).
JSON-LD Fields
Datasets that are created or edited using the Dataset API must use the JSON-LD schema. The JSON-LD Field row for each metadata field below indicates the corresponding field name in the JSON-LD schema.
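As a rough illustration of how the JSON-LD field names on this page fit together, the sketch below assembles a small record in Python. Only the keys name, description, and keywords come from this page; the "@context" and "@type" values and the build_metadata helper are assumptions made for illustration, and the full structure expected by the Dataset API should be taken from the API documentation.

```python
import json


def build_metadata(title, abstract, keywords):
    """Assemble a minimal JSON-LD-style record (illustrative helper, not part of any API)."""
    return {
        # Assumed schema.org conventions; not stated on this page.
        "@context": "http://schema.org/",
        "@type": "Dataset",
        "name": title,            # Title field
        "description": abstract,  # Abstract field
        "keywords": keywords,     # Keywords field
    }


record = build_metadata(
    title="Raw sapflow and soil moisture data from January 2016-April 2016 in Manaus, Brazil",
    abstract="This dataset contains raw output from a data logger ...",  # shortened placeholder
    keywords=["Earth Science", "Land Surface", "Soils"],
)

# Serialize to a JSON string, as would be needed before sending it anywhere.
print(json.dumps(record, indent=2))
```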
Automated Checks
A set of automated checks is performed whenever a dataset is submitted. The results of these checks are compiled into an Assessment Report and used in the review process. The dataset submitter should address any failed automated checks or warnings before requesting publication.
Please note that assessment reports can take minutes, or up to 24 hours, to generate.
Working Offline? ESS-DIVE's Offline Metadata Template can be used to prepare your dataset metadata prior to submission. We recommend using the template to collaborate with your team members in Google Docs, then copying and pasting the completed fields into the ESS-DIVE Dataset Submission form when you are ready to create your dataset.
Overview
Title*
Format
Free Text
Description
Include a title of 7 to 20 words that conveys information such as the topic, geographic location, dates, and scale of the data. If the data are associated with a journal publication, the title may include the journal name.
Avoid unexplained acronyms or project-specific vocabulary. If there is an existing DOI for the data, use the same title.
Example
Raw sapflow and soil moisture data from January 2016-April 2016 in Manaus, Brazil
JSON-LD Field
name
Automated Check
Required: Dataset title is between 7 and 40 words in length
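To pre-check a title before submission, something like the following sketch may help. It assumes that words are counted by splitting on whitespace, which may not match how the ESS-DIVE automated check tokenizes titles; the function name and defaults are hypothetical, with the 7 and 40 word bounds taken from the automated check above.

```python
def title_word_count_ok(title: str, min_words: int = 7, max_words: int = 40) -> bool:
    """Return True if the title has between min_words and max_words words (inclusive).

    Words are counted by whitespace splitting, which is an assumption.
    """
    word_count = len(title.split())
    return min_words <= word_count <= max_words


example_title = (
    "Raw sapflow and soil moisture data from January 2016-April 2016 in Manaus, Brazil"
)
print(title_word_count_ok(example_title))
print(title_word_count_ok("Sapflow data"))
```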
Existing DOI & Alternate Identifier
Format
Free Text
Description
If this dataset has been previously published elsewhere, enter the DOI or alternate identifier. Identifiers are used to locate the dataset within your project's data management system and can provide pertinent contextual information for users. Ensure the identifier correctly leads to the dataset that you are submitting.
Example
http://dx.doi.org/XXXX
JSON-LD Field
alternateName
Abstract*
Format
Free Text
Description
The abstract should be at least 100 words in length, written in full sentences, and understandable to anyone who has not seen related manuscripts. Describe the content of the dataset and provide all necessary scientific context. Avoid unexplained acronyms or project-specific terms, and include specific details that promote the reproducibility of your data. These may include source data for synthesis work, software necessary to view the related files, ecosystem type involved, or measurement types. Include a statement about why these data were generated and the research question they are intended to answer.
Example
This dataset contains raw output from a data logger connected to 9 sapflow and 5 soil moisture sensors in Manaus, Brazil. The file xxx.dat contains raw data and the metadata file (BR-Ma2_E-fieldlog_20160501.xls) has information on locations where the sensors were installed and other sensor maintenance details. No data processing or QA/QC was done on the raw datasets. Processed data will be uploaded as separate datasets on ESS-DIVE. This research was performed as a part of the NGEE Tropics project, which aims to advance model predictions of tropical forest carbon cycle responses to a changing climate over the 21st Century.
JSON-LD Field
description
Automated Check
Required: Abstract is at least 100 words in length
Keywords*
Format
Description
Add a minimum of three total keywords or data variables. As you begin typing in the web form field, GCMD controlled vocabulary terms will appear in a dropdown list. Selecting from the GCMD controlled keywords where possible is encouraged but not required. You can also enter your own keywords. Ensure that keyword terms differ from words in the title to increase the findability of your dataset in searches.
Example
Earth Science, Land Surface, Soils
JSON-LD Field
keywords
Automated Check
Required: At least three keywords
Optional: Keywords differ from terms in dataset title
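The two keyword checks can be approximated locally with a sketch like the one below. The comparison rule (lowercased alphanumeric terms, exact match against title terms) is an assumption; the actual automated check may compare keywords and titles differently, and the function name is hypothetical.

```python
import re


def check_keywords(title, keywords):
    """Return (has_minimum, overlapping_keywords).

    has_minimum: True if at least three keywords are given.
    overlapping_keywords: keywords sharing at least one term with the title,
    using a simple lowercase term match (an assumed rule).
    """
    def terms(text):
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    title_terms = terms(title)
    overlapping = [kw for kw in keywords if terms(kw) & title_terms]
    return len(keywords) >= 3, overlapping


title = "Raw sapflow and soil moisture data from January 2016-April 2016 in Manaus, Brazil"
print(check_keywords(title, ["Earth Science", "Land Surface", "Soils"]))
print(check_keywords(title, ["Soil Moisture", "Sapflow"]))
```

With this simple rule, "Soils" does not count as overlapping with "soil" in the title because only exact term matches are compared; a stricter check could add stemming.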
Data Variables
Format
Description
Add variables to increase the findability of your dataset in searches. As with the keywords field, selecting variable terms from the GCMD controlled vocabulary where possible is encouraged but not required.
Example
Soil Moisture
JSON-LD Field
variableMeasured
Publication Date
Format
YYYY or YYYY-MM-DD
Description
Specify a custom date or year when this dataset can be made publicly available. If this is not specified, it will default to the current date.
Example
2019 or 2019-04-19
JSON-LD Field
datePublished
Automated Check
Required: Publication date is present
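The two accepted formats, YYYY and YYYY-MM-DD, can be checked before submission with a sketch such as the following. The function name is hypothetical, and the strictness shown here (zero-padded month and day, real calendar dates only) is an assumption about what the form or API accepts.

```python
import re
from datetime import datetime


def is_valid_publication_date(value: str) -> bool:
    """Return True for a four-digit year or a real calendar date in YYYY-MM-DD form."""
    if re.fullmatch(r"\d{4}", value):
        return True
    if re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
        try:
            # strptime rejects impossible dates such as month 13.
            datetime.strptime(value, "%Y-%m-%d")
            return True
        except ValueError:
            return False
    return False


for candidate in ["2019", "2019-04-19", "2019-13-01", "04/19/2019"]:
    print(candidate, is_valid_publication_date(candidate))
```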
Usage Rights*
Format
Select choice
Description
Choose how you wish your data to be shared and reused. Creative Commons Attribution (CC BY 4.0) requires that the dataset be cited by anyone using the data. Creative Commons Public Domain (CC0 1.0) dedicates the data to the public domain without restriction. When using the API, enter the URL for the selected license.
Example
Select Creative Commons Attribution (CC BY 4.0) or Creative Commons Public Domain (CC0 1.0)
JSON-LD Field
license
Automated Check
Optional: Usage rights is set to Creative Commons CC-BY license
Project*
Online Form only
Format
Controlled List
Description
Select the DOE project name from the drop-down list, which appears when you start typing the project name or Principal Investigator (PI) name. If multiple projects were involved, enter the project that made the largest contribution to this dataset.
Example
Next-Generation Ecosystem Experiments (NGEE) Tropics [PI: Jeffrey Chambers]
JSON-LD Field
provider
Automated Check
Required: Project name from controlled list
API Only
Format
Value
Description
Example
JSON-LD Field
Funding Organization*
Format
Controlled List or Free Text
Description
List the organizations that funded the work. When using the web form, you can choose from the drop-down list that appears as you begin to enter the funding organization.
Example
[Example from dropdown list]: U.S. DOE > Office of Science > Biological and Environmental Research (BER)
JSON-LD Field
funder
Automated Check
Optional: Funding organization "U.S. DOE > Office of Science > Biological and Environmental Research (BER)" is present
DOE Contracts
Format
Controlled List or Free Text
Description
List the number of each DOE contract under which the data in the package were funded. Enter "NONE" if no DOE funding applies. If the dataset is the result of a joint effort between two or more DOE Site/Facility Management Contractors, additional DOE contract numbers may be entered.
Example
AC0205CH11231
JSON-LD Field
award
Related References
Format
Free Text
Description
Include the full citations and DOIs of datasets or publications associated with your dataset. These related materials allow users to learn more about the dataset, processing methods, or how the data were used.
Example
Somebody J. (2018), Sapflow and soil moisture coupling in the Amazon, Journal. doi: xx.xxxx
JSON-LD Field
citation
People
Contact*
Format
Free Text
Description
List the person who should be contacted by users seeking further information for the data. Only one contact is allowed. Including the ORCID of this individual is strongly encouraged.
Example
First name, Last name, Organization, Email, ORCID (strongly encouraged)
JSON-LD Field
editor
Automated Check
Required: Contact is present
Required: Contact ORCID is provided
Creators
Format
Free Text
Description
Include the main researchers involved in producing the data, such as authors, owners, originators, and principal investigators. List creators in the order they should appear in the dataset citation. At least one creator is required, and including email addresses is highly encouraged.
Example
First name, Last name, Organization, Email, ORCID (not required for creators)
JSON-LD Field
creator
Automated Check
Required: At least one creator is present
Contributors
Format
Free Text
Description
List any additional contributors involved in producing the data. These may include people who assisted in creating the dataset but are not considered authors. Contributors will not appear in the data citation. Including email addresses is highly encouraged.
Example
First name, Last name, Organization, Email, ORCID (not required for contributors)
JSON-LD Field
contributor
Dates
Start Date
Format
YYYY-MM-DD
Description
Earliest date of data collection included in the dataset.
Example
2017-04-16
JSON-LD Field
temporalCoverage
Automated Check
Required: Start date is present
End Date
Format
YYYY-MM-DD
Description
Last date of data collection included in the dataset. This field can be left blank if your dataset is open ended.
Example
2019-07-13
JSON-LD Field
temporalCoverage
Automated Check
Required: End date is present
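A quick local sanity check for the start and end dates might look like the sketch below. It assumes both dates are supplied as YYYY-MM-DD strings and that the start date should not fall after the end date; the function name is hypothetical, and how the two values are combined in temporalCoverage is not covered here.

```python
from datetime import date


def date_range_ok(start: str, end: str) -> bool:
    """Return True if both strings parse as ISO dates and start is not after end."""
    try:
        start_date = date.fromisoformat(start)
        end_date = date.fromisoformat(end)
    except ValueError:
        # Either value is not a valid calendar date in ISO form.
        return False
    return start_date <= end_date


print(date_range_ok("2017-04-16", "2019-07-13"))
print(date_range_ok("2019-07-13", "2017-04-16"))
```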
Locations
Geographic Description
Format
Free Text
Description
A short description of the location(s) where the data were collected. This may include the location name, known identifiers if associated with a specific project (e.g., AmeriFlux site name), and the ecosystem type involved. Multiple geographic descriptions can be added if necessary. A complete geographic description will increase the findability of your dataset, as all terms entered are searchable through the data portal.
Example
Br-Ma2, Manaus, Brazil: ZF2 K34 Tower. Eddy covariance site established in 1999 on kilometer 34 of the ZF2 highway. It was later expanded into an atmospheric and soil sampling hub. It is a 1.5 m x 2.5 m section aluminum tower, 50 m tall, on a medium-sized plateau (Araujo et al., 2002).
JSON-LD Field
spatialCoverage/description
Bounding Box Coordinates
Format
Latitude and Longitude in WGS 84 decimal degrees
Description
Latitude and longitude of the location(s) these data represent, in WGS 84 decimal degrees. Enter a single coordinate pair for a point location, or bounding box coordinates for non-point locations. Verify that the coordinates are accurate before submitting your dataset. If the data location is better represented by a shape, you may also include a KML file in the file uploads.
Example
Northwest Coordinates [Lat Long]/Southeast Coordinates [Lat Long]
JSON-LD Field
spatialCoverage
Automated Check
Optional: Coordinates describing the point location or geographic area of the dataset are present
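Since coordinate accuracy matters here, a simple range and ordering check can catch swapped or out-of-range values before submission. The sketch below assumes coordinates are given as (latitude, longitude) tuples in decimal degrees and that the bounding box does not cross the antimeridian; the function names and the sample coordinates are illustrative only.

```python
def valid_point(lat: float, lon: float) -> bool:
    """Return True if latitude and longitude fall within valid decimal-degree ranges."""
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


def valid_bounding_box(northwest, southeast) -> bool:
    """Return True if both corners are valid and the northwest corner lies
    north of (or level with) and west of (or level with) the southeast corner.

    Assumes the box does not cross the antimeridian (+/-180 degrees longitude).
    """
    nw_lat, nw_lon = northwest
    se_lat, se_lon = southeast
    if not (valid_point(nw_lat, nw_lon) and valid_point(se_lat, se_lon)):
        return False
    return nw_lat >= se_lat and nw_lon <= se_lon


# Illustrative values only, not taken from any real dataset.
print(valid_bounding_box((-2.5, -60.3), (-2.7, -60.1)))
print(valid_bounding_box((-2.7, -60.1), (-2.5, -60.3)))  # corners swapped
```

A single point can be checked with valid_point alone, or by passing the same pair as both corners.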
Methods
Methods
Format
Free Text
Description
Methods for a dataset should cover all aspects of dataset production and should be thorough enough for your work to be reproduced. Include descriptions of the experimental design, laboratory and/or field collection methods (e.g., observations and/or devices used), source data for synthesis studies, data processing, QA/QC procedures, and known issues or limitations of the data where applicable. A complete methods section will improve the findability of your data, as all text entered into methods is also searchable through the data portal filters.
You may provide a citation for any methods used that were published previously, but methods related to data production must still be included.
Example
JSON-LD Field
measurementTechnique
Automated Check
Required: Methods description is more than 7 words in length
Additional Automated Checks
The checks below are run on each dataset upon submission as part of the ESS-DIVE automated check suite. Informational checks appear on the assessment reports and are not pass/fail.
URLs in metadata resolve correctly (Required; Findable)
Data file formats are non-proprietary (Optional; Reusable)
Informational: Number of creators with email addresses provided (Informational; Findable)
Informational: Number of contacts with email addresses provided (Informational; Findable)
Informational: Count of data entities present (Informational; Interoperable)