Reporting Format Requirements
This page provides an overview of the review process for datasets utilizing reporting formats.
ESS-DIVE’s Reporting Formats are designed to make data and metadata published on ESS-DIVE more FAIR (Findable, Accessible, Interoperable, Reusable). Consistent formatting of data and metadata enables both machines and humans to better understand and reuse valuable data.
We use reporting formats to enable advanced search within data files. Specifically, the Fusion Database (Fusion DB; documentation coming soon) validates, extracts and indexes data within standardized files.
The contents of public data and metadata files that the Fusion DB parses successfully are made searchable through the Deep Dive API, which is separate from the ESS-DIVE main search and the Dataset API. Parsing currently requires the File Level Metadata (FLMD) and Comma Separated Values (CSV) Guidelines Reporting Formats. These reporting formats are widely applicable to the data types stored on ESS-DIVE and ensure that data files are machine-readable and described through standardized metadata fields. If any requirements are not met, the Fusion DB provides feedback to the ESS-DIVE Publication Review Team. The requirements are outlined below. For more detailed documentation of all Reporting Formats, please visit the ESS-DIVE Community GitHub.
We plan to expand the Fusion DB to incorporate data-type-specific reporting formats and associated automated validations in the future.
Reporting Format Checks
A series of checks is performed during the publication review process for datasets using reporting formats. Checks listed as Required are necessary for machine readability and parsing, whereas Strongly Recommended and Optional checks are suggested enhancements to metadata.
Example datasets that have passed all reporting format checks are available below.
Check Name | Requirement Level | Description |
---|---|---|
File Name | Required | File name uses only letters, numbers, and underscores. Do not include spaces and do not start with an underscore or hyphen. |
File Description | Required | A brief description (minimum 10 characters) is provided |
Column or Row Name | Required | Column or row names use only letters, numbers, hyphens, and underscores. Do not include spaces, and do not start with an underscore, hyphen, or number. |
Unit | Required | Unit is present |
Definition | Required | Description is present |
Character Set | Required | All characters are within the US-ASCII character set (without extensions) or UTF-8 |
Delimiter | Required | The file uses a comma as its delimiter and is saved as a CSV file |
Data Matrix | Required | Contents of the data portion of the file are organized in a logical and readable matrix format |
Column or Row Name Orientation | Required | Orientation of the file is either horizontal or vertical |
Consistent Values | Required | Text and numeric data are not mixed within the same Column or Row |
Missing Value Codes | Required | All cells in the data matrix have a value and missing data are represented with Missing Value Codes |
Temporal Data | Required | Dates follow the ISO 8601 standard (YYYY-MM-DD, to known precision) and times are reported in Coordinated Universal Time (UTC) (YYYY-MM-DD hh:mm:ss, to known precision) |
Spatial Data | Required | Geographic coordinates are provided in WGS84 decimal format |
File naming conventions for File Level Metadata and Data Dictionary files | Required | The dataset includes files named with the suffixes *_flmd.csv and *_dd.csv. |
Reporting Format Keywords | Required | ESS-DIVE reporting format keywords are used. The File Level Metadata reporting format keyword is required for the Fusion DB to identify, validate and parse your dataset. |
Standard | Strongly Recommended | ESS-DIVE Standard field terms for reporting formats are used |
Data Orientation | Optional | Orientation (“horizontal” or “vertical”) is provided within the File Level Metadata file |
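
Several of the required checks above can be tested locally before a dataset is submitted. The Python sketch below is illustrative only and is not the Fusion DB's validation code. All function names are hypothetical, and it rests on a few assumptions: the File Name rule is applied to the part of the name before the extension, "to known precision" is read as allowing truncated forms such as YYYY or YYYY-MM-DD hh:mm, and only five precision levels are accepted.

```python
import os
import re
from datetime import datetime

# Hypothetical helpers; they are NOT part of any ESS-DIVE tool.

# File Name check: letters, numbers, and underscores only, not starting
# with an underscore. Assumption: the rule applies to the name without
# its extension.
_FILE_STEM = re.compile(r"[A-Za-z0-9][A-Za-z0-9_]*")

# Column or Row Name check: letters, numbers, hyphens, and underscores,
# not starting with an underscore, hyphen, or number.
_COLUMN_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")

# Temporal Data check: formats keyed by string length, so that
# non-zero-padded values such as "2023-1-5" are rejected.
# Assumption: these five precision levels are the accepted ones.
_TEMPORAL_FORMATS = {
    4: "%Y",
    7: "%Y-%m",
    10: "%Y-%m-%d",
    16: "%Y-%m-%d %H:%M",
    19: "%Y-%m-%d %H:%M:%S",
}


def is_valid_file_name(name):
    stem, _extension = os.path.splitext(name)
    return _FILE_STEM.fullmatch(stem) is not None


def is_valid_column_name(name):
    return _COLUMN_NAME.fullmatch(name) is not None


def is_valid_temporal(value):
    fmt = _TEMPORAL_FORMATS.get(len(value))
    if fmt is None:
        return False
    try:
        datetime.strptime(value, fmt)
    except ValueError:
        return False
    return True


def is_valid_coordinate(latitude, longitude):
    # Spatial Data check: decimal degrees within the valid WGS84 ranges.
    return -90.0 <= latitude <= 90.0 and -180.0 <= longitude <= 180.0


def has_required_metadata_files(file_names):
    # Dataset must include a *_flmd.csv file and a *_dd.csv file.
    has_flmd = any(n.endswith("_flmd.csv") for n in file_names)
    has_dd = any(n.endswith("_dd.csv") for n in file_names)
    return has_flmd and has_dd
```

For example, `is_valid_column_name("1st_reading")` returns `False` because the name starts with a number, and `is_valid_temporal("07/15/2023")` returns `False` because the value is not in YYYY-MM-DD form. Passing these local checks does not guarantee that a dataset will pass publication review.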
Example Datasets Using Reporting Formats and Successfully Parsed By Fusion Database
Roley et al., (2023) Data and scripts associated with "Coupled primary production and respiration in a large river contrasts with smaller rivers and streams." doi:10.15485/1985922
Jastrow et al., (2022) Spatially Averaged Ice Contents of Ice-Wedge Polygon Cross-Sections to 3-m Depth, July 2013, Utqiagvik, Alaska doi:10.15485/1876898
Kaufman et al., (2023) Spatial Study 2022: Water Column, Sediment, and Total Ecosystem Respiration Rates across the Yakima River Basin, Washington, USA doi:10.15485/1987520
Gooseff et al., (2023) Riverbed and Near-Surface Water Quality Data, Hanford Reach, Columbia River, February 2021 - April 2022 doi:10.15485/2204421
Hassett et al., (2023) Carbon flux measurements from chambers collected between July to October 2022 at Old Woman Creek, Huron, Ohio doi:10.15485/2229438
Stolze et al., (2024) Aerobic respiration controls on shale weathering, Geochimica et Cosmochimica Acta, 2023: Dataset doi:10.15485/1987859
Wang et al., (2024) Continuous soil temperature measurements from 2019-10-4 to 2020-10-4, Teller road Mile 27, Seward Peninsula, Alaska doi:10.15485/2301692
Sala et al., (2024) Plot and Tree Characteristics from the 2022-2023 field experiment at Game Ridge, Missoula County, Montana, USA doi:10.15485/2371850
Williams et al., (2024) Anion Data for the East River Watershed, Colorado (2014-2023) doi:10.15485/1668054