Reporting Format Requirements

This page provides an overview of the review process for datasets utilizing reporting formats.

ESS-DIVE’s Reporting Formats are designed to make data and metadata published on ESS-DIVE more FAIR (Findable, Accessible, Interoperable, Reusable). Consistent formatting of data and metadata enables both machines and humans to better understand and reuse valuable data.

We use reporting formats to enable advanced search within data files. Specifically, the Fusion Database (Fusion DB) validates, extracts and indexes data within standardized files.

The contents of public data and metadata files successfully parsed by the Fusion DB are made searchable by the Deep Dive API, which is separate from the ESS-DIVE main search and Dataset API. This currently requires the use of the File Level Metadata (FLMD) and Comma Separated Values (CSV) Guidelines Reporting Formats. These reporting formats are widely applicable to data types stored on ESS-DIVE and ensure that data files are described through standardized metadata fields and are machine-readable. The Fusion DB provides feedback to the ESS-DIVE Publication Review Team if any requirements are not met. These requirements are outlined below. For more detailed documentation of all Reporting Formats, please visit the ESS-DIVE Workspace GitHub.

In the future, we plan to expand the Fusion DB to incorporate data-type-specific reporting formats and their associated automated validations.

Reporting Format Checks

A series of checks is performed during the publication review process for datasets using reporting formats. Checks listed as required are necessary for machine readability and parsing, whereas strongly recommended and optional checks are enhancements to metadata.

Example datasets that have passed all reporting format checks are available below.

| Check Name | Requirement Level | Description |
| --- | --- | --- |
| File Name | Required | File name uses only letters, numbers, and underscores. Do not include spaces and do not start with an underscore or hyphen. |
| File Description | Required | A brief description (minimum 10 characters) is provided. |
| Column or Row Name | Required | Column or row names use only letters, numbers, hyphens, and underscores. Do not include spaces, and do not start with an underscore, hyphen, or number. |
| Unit | Required | Unit is present. |
| Definition | Required | Description is present. |
| Character Set | Required | All characters are within the US-ASCII character set (without extensions) or UTF-8. |
| Delimiter | Required | File is comma-delimited and saved as a CSV file. |
| Data Matrix | Required | Contents of the data portion of the file are organized in a logical and readable matrix format. |
| Column or Row Name Orientation | Required | Orientation of the file is either horizontal or vertical. |
| Consistent Values | Required | Text and numeric data are not mixed within the same column or row. |
| Missing Value Codes | Required | All cells in the data matrix have a value, and missing data are represented with Missing Value Codes. |
| Temporal Data | Required | Date format follows the ISO 8601 standard (YYYY-MM-DD, to known precision) and time format follows Coordinated Universal Time (UTC) (YYYY-MM-DD hh:mm:ss, to known precision). |
| Spatial Data | Required | Geographic coordinates are provided in WGS84 decimal format. |
| File naming conventions for File Level Metadata and Data Dictionary files | Required | The dataset contains files with the suffixes `*_flmd.csv` and `*_dd.csv`. |
| Reporting Format Keywords | Required | ESS-DIVE reporting format keywords are used. The File Level Metadata reporting format keyword is required for the Fusion DB to identify, validate, and parse your dataset. |
| Standard | Strongly Recommended | ESS-DIVE Standard field terms for reporting formats are used. |
| Data Orientation | Optional | Check whether “horizontal” or “vertical” is provided within the File Level Metadata file. |
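
Several of the checks above can be approximated locally before a dataset is submitted. The following Python snippet is a minimal sketch under stated assumptions, not the Fusion DB's actual validation logic: the function names, the regular expressions, and the list of accepted date precisions are illustrative readings of the File Name, Column or Row Name, Temporal Data, and Spatial Data descriptions.

```python
import re
from datetime import datetime
from pathlib import Path

# Assumed reading of the File Name check: the part before the extension uses
# only letters, numbers, and underscores, and does not start with an underscore.
FILE_STEM_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9_]*")

# Assumed reading of the Column or Row Name check: letters, numbers, hyphens,
# and underscores, starting with a letter.
FIELD_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")

# Assumed interpretation of "to known precision": a value may stop at the
# year, month, day, minute, or second.
TEMPORAL_FORMATS = (
    "%Y",
    "%Y-%m",
    "%Y-%m-%d",
    "%Y-%m-%d %H:%M",
    "%Y-%m-%d %H:%M:%S",
)


def is_valid_file_name(file_name: str) -> bool:
    """Check the file name, ignoring the final extension (e.g. '.csv')."""
    stem = Path(file_name).stem
    return FILE_STEM_RE.fullmatch(stem) is not None


def is_valid_field_name(name: str) -> bool:
    """Check a column or row name against the assumed pattern."""
    return FIELD_NAME_RE.fullmatch(name) is not None


def is_valid_temporal(value: str) -> bool:
    """Check that a value is a real, zero-padded date or date-time."""
    for fmt in TEMPORAL_FORMATS:
        try:
            parsed = datetime.strptime(value, fmt)
        except ValueError:
            continue
        # The round trip rejects non-padded values such as '2021-1-5'.
        if parsed.strftime(fmt) == value:
            return True
    return False


def is_valid_coordinate(latitude: float, longitude: float) -> bool:
    """Check that decimal-degree coordinates fall within valid ranges."""
    return -90.0 <= latitude <= 90.0 and -180.0 <= longitude <= 180.0
```

For example, `is_valid_file_name("site_data_2021.csv")` and `is_valid_temporal("2021-06-30 14:05:00")` both return `True`, while `is_valid_field_name("1_temp")` returns `False` because the name starts with a number. Passing these local checks does not guarantee that a dataset will pass publication review.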

Example Datasets Using Reporting Formats and Successfully Parsed By Fusion Database
