# Large Data Support

ESS-DIVE now has a <mark style="color:green;">**Tier 2 data storage**</mark> service to support publishing very large, hierarchical datasets that can be accessed directly from our repository. ESS-DIVE uses <mark style="color:green;">**Globus**</mark>, a data transfer service, to make it easier to upload large data to ESS-DIVE. The Tier 2 and Globus services are set up offline with close assistance from the ESS-DIVE Team.

#### How to contact ESS-DIVE about publishing large data:

For large data support, please email us at <ess-dive-support@lbl.gov> with the following information about your data:

> 1. What's the total file volume of your dataset?
> 2. Approximately how many files are in your dataset and what's the range of file sizes?
> 3. Is the data structure hierarchical? If yes:
>    1. Can you easily flatten your data structure (i.e. move data out of folders)? Or
>    2. Can you compress the folders into ZIP files or will it be necessary to browse the folder hierarchies?
> 4. Where is your data stored currently (e.g. local desktop, cloud-based server, Google Drive)?
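
If you are unsure of the answers to questions 1 to 3, a short script can gather them from a local copy of your data. The following is a minimal sketch, not an ESS-DIVE tool: the function name `summarize_dataset` and the returned field names are illustrative assumptions, and it uses only the Python standard library.

```python
import os
from pathlib import Path


def summarize_dataset(root):
    """Walk `root` and collect total volume, file count, size range, and depth.

    Hypothetical helper for illustration; it is not part of any ESS-DIVE software.
    """
    root = Path(root)
    sizes = []
    max_depth = 0  # 0 means every file sits directly in `root` (a flat layout)
    for dirpath, _dirnames, filenames in os.walk(root):
        # Number of folder levels between `root` and the current directory
        depth = len(Path(dirpath).relative_to(root).parts)
        for name in filenames:
            path = Path(dirpath) / name
            if path.is_symlink():
                continue  # skip symbolic links so no file is counted twice
            sizes.append(path.stat().st_size)
            max_depth = max(max_depth, depth)
    return {
        "total_bytes": sum(sizes),
        "file_count": len(sizes),
        "smallest_file_bytes": min(sizes) if sizes else 0,
        "largest_file_bytes": max(sizes) if sizes else 0,
        "max_folder_depth": max_depth,
        "is_hierarchical": max_depth > 0,
    }
```

Calling `summarize_dataset("/path/to/your/dataset")` returns a dictionary whose values can be copied into your email. Folders that contain no files do not count toward `max_folder_depth`.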

## Globus: Upload Large Data

Globus (<https://www.globus.org/>) is a free, cloud-based data transfer service designed to move significant amounts of data. ESS-DIVE uses this service to move data from your local desktop or existing Globus endpoint to ESS-DIVE's storage services. Globus can be used to resolve common upload errors or as the default upload method for data greater than 500 GB.

To learn more about Globus and how to use it to publish data on ESS-DIVE or resolve upload issues, see our Globus documentation page.

<figure><img src="/files/AaWwY1cpNLavNaIdAUnk" alt="" width="375"><figcaption><p>Figure 1: The Globus file manager (pictured) is accessible via <br>browser and is used as the primary interface for transferring <br>data with Globus.  </p></figcaption></figure>

{% content-ref url="/pages/-MX-a6sa8lrluGrgpHq6" %}
[Globus Data Transfer Service](/programmatic-tools/large-data-support.md)
{% endcontent-ref %}

## Tier 2: Storage for Large Data

Tier 2 (Figure 2) is ESS-DIVE's extended storage resource for very large, hierarchical datasets. It is used instead of storing the data directly on ESS-DIVE's dataset landing pages, known as Tier 1 (Figure 3). Data greater than 500 GB in volume will be archived on Tier 2 by default. Tier 2 also lets users browse hierarchical folders in their browser before downloading.

Data stored on Tier 2 resources can be accessed and downloaded from the Tier 2 landing page (Figure 2). This is separate from ESS-DIVE's dataset landing page (Figure 3). You can choose to publish some or all of your dataset files on Tier 2.

Generally, data should be stored on Tier 1 whenever possible. ESS-DIVE is constantly expanding and improving features on Tier 1 that may not be supported on Tier 2. However, data smaller than 500 GB can be published on Tier 2 if necessary.

Any data contributor can take advantage of the Tier 2 service, even for datasets smaller than 500 GB. Please contact ESS-DIVE at <ess-dive-support@lbl.gov> to discuss whether your data is suitable for Tier 2 storage.
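
The guidance above can be summarized as a simple rule of thumb. The sketch below is illustrative only: the function `suggest_tier`, its parameters, and the interpretation of 1 GB as 10^9 bytes are assumptions, and the final decision is always made together with the ESS-DIVE Team.

```python
# Assumption: 1 GB is taken as 10**9 bytes; the source does not specify the unit base.
TIER2_DEFAULT_THRESHOLD_BYTES = 500 * 10**9


def suggest_tier(total_bytes, needs_folder_browsing=False):
    """Return a rough storage-tier suggestion for a dataset (hypothetical helper).

    total_bytes: total dataset volume in bytes.
    needs_folder_browsing: True if users must browse the folder hierarchy
        before downloading, which Tier 2 supports.
    """
    if total_bytes > TIER2_DEFAULT_THRESHOLD_BYTES:
        # Data greater than 500 GB is archived on Tier 2 by default
        return "Tier 2"
    if needs_folder_browsing:
        # Smaller data can use Tier 2 if necessary, after contacting support
        return "Tier 2 (discuss with ESS-DIVE support)"
    # Tier 1 is preferred whenever possible
    return "Tier 1"
```

For example, `suggest_tier(120 * 10**9)` suggests Tier 1, while the same volume with `needs_folder_browsing=True` suggests raising Tier 2 with the support team.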

<div><figure><img src="/files/BeZYxHhTw1eKQOclvorN" alt=""><figcaption><p>Figure 2: Tier 2 landing page for large file exploration and download. Access to dataset metadata on Tier 1 is provided via link.</p></figcaption></figure> <figure><img src="/files/kghFau5hM0CqMzxmRlvC" alt=""><figcaption><p>Figure 3: Tier 1 dataset landing page where metadata and data can be discovered and downloaded. Access to files on Tier 2 is provided via external link. </p></figcaption></figure></div>

Data contributors must use the Globus transfer service to upload their data to Tier 2. Once the data is uploaded through Globus, ESS-DIVE will organize it and add file metadata on the Tier 2 landing page. The data contributor will review and approve the data on Tier 2 prior to publication. At the time of publication, the data will be publicly accessible on the Globus "ESS-DIVE Public Share" collection as well as on the Tier 2 website. Additionally, external links to both Tier 2 and Globus will be added to the dataset metadata landing page for access and download (demonstrated in Figure 3).

### Management and Preservation of Tier 2 Data

ESS-DIVE stores redundant copies of data published on Tier 2 resources to preserve and provide long-term access to Environmental Systems Science (ESS) research data.

Please be aware that, at this time, data stored on Tier 2 has the following limitations:

1. It will not be linked to the DataONE federation,
2. It cannot be private, and
3. Its downloads and views will not be factored into data package statistics.

### How to Download Tier 2 Data

{% content-ref url="/pages/-MfndH11d-0IoWKMHn9P" %}
[Download Data](/searching-and-accessing-data/accessing-data.md)
{% endcontent-ref %}

