Data products, code, and reproducibility

Data

Open downloads

The canonical distribution is the versioned Zenodo deposit (DOI 10.5281/zenodo.19823584). Four download options are listed below; pick the subset you need. All files are MIT-licensed.

Everything (single zip) · 178 MB
nps-open-climate-data-v1.0.0-all.zip

The three archives below bundled in one download. Use this for offline analysis.

Daily CSVs (gzipped) · 150 MB
nps-open-climate-data-v1.0.0-daily.zip

Raw 1980–2025 daily series per park in DAYMET / ERA5 native units (K, m, kg/m², W/m², Pa); a unit-conversion sketch follows the Per-format usage list below. Multipart parks ship one .csv.gz per polygon.

Park summaries (JSON) · 24 MB
nps-open-climate-data-v1.0.0-summary.zip

Per-park summary JSONs — annual + seasonal aggregates, Mann–Kendall / Theil–Sen trends, monthly decomposition, climate stripes — plus parks.json index. Temperatures in °C.

Park boundaries (GeoJSON) · 3 MB
nps-open-climate-data-v1.0.0-boundaries.zip

PAD-US 4.1 proclamation polygons dissolved per park, simplified to 50 m, in WGS84. Includes all_parks.geojson FeatureCollection.

Per-format usage

  • Unzip the archive — every zip extracts to a single nps-open-climate-data-v1.0.0/ folder:
    unzip nps-open-climate-data-v1.0.0-summary.zip
  • Daily CSVs are gzipped — read directly with pandas (auto-detects the .gz extension) or decompress first:
    # in Python
    import pandas as pd
    df = pd.read_csv("daily/yellowstone/yellowstone.csv.gz")
    
    # from the shell
    gunzip daily/yellowstone/yellowstone.csv.gz
  • Summary JSONs are plain JSON; load with any standard parser. Schema lives on the Methodology page.
    import json
    with open("summary/yellowstone.json") as f:
        summary = json.load(f)
    print(summary["headline_trends"]["tmean_c"])
  • GeoJSON boundaries open natively in QGIS, geopandas, Mapbox, Leaflet, etc.
    import geopandas as gpd
    gdf = gpd.read_file("boundaries/yellowstone.geojson")
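
If you work in conventional units, convert the daily columns on load. A minimal sketch, assuming ERA5-Land-style column names such as temperature_2m (K) and total_precipitation_sum (m); inspect the actual headers in your file first:

python
# Hedged example: the column names below are illustrative, not guaranteed --
# check df.columns for the real headers in your download.
import pandas as pd

df = pd.read_csv("daily/yellowstone/yellowstone.csv.gz")

df["tmean_c"]   = df["temperature_2m"] - 273.15           # K -> deg C
df["precip_mm"] = df["total_precipitation_sum"] * 1000.0  # m -> mm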

Programmatic helper (Python)

The Python package ships a small Zenodo download helper. It pulls the archive on first use, caches it under ~/.cache/nps_climate_data/, and returns ready-to-use objects:

python
# Pure-stdlib helper that pulls + caches the Zenodo archives, then
# returns a pandas DataFrame for any park's daily series:
import nps_climate_data as nps

df  = nps.fetch_daily("yellowstone")     # raw daily CSV → DataFrame
sj  = nps.fetch_summary("yellowstone")   # summary JSON → dict
arc = nps.fetch_archive("boundaries")    # downloads + extracts the zip

# Caches under ~/.cache/nps_climate_data so subsequent calls are local.

For agents

Point an AI assistant at /llms.txt for a structured pointer to the dataset, download URLs, Python helpers, schema, and limitations. The Python helper (nps.fetch_daily(slug), etc.) is the simplest path for an agent to actually fetch a park's series — it handles the Zenodo download, gunzip, and parsing in one call. The same information is also exposed as schema.org/Dataset JSON-LD on this page and the home page so structured-data crawlers pick it up too.

Single-park lookups (live site)

The site also serves individual files at predictable URLs — handy for fetching one park without downloading the whole Zenodo archive.

Per-park summary JSON
/NPS-Open-Climate-Data/data/parks/<slug>.json

Single park, served live. Same shape as the Zenodo summary archive.

Per-park boundary GeoJSON
/NPS-Open-Climate-Data/data/boundaries/<slug>.geojson

PAD-US 4.1 polygon for one park, in WGS84.

All-parks boundary FeatureCollection
/NPS-Open-Climate-Data/data/boundaries/all_parks.geojson

All 63 parks in one FeatureCollection, with each park's headline warming slope exposed as feature properties.
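
A minimal live-fetch sketch; the base URL below is an assumption about the deployment host, so substitute the actual site origin:

python
# Hedged example: BASE is an assumed origin -- replace it with the host
# actually serving this site.
import json
from urllib.request import urlopen

BASE = "https://example.github.io/NPS-Open-Climate-Data"  # assumption

with urlopen(f"{BASE}/data/parks/yellowstone.json") as resp:
    summary = json.load(resp)
print(summary["headline_trends"]["tmean_c"])

# The all-parks FeatureCollection reads straight from its URL:
import geopandas as gpd
gdf = gpd.read_file(f"{BASE}/data/boundaries/all_parks.geojson")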

Use responsibly

Verify before you cite

These products are MIT-licensed and audited internally against literature anchors and arithmetic / range / consistency checks (see Methodology → QC and docs/DATA_QC.md). They are derived products — polygon-averaged ERA5-Land and DAYMET fields aggregated over PAD-US 4.1 boundaries, not station observations. Before you cite a number:

  1. Re-derive the value. Re-run the pipeline from raw, or independently compute the variable of interest from DAYMET / ERA5-Land directly, and confirm it matches (see the first sketch below).
  2. Cross-check stations. Compare against NOAA NCEI, NWS, or NPS monitoring records for the specific park and variable.
  3. Correct your statistics. Apply autocorrelation handling (Hamed–Rao MK) and multiple-comparisons correction (Benjamini–Hochberg FDR or Bonferroni). The deployed significance flags do neither (see the second sketch below).
  4. Read the variable-specific limitations. See Methodology → Limitations, especially pet_mm (ERA5-Land overestimate vs FAO Penman–Monteith), high-elevation cold bias, and area-naive multipart averaging.
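
For step 1, a minimal re-derivation sketch using the Earth Engine Python API. The DAYMET asset ID is the public EE catalog entry, but the band, scale, and boundary handling here are illustrative assumptions rather than the pipeline's exact settings:

python
# Hedged sketch, not the pipeline's exact configuration: recompute a
# polygon-mean DAYMET series to compare against the published daily CSV.
import ee, json

ee.Initialize()

# Boundary from the boundaries archive; handle Feature or FeatureCollection.
gj = json.load(open("boundaries/yellowstone.geojson"))
feat = gj["features"][0] if gj.get("type") == "FeatureCollection" else gj
geom = ee.Geometry(feat["geometry"])

daymet = (ee.ImageCollection("NASA/ORNL/DAYMET_V4")  # public EE catalog ID
          .filterDate("2020-01-01", "2020-02-01")
          .select("tmax"))                           # DAYMET tmax is in deg C

def park_mean(img):
    # Polygon-average one daily image; 1 km roughly matches DAYMET's grid.
    val = img.reduceRegion(ee.Reducer.mean(), geom, scale=1000).get("tmax")
    return ee.Feature(None, {"date": img.date().format("YYYY-MM-dd"),
                             "tmax_c": val})

sample = ee.FeatureCollection(daymet.map(park_mean)).getInfo()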

Cite the methodology page, run your own QC for your use case, and treat the published numbers as a starting point.
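
For step 3, a minimal correction sketch, assuming the pymannkendall and statsmodels packages; the "annual" / "tmean_c" keys used to build the input series are assumed schema details, so check them against the Methodology page:

python
# Hedged sketch: Hamed-Rao modified Mann-Kendall per park, then
# Benjamini-Hochberg FDR correction across the family of tests.
import json
import pymannkendall as mk
from statsmodels.stats.multitest import multipletests

slugs = ["yellowstone", "glacier", "saguaro"]  # illustrative subset
series = {}
for s in slugs:
    with open(f"summary/{s}.json") as f:
        summary = json.load(f)
    series[s] = [year["tmean_c"] for year in summary["annual"]]  # assumed keys

# Autocorrelation-aware trend test (Hamed & Rao 1998 variance correction).
results = {s: mk.hamed_rao_modification_test(v) for s, v in series.items()}

# Multiple-comparisons correction across parks.
reject, p_adj, _, _ = multipletests([r.p for r in results.values()],
                                    alpha=0.05, method="fdr_bh")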

Reproduce

End-to-end pipeline

The full auth → EE export → Drive pull → analysis → site build flow lives in pipeline.ipynb. This is the condensed terminal version.

bash
git clone https://github.com/anniebritton/NPS-Open-Climate-Data
cd NPS-Open-Climate-Data
pip install -e .

# 1. Submit serverless EE export tasks (needs a GCP project with EE enabled).
#    Runs on Google's servers; close the laptop after submit.
python -c "import ee; ee.Authenticate(auth_mode='localhost'); ee.Initialize(project='YOUR_PROJECT')"
python scripts/01_export_all_parks.py --start 1980-01-01

# 2. When the tasks show COMPLETED at code.earthengine.google.com/tasks,
#    pull the CSVs from Drive (needs gcloud ADC with Drive scope).
python scripts/07_download_from_drive.py --drive-folder NPS_Climate_Data

# 3. Aggregate, run trend tests, write site JSON.
python scripts/02_build_site_data.py

# 4. Extract real PAD-US boundaries from a local v4.1 GDB (optional —
#    committed boundaries are already in the repo).
python scripts/06_extract_padus_from_gdb.py
python scripts/05_generate_boundaries.py

# 5. Build and preview the static site locally.
cd site && npm install && npm run dev

API

Programmatic access

Direct Python access is available via the nps_climate_data package after pip install -e .; EE credentials must already be initialized.

python
import ee, nps_climate_data as nps
ee.Initialize()

# One park, one date range:
df = nps.get_data("Glacier National Park", "2020-01-01", "2025-01-01")

# Full-history fetch with multipart handling (dict of sub-unit -> DataFrame):
per_unit = nps.get_park_data("saguaro")

Cite

Suggested citation

Britton, A., & Pritchard, I. (2026). NPS Open Climate Data v1.0.0: Pre-processed climate trends for all 63 US National Parks [Data set]. Zenodo. https://doi.org/10.5281/zenodo.19823584