Goodwell et al., 2026
📧 goodwel2@illinois.edu
This repository contains code and datasets associated with the manuscript:
“Detecting regimes of critical zone processes, drivers, and predictability with a data-driven framework”
📄 *Accepted at* AGU Advances (March 2026)
Time-series variables in the DATA folder are processed or clipped versions of the original datasets listed below.
- GC Flux Tower Data: https://www.hydroshare.org/resource/0ef3eda3534f44a6bbd65786d57222ea/
- US-Kon AmeriFlux Data: https://ameriflux.lbl.gov/sites/siteinfo/US-Kon
- Monticello RiverLab Data: https://www.hydroshare.org/resource/2c6c1d02c3ec4b97a767c787e1889647/
- Orgeval RiverLab Data: https://doi.org/10.15454/9PUYPN
- Nebraska Root–Soil Gas Flux Data: https://www.hydroshare.org/resource/405c8669069147b690c04b7063cad6ce/
This codebase implements a data-driven framework for identifying and interpreting temporal regimes in multivariate environmental time series:
- Gaussian Mixture Model (GMM) clustering
  - Identifies distinct temporal regimes in multivariate datasets.
- Principal Component Analysis (PCA)
  - Characterizes dominant modes of variability within each regime. The variance explained by the first two principal components feeds into a "potential predictability" metric for each cluster.
- Information Theory (IT) metrics
  - Applied to the principal component (PC) projections to identify dominant predictors of critical zone (CZ) dynamics, either within each cluster or for a dataset as a whole. Dominant predictors are defined as the two time-series variables that jointly reduce the most uncertainty in either PC projection, where the projections represent the reduced dynamics of the response system.
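The dominant-predictor idea can be illustrated with a minimal, standard-library sketch. It is not the repository's implementation: the function names (`bin_series`, `joint_mutual_information`, `dominant_pair`), the equal-width binning, the choice of 5 bins, and the synthetic variables are all assumptions made for illustration. The `target` list stands in for a PC projection that would normally come from the PCA step.

```python
import math
import random
from collections import Counter
from itertools import combinations


def bin_series(values, n_bins=5):
    """Map each value to an equal-width bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant series
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def entropy(symbols):
    """Shannon entropy (bits) of a sequence of hashable symbols."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())


def joint_mutual_information(x1, x2, y):
    """I(X1, X2; Y) = H(X1, X2) + H(Y) - H(X1, X2, Y), in bits."""
    pair = list(zip(x1, x2))
    triple = list(zip(x1, x2, y))
    return entropy(pair) + entropy(y) - entropy(triple)


def dominant_pair(sources, target, n_bins=5):
    """Return the source pair that removes the largest fraction of H(target)."""
    binned = {name: bin_series(vals, n_bins) for name, vals in sources.items()}
    y = bin_series(target, n_bins)
    h_y = entropy(y)
    if h_y == 0:
        raise ValueError("target has zero entropy after binning")
    scores = {
        (a, b): joint_mutual_information(binned[a], binned[b], y) / h_y
        for a, b in combinations(sorted(binned), 2)
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Synthetic stand-in data (hypothetical, not from the repository): the target
# depends on "radiation" and "soil_moisture" but not on "noise".
rng = random.Random(0)
n = 2000
sources = {
    "radiation": [rng.random() for _ in range(n)],
    "soil_moisture": [rng.random() for _ in range(n)],
    "noise": [rng.random() for _ in range(n)],
}
target = [r + s for r, s in zip(sources["radiation"], sources["soil_moisture"])]

pair, fraction = dominant_pair(sources, target)
print(pair, round(fraction, 2))
```

In this sketch the score is the joint mutual information normalized by the target entropy, so it reads as the fraction of uncertainty in the target that the pair removes. Histogram-based estimates of this kind are sensitive to the number of bins and the sample size, so the values should be treated as illustrative only.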
- The `DATA/` folder contains processed datasets used in the analyses.
- The `Processed/` subfolder includes outputs generated by the DataPrep scripts, which assemble and harmonize the original data files for each case study.
- Analysis scripts automatically save generated figures to the `FIGS/` folder.
- Example figures are included.
- Core module containing all functions for:
  - Clustering
  - Dimensionality reduction
  - Information theory metrics
  - Figure generation
- Imported by all analysis scripts.
- Case-study–specific data preparation scripts that concatenate, gap-fill (minor gaps only), and align time-series data for input into the GMM-PCA-IT analysis codes.
- Case-study–specific analysis scripts, including:
  - Flux tower datasets
  - Stream solute concentration datasets
  - Root–soil gas concentration datasets
  - Associated meteorological and soil drivers
- Outputs:
  - Multiple figures (saved to `FIGS/`)
  - A `.csv` file summarizing information theory results