cinet-cluster/CZ_ProcessesDriversPredictability

CZ_ProcessesDriversPredictability

DOI

Goodwell et al., 2026
📧 goodwel2@illinois.edu


Overview

This repository contains code and datasets associated with the manuscript:

“Detecting regimes of critical zone processes, drivers, and predictability with a data-driven framework”

📄 Accepted at AGU Advances (March 2026)


Data Sources

Time-series variables in the DATA folder are processed or clipped versions of the original datasets listed below.


Methodology

This codebase implements a data-driven framework for identifying and interpreting temporal regimes in multivariate environmental time series:

  1. Gaussian Mixture Model (GMM) clustering

    • Identifies distinct temporal regimes in multivariate datasets.
  2. Principal Component Analysis (PCA)

    • Characterizes dominant modes of variability within each regime. The explained variances of the first two principal components feed into a "potential predictability" metric for each cluster.
  3. Information Theory (IT) metrics

    • Applied to PC projections to identify dominant predictors of CZ dynamics in each cluster, or for a given dataset. Dominant predictors are defined as the two time-series variables that jointly reduce the most uncertainty in either principal component projection (which represents the reduced response system dynamics).
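
The three steps above can be sketched on synthetic data as follows. This is a minimal illustration built on scikit-learn and NumPy, not the code in cluster_funcs.py: the synthetic data, the number of clusters, the histogram-based estimator in the hypothetical helper `joint_information`, and the use of the summed explained variance of the first two PCs as a stand-in for the "potential predictability" metric are all assumptions made for the example.

```python
from itertools import combinations

import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a multivariate time series: two regimes, four variables.
rng = np.random.default_rng(0)
n = 300
X = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(n, 4)),
    rng.normal(loc=5.0, scale=1.0, size=(n, 4)),
])
X_std = StandardScaler().fit_transform(X)

# Step 1: GMM clustering assigns each time step to a regime.
gmm = GaussianMixture(n_components=2, random_state=0).fit(X_std)
labels = gmm.predict(X_std)


def joint_information(x1, x2, y, bins=5):
    """Estimate I(X1, X2; Y) in bits from a 3-D histogram (hypothetical helper)."""
    counts, _ = np.histogramdd(np.column_stack([x1, x2, y]), bins=bins)
    p_xyz = counts / counts.sum()
    p_xx = p_xyz.sum(axis=2, keepdims=True)      # joint pdf of the two sources
    p_y = p_xyz.sum(axis=(0, 1), keepdims=True)  # pdf of the target
    mask = p_xyz > 0
    ratio = p_xyz[mask] / (p_xx * p_y)[mask]
    return float(np.sum(p_xyz[mask] * np.log2(ratio)))


results = {}
for k in np.unique(labels):
    Xk = X_std[labels == k]

    # Step 2: PCA within the regime; keep the first two components.
    pca = PCA(n_components=2).fit(Xk)
    pcs = pca.transform(Xk)
    explained_2pc = float(pca.explained_variance_ratio_.sum())  # assumed stand-in metric

    # Step 3: find the variable pair sharing the most information with PC1.
    scores = {
        pair: joint_information(Xk[:, pair[0]], Xk[:, pair[1]], pcs[:, 0])
        for pair in combinations(range(Xk.shape[1]), 2)
    }
    best_pair = max(scores, key=scores.get)
    results[int(k)] = {
        "explained_2pc": explained_2pc,
        "best_pair": best_pair,
        "info_bits": scores[best_pair],
    }

for k, res in results.items():
    print(k, res)
```

The same pair search can be repeated with `pcs[:, 1]` as the target to cover the second principal component. Histogram estimators are sensitive to the bin count and sample size, so the values here are only indicative.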

Repository Structure

📁 DATA/

  • Contains processed datasets used in the analyses.
  • The Processed/ subfolder includes outputs generated by DataPrep scripts that assemble and harmonize original data files for each case study.

📁 FIGS/

  • Analysis scripts automatically save generated figures to this folder.
  • Example figures are included.

Code Description

🔧 cluster_funcs.py

  • Core module containing all functions for:
    • Clustering
    • Dimensionality reduction
    • Information theory metrics
    • Figure generation
  • Imported by all analysis scripts.

📊 DataPrep_*.py

  • Case-study–specific data preparation scripts that concatenate the source files, fill minor gaps, and align the time series for input to the GMM-PCA-IT analysis codes.
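
As a rough illustration of that kind of preparation, the sketch below concatenates two pieces of a record, fills a single-step gap, and aligns the result with a driver series using pandas. The variable names (`flux`, `air_temp`), the hourly time step, and the one-step interpolation limit are assumptions for the example and do not come from the DataPrep scripts.

```python
import numpy as np
import pandas as pd

hour = pd.Timedelta(hours=1)

# Two consecutive pieces of a hypothetical flux record, with gaps.
part1 = pd.DataFrame(
    {"flux": [1.0, 2.0, np.nan, 4.0, 5.0, 6.0]},
    index=pd.date_range("2020-01-01 00:00", periods=6, freq=hour),
)
part2 = pd.DataFrame(
    {"flux": [7.0, np.nan, np.nan, np.nan]},
    index=pd.date_range("2020-01-01 06:00", periods=4, freq=hour),
)

# A hypothetical meteorological driver covering the full period.
met = pd.DataFrame(
    {"air_temp": np.linspace(10.0, 19.0, 10)},
    index=pd.date_range("2020-01-01 00:00", periods=10, freq=hour),
)

# Concatenate the pieces into one chronologically ordered record.
flux = pd.concat([part1, part2]).sort_index()

# Minor gap-filling: interpolate only interior gaps of at most one step.
flux = flux.interpolate(method="time", limit=1, limit_area="inside")

# Align on the shared timestamps and drop rows that are still incomplete.
aligned = flux.join(met, how="inner").dropna()
print(aligned)
```

The trailing gap in the second piece is left unfilled and dropped, since it has no later valid value to interpolate toward.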

📊 Analysis_*.py

  • Case-study–specific analysis scripts, including:
    • Flux tower datasets
    • Stream solute concentration datasets
    • Root–soil gas concentration datasets
    • Associated meteorological and soil drivers
  • Outputs:
    • Multiple figures (saved to FIGS/)
    • A .csv file summarizing information theory results
