PIDReg: Explainable Multimodal Regression via Information Decomposition

Overview


Framework of Partial Information Decomposition for Multimodal Regression (PIDReg), illustrated with video and audio modalities, where $P(X_{1})$, $P(X_{2})$, and $P(Y)$ denote empirical data distributions that may deviate from Gaussianity (e.g., skewed or heavy-tailed).

PIDReg adaptively fuses two input modalities by learning how much each modality should contribute to prediction, rather than relying on fixed or hand-tuned weights. During training, PIDReg operates in the joint latent space of the modalities and decomposes the information into:

🔺 Unique Information
Modality-specific information that is captured by one modality but not the other.
🔶 Redundant Information
Overlapping information that both modalities carry, so either one could supply it.
🔴 Synergistic Information
Information that emerges only when both modalities are considered jointly, that is, interactions that neither modality can provide alone.

By explicitly estimating these components, PIDReg can (i) quantify what each modality is truly contributing, and (ii) weight modalities accordingly during regression.
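
For reference, the three kinds of information listed above are tied to ordinary mutual information by the standard PID consistency equations. The symbols $R$, $U_{1}$, $U_{2}$, and $S$ (redundancy, the two unique terms, and synergy) are notation assumed here for illustration. Since PIDReg works in a latent space, the quantities in the implementation are presumably computed on learned representations rather than on the raw inputs written below.

```latex
\begin{align}
I(X_{1}, X_{2}; Y) &= R + U_{1} + U_{2} + S, \\
I(X_{1}; Y)        &= R + U_{1}, \\
I(X_{2}; Y)        &= R + U_{2}.
\end{align}
```

Subtracting the last two equations from the first gives $S - R = I(X_{1}, X_{2}; Y) - I(X_{1}; Y) - I(X_{2}; Y)$. The system has three equations and four unknowns, so one component (commonly the redundancy) has to be defined separately before the others follow.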

Key Features

No.	Feature	Description
①	Dynamic Fusion Weights	Learns fusion weights via PID-based decomposition, removing the need for hand-crafted balancing.
②	Information Bottleneck	Enforces compact yet informative latent representations for each modality.
③	Conditional MI Minimization	Minimizes redundancy to enhance modality-specific informativeness.
④	Adaptive λ-Learning	Automatically adjusts information bottleneck strength through learnable λ parameters.
⑤	PID Stability Detection	Detects when the PID terms have stabilized and freezes the fusion weights to prevent overfitting.
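
The stability check in feature ⑤ can be pictured with a sliding window. The sketch below is illustrative only and is not taken from the repository: the class name `PIDStabilityDetector`, the `tol` threshold, and the rule "every term varies by at most `tol` over the last `window_size` epochs" are all assumptions, as is any link to the `--window_size` flag.

```python
from collections import deque


class PIDStabilityDetector:
    """Hypothetical detector: reports stability once every tracked PID term
    has varied by at most `tol` over the last `window_size` updates."""

    def __init__(self, window_size=5, tol=1e-3):
        self.window_size = window_size
        self.tol = tol
        # Keep only the most recent `window_size` snapshots.
        self.history = deque(maxlen=window_size)

    def update(self, pid_terms):
        """Record one epoch's PID terms (dict of name -> float).

        Returns True when the window is full and all terms are stable."""
        self.history.append(dict(pid_terms))
        if len(self.history) < self.window_size:
            return False
        for name in pid_terms:
            values = [snapshot[name] for snapshot in self.history]
            if max(values) - min(values) > self.tol:
                return False
        return True


# Usage: a redundancy estimate that settles after a few epochs.
detector = PIDStabilityDetector(window_size=3, tol=0.01)
trace = [0.50, 0.30, 0.205, 0.201, 0.200]
flags = [detector.update({"redundancy": r}) for r in trace]
print(flags)  # [False, False, False, False, True]
```

In a training loop, the first `True` would be the point at which the fusion weights stop being updated.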

Installation

Repository preparation:

git clone https://github.com/zhaozhaoma/PIDReg.git
cd PIDReg

Install dependencies:

pip install -r requirements.txt

Ensure that a CUDA-compatible PyTorch build is installed if you want GPU acceleration.

Data Format

To make PIDReg directly runnable, we include a curated sample dataset (./data/sample.csv). This file contains 5,000 rows subsampled from the original Superconductivity dataset used in our experiments. Each row in sample.csv uses the following column layout (indices are zero-based):

Index Range	Description	Modality
0–80	Feature dimensions for the first input source	Modality 1
81–166	Feature dimensions for the second input source	Modality 2
167	Continuous target variable used for regression	Target Variable
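
The layout above gives 81 columns for Modality 1, 86 for Modality 2, and one target column, 168 in total. A minimal sketch of splitting rows accordingly is shown below. The helper names `split_row` and `load_csv` are hypothetical, and whether sample.csv has a header row is an assumption exposed through the `has_header` argument. The repository's own `csv_data_loader.py` may handle this differently.

```python
import csv

N_MODALITY_1 = 81    # columns 0-80
N_MODALITY_2 = 86    # columns 81-166
TARGET_INDEX = 167   # last column


def split_row(row):
    """Split one CSV row into (modality 1 features, modality 2 features, target)."""
    values = [float(v) for v in row]
    if len(values) != TARGET_INDEX + 1:
        raise ValueError(f"expected {TARGET_INDEX + 1} columns, got {len(values)}")
    x1 = values[:N_MODALITY_1]
    x2 = values[N_MODALITY_1:TARGET_INDEX]
    y = values[TARGET_INDEX]
    return x1, x2, y


def load_csv(path, has_header=True):
    """Read a CSV file and split every row; `has_header` is an assumption."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        if has_header:
            next(reader)  # skip the header line
        return [split_row(row) for row in reader]


# Usage with a synthetic row whose values equal their column indices.
x1, x2, y = split_row([str(i) for i in range(168)])
print(len(x1), len(x2), y)  # 81 86 167.0
```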

Model Components

File	Purpose	Key Functionalities
`PIDRegModel.py`	Core model definition	• Implements modality-specific information bottlenecks • Computes PID-based fusion weights for adaptive modality contribution • Incorporates Gaussian reconstruction and Cauchy–Schwarz divergence regularizations
`PIDRegTrainer.py`	Training orchestration	• Uses dual optimizers (one for model parameters, one for λ) • Integrates a scheduler for adaptive learning rate control • Performs PID stability detection and automatically freezes the fusion weights once convergence is reached
`CMICalculator.py`	Conditional mutual information module	• Estimates conditional mutual information (CMI) between modalities and targets • Guides training toward reducing information leakage and enhancing modality specificity
`csv_data_loader.py`	Data processing and preparation	• Automatically splits data into train / validation / test sets • Applies standard scaling to both features and target • Supports flexible dataset paths and batch construction
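
Because the loader standardizes the target as well as the features, predictions come out in standardized units and have to be mapped back before reporting errors in the original units. The sketch below illustrates that round trip with a hypothetical `StandardScaler1D` class written in plain Python. It is not the repository's implementation, and fitting on the training split only is an assumption based on common practice.

```python
import statistics


class StandardScaler1D:
    """Hypothetical scaler for one column: z = (v - mean) / std."""

    def fit(self, values):
        self.mean = statistics.fmean(values)
        # Fall back to 1.0 for a constant column to avoid division by zero.
        self.std = statistics.pstdev(values) or 1.0
        return self

    def transform(self, values):
        return [(v - self.mean) / self.std for v in values]

    def inverse_transform(self, values):
        return [v * self.std + self.mean for v in values]


# Usage: fit on training targets, then map scaled predictions back.
train_targets = [1.0, 2.0, 3.0]
scaler = StandardScaler1D().fit(train_targets)
scaled = scaler.transform(train_targets)
restored = scaler.inverse_transform(scaled)
```

After the round trip, `restored` matches `train_targets` up to floating-point error, and the middle value of `scaled` is zero because it equals the mean.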

Usage

Basic Training

python main.py --data_path ./data --n_epochs 200 --batch_size 256

Advanced Configuration

python main.py \
    --data_path ./data \
    --result_dir ./results \
    --batch_size 256 \
    --n_epochs 200 \
    --window_size 5 \
    --early_stopping 30 \
    --lambda_lr 0.1 \
    --hidden_dim 256 \
    --latent_dim 64
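
When several configurations need to be run, the command line above can be assembled programmatically. The following is a small sketch that uses only the flags shown in this README. The helper `build_command` is hypothetical and not part of the repository, and the flag values are simply the ones from the examples above.

```python
import shlex


def build_command(config, script="main.py"):
    """Turn a dict of flag -> value into an argument list for main.py."""
    args = ["python", script]
    for flag, value in config.items():
        args += [f"--{flag}", str(value)]
    return args


config = {
    "data_path": "./data",
    "n_epochs": 200,
    "batch_size": 256,
    "lambda_lr": 0.1,
}
cmd = build_command(config)
print(shlex.join(cmd))
# python main.py --data_path ./data --n_epochs 200 --batch_size 256 --lambda_lr 0.1
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root to launch a run.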

Citation

If you use PIDReg in your work, please cite the following paper 📃:

@article{ma2025explainable,
  title={Explainable Multimodal Regression via Information Decomposition},
  author={Ma, Zhaozhao and Yu, Shujian},
  journal={arXiv preprint arXiv:2512.22102},
  year={2025}
}

Contact

For any questions or feedback, please feel free to reach out to us via email: zhaozhaoma@zju.edu.cn

Built with ❤️, and ☀️🌙
