Feature/speed up resample by FBumann · Pull Request #455 · flixOpt/flixopt

FBumann · 2025-11-04T07:56:16Z

Description

Speed up resampling of a FlowSystem by roughly 20-40x by concatenating its variables into a single xr.DataArray before resampling.
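
For illustration only, the idea of stacking variables into one `xr.DataArray` before resampling can be sketched as below. The dataset, the variable names `demand` and `price`, and the `"2D"` frequency are made up for this example; the `xr.concat(..., dim=pd.Index(..., name='variable'))` and `to_dataset(dim='variable')` pattern follows the lines quoted in the review diff further down, not necessarily the merged code.

```python
import pandas as pd
import xarray as xr

# Hypothetical toy data: two variables that share only the time dimension.
times = pd.date_range("2025-01-01", periods=4, freq="D")
ds = xr.Dataset(
    {
        "demand": ("time", [1.0, 3.0, 5.0, 7.0]),
        "price": ("time", [10.0, 20.0, 30.0, 40.0]),
    },
    coords={"time": times},
)

names = list(ds.data_vars)

# Stack all variables along a new "variable" dimension so that a single
# resample call handles them together instead of one call per variable.
stacked = xr.concat([ds[name] for name in names], dim=pd.Index(names, name="variable"))
resampled = stacked.resample(time="2D").mean()

# Split the stacked array back into one data variable per original name.
result = resampled.to_dataset(dim="variable")
print(result)
```

The stacking only pays off when the variables have the same dimensions, which is why the review discussion below groups variables by dimension structure first.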

Type of Change

Bug fix
New feature
Documentation update
Code refactoring

Related Issues

Closes #(issue number)

Testing

I have tested my changes
Existing tests still pass

Checklist

My code follows the project style
I have updated documentation if needed
I have added tests for new functionality (if applicable)

Summary by CodeRabbit

Performance Improvements
- FlowSystem resampling is now approximately 20-40x faster, achieved by converting variables to a DataArray internally before resampling.

coderabbitai · 2025-11-04T07:56:34Z

Walkthrough

The changes optimize the FlowSystem resampling operation by introducing dimension-group-aware resampling. A new private helper method groups time-dimensioned variables by their non-time dimensions and resamples each group independently to avoid unnecessary NaN broadcasting. Validation of resampling methods and preservation of timestep attributes are added to the resample flow.
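
The grouping step described above can be sketched without xarray, using only a mapping from variable names to dimension tuples. The function name `group_by_non_time_dims` and the example variables are hypothetical; the actual helper in `flixopt/flow_system.py` may differ in its details.

```python
from collections import defaultdict


def group_by_non_time_dims(var_dims, time_dim="time"):
    """Group variable names by their dimensions other than `time_dim`.

    `var_dims` maps each variable name to its tuple of dimension names.
    Variables without a time dimension are skipped, since they are not resampled.
    """
    groups = defaultdict(list)
    for name, dims in var_dims.items():
        if time_dim not in dims:
            continue
        key = tuple(d for d in dims if d != time_dim)
        groups[key].append(name)
    return dict(groups)


# Hypothetical variables and their dimensions.
var_dims = {
    "demand": ("time",),
    "price": ("time",),
    "profile": ("time", "scenario"),
    "capacity": ("scenario",),
}
groups = group_by_non_time_dims(var_dims)
print(groups)
```

Each resulting group contains variables of identical shape, so they can be processed together without padding one variable out to the dimensions of another.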

Changes

| Cohort / File(s) | Summary |
|---|---|
| Resample optimization `flixopt/flow_system.py` | Adds private `_resample_by_dimension_groups()` helper method that groups and resamples time-dimensioned variables by dimension structure. Refactors `resample()` to validate method, split variables by time-dimensionality, apply grouped resampling, merge results, and preserve timestep attributes. Imports `defaultdict` from `collections`. |
| Documentation update `CHANGELOG.md` | Documents the ~20–40× speedup in FlowSystem resampling achieved through internal DataArray conversion. |

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant resample as resample()
    participant split as Split Variables
    participant group as _resample_by_dimension_groups()
    participant merge as Merge Results
    
    User->>resample: resample(time_dataset, method, ...)
    resample->>resample: Validate method
    resample->>split: Separate time-dimensioned vars<br/>from non-time-dimensioned vars
    split-->>resample: time_vars, other_vars
    resample->>group: Process time_vars by<br/>dimension groups
    group->>group: Group by non-time dimensions
    group->>group: Resample each group independently
    group-->>resample: resampled_time_dataset
    resample->>merge: Merge time_dataset + other_vars
    merge-->>resample: merged_dataset
    resample->>resample: Preserve/set timestep attributes
    resample-->>User: final_dataset

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

New private helper method introduces grouping logic with straightforward grouping and merging operations
Resample method refactoring is localized and follows a clear pattern: validate → split → process → merge
Changes are focused to a single file with one public method affected

Possibly related PRs

PR #442: Modifies FlowSystem.resample() to preserve and set hours_of_last_timestep and hours_of_previous_timesteps attributes, which overlaps with the attribute preservation logic introduced in this PR.

Poem

🐰 Hops through dimensions with graceful care,
Grouping time-vars without a spare,
No NaN broadcasts to slow us down—
Twenty-fold speedup, the fastest in town! 🚀

Pre-merge checks and finishing touches

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'Feature/speed up resample' accurately reflects the main change: a performance optimization to the resample method that speeds it up by 20-40x. |
| Description check | ✅ Passed | The PR description includes the required sections from the template (Description, Type of Change, Related Issues, Testing, Checklist) with relevant information provided. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b50d538 and 7bf1925.

📒 Files selected for processing (1)

CHANGELOG.md (1 hunks)

✅ Files skipped from review due to trivial changes (1)

CHANGELOG.md

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: test (3.12)
GitHub Check: test (3.10)
GitHub Check: test (3.13)
GitHub Check: test (3.11)

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 67e13a5 and aa04d29.

📒 Files selected for processing (1)

flixopt/flow_system.py (3 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

flixopt/flow_system.py (1)

flixopt/structure.py (1)

to_dataset (668-691)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: test (3.13)
GitHub Check: test (3.12)
GitHub Check: test (3.11)
GitHub Check: test (3.10)

flixopt/flow_system.py

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (1)

flixopt/flow_system.py (1)
1033-1042: Critical: Address the previously identified crashes from concat and empty merge.

The issues flagged in the previous review are still present:

Line 1035-1037: xr.concat with default combine_attrs='no_conflicts' will raise MergeError when variables have different attributes (e.g., units, long_name), which is typical for FlowSystem datasets.

Line 1042: xr.merge([]) will raise ValueError when resampled_groups is empty, which occurs when the FlowSystem has no time-dependent variables.

The previous review provided a fix that resolves both issues by resampling datasets directly and handling the empty case.

Apply this diff to fix both issues:
+        if not dim_groups:
+            return getattr(time_dataset.resample(time=time, **kwargs), method)()
+
         resampled_groups = []
         for var_names in dim_groups.values():
-            stacked = xr.concat(
-                [time_dataset[name] for name in var_names],
-                dim=pd.Index(var_names, name='variable'),
-            )
-            resampled = getattr(stacked.resample(time=time, **kwargs), method)()
-            resampled_groups.append(resampled.to_dataset(dim='variable'))
+            grouped_dataset = time_dataset[var_names]
+            resampled_group = getattr(grouped_dataset.resample(time=time, **kwargs), method)()
+            resampled_groups.append(resampled_group)
 
         return xr.merge(resampled_groups)
This keeps the performance optimization while avoiding attribute conflicts and empty merge errors.
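
As a rough, self-contained sketch of what the suggested diff amounts to, the helper below resamples one sub-dataset per dimension group and merges the results. The standalone function name, the toy dataset, and the `"2D"` frequency are assumptions made for this example; only the body mirrors the lines in the diff above.

```python
from collections import defaultdict

import numpy as np
import pandas as pd
import xarray as xr


def resample_by_dimension_groups(time_dataset, time, method, **kwargs):
    """Resample variables group by group, where each group shares the same dims."""
    dim_groups = defaultdict(list)
    for name, var in time_dataset.data_vars.items():
        # Every variable here has a time dimension, so grouping by the full
        # dims tuple is equivalent to grouping by the non-time dimensions.
        dim_groups[var.dims].append(name)

    if not dim_groups:
        # Nothing to group: resample the dataset directly, as in the diff.
        return getattr(time_dataset.resample(time=time, **kwargs), method)()

    resampled_groups = []
    for var_names in dim_groups.values():
        grouped_dataset = time_dataset[var_names]  # sub-Dataset, attrs stay per variable
        resampled_group = getattr(grouped_dataset.resample(time=time, **kwargs), method)()
        resampled_groups.append(resampled_group)

    return xr.merge(resampled_groups)


# Hypothetical toy data with differing attributes and dimension structures.
times = pd.date_range("2025-01-01", periods=4, freq="D")
ds = xr.Dataset(
    {
        "demand": ("time", np.array([1.0, 3.0, 5.0, 7.0]), {"units": "kW"}),
        "price": ("time", np.array([10.0, 20.0, 30.0, 40.0]), {"units": "EUR/kWh"}),
        "profile": (("time", "scenario"), np.arange(8.0).reshape(4, 2)),
    },
    coords={"time": times, "scenario": ["a", "b"]},
)
coarse = resample_by_dimension_groups(ds, "2D", "mean")
print(coarse)
```

Because each group is resampled as a `Dataset`, variables are never concatenated into one array, so their attributes do not need to be reconciled at that step.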

🧹 Nitpick comments (1)

flixopt/flow_system.py (1)

1077-1080: Consider moving method validation earlier for fail-fast behavior.

The method validation is correct and provides a clear error message. As a minor optimization, you could move this validation to the beginning of the method (right after line 1067) to fail before any data processing. However, the current placement is acceptable since it still fails before the expensive resampling operation.
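
A minimal sketch of fail-fast validation is shown below. The tuple `AVAILABLE_METHODS` and the function `validate_method` are hypothetical names for this example; the set of methods that flixopt actually accepts is not shown in this thread.

```python
# Assumed list of allowed aggregation names, for illustration only.
AVAILABLE_METHODS = ("mean", "sum", "max", "min", "first", "last", "median")


def validate_method(method):
    """Raise early, before any data is touched, if `method` is not supported."""
    if method not in AVAILABLE_METHODS:
        raise ValueError(
            f"Unsupported resampling method: {method!r}. "
            f"Available methods: {AVAILABLE_METHODS}"
        )
    return method


validate_method("mean")  # passes silently

try:
    validate_method("average")
except ValueError as err:
    print(err)
```

Calling such a check as the first statement of the public method means an invalid name is rejected before any splitting or grouping work is done.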

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aa04d29 and dfa6ef9.

📒 Files selected for processing (1)

flixopt/flow_system.py (3 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

flixopt/flow_system.py (1)

flixopt/structure.py (1)

to_dataset (668-691)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: test (3.12)
GitHub Check: test (3.10)
GitHub Check: test (3.13)
GitHub Check: test (3.11)

🔇 Additional comments (2)

flixopt/flow_system.py (2)

9-9: LGTM!

The defaultdict import is correctly placed and used for grouping variables by dimensions in the new resampling logic.

1071-1090: Clean separation of concerns.

The refactored resampling logic correctly:

Separates time-dependent from non-time variables

Validates the resampling method before processing

Resamples only time-dependent variables using the grouped helper

Preserves non-time variables unchanged

Merges results appropriately

This approach is sound and will work correctly once the critical issues in _resample_by_dimension_groups are resolved.
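
The validate, split, process, merge flow listed above can be sketched roughly as follows. This is a simplified stand-in, not the flixopt implementation: the function name `resample_dataset`, the allowed-method tuple, and the toy dataset are assumptions, and the time-dependent part is resampled directly rather than through the grouped helper.

```python
import pandas as pd
import xarray as xr


def resample_dataset(ds, time, method="mean"):
    # 1. Validate the method before doing any work (assumed list of methods).
    if method not in ("mean", "sum", "max", "min"):
        raise ValueError(f"Unsupported resampling method: {method!r}")

    # 2. Split variables into time-dependent and time-independent ones.
    time_names = [n for n, v in ds.data_vars.items() if "time" in v.dims]
    other_names = [n for n in ds.data_vars if n not in time_names]
    if not time_names:
        return ds  # nothing to resample

    time_ds = ds[time_names]
    # Defensive: make sure the original time coordinate does not travel along
    # with the untouched variables and get re-merged later.
    other_ds = ds[other_names].drop_vars("time", errors="ignore")

    # 3. Resample only the time-dependent part.
    resampled = getattr(time_ds.resample(time=time), method)()

    # 4. Merge the resampled part with the unchanged variables.
    return xr.merge([resampled, other_ds])


# Hypothetical toy data: one time series and one time-independent variable.
times = pd.date_range("2025-01-01", periods=4, freq="D")
ds = xr.Dataset(
    {
        "demand": ("time", [1.0, 3.0, 5.0, 7.0]),
        "capacity": ("scenario", [100.0, 200.0]),
    },
    coords={"time": times, "scenario": ["a", "b"]},
)
out = resample_dataset(ds, "2D", "mean")
print(out)
```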

FBumann added 3 commits November 4, 2025 08:15

Resample a single concatenated dataarray instead of a Dataset

a504914

Performance improvements

92633a5

Use helper method for resampling speed up resampling

aa04d29

FBumann added 2 commits November 4, 2025 08:58

Improve docstring

dbd1405

Improve docstring

dfa6ef9

coderabbitai bot reviewed Nov 4, 2025

coderabbitai bot reviewed Nov 4, 2025

FBumann added 3 commits November 4, 2025 10:10

avoiding attribute conflicts and empty merge errors

75cbb64

moving method validation earlier for fail-fast behavior

b50d538

Update CHANGELOG.md

7bf1925

FBumann merged commit 87bd422 into main Nov 4, 2025
12 checks passed

coderabbitai bot mentioned this pull request Nov 4, 2025

Feature/speed up resample again #458

Merged

9 tasks

FBumann deleted the feature/speed-up-resample branch November 19, 2025 13:40

Feature/speed up resample #455
FBumann merged 8 commits into main from feature/speed-up-resample