Rework orphan management to freeze orphans instead of removing them iteratively#269
Conversation
|
Caution Review failedPull request was closed or merged during review WalkthroughIntroduces a new Changes
Sequence Diagram(s)mermaid Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs). Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/source/db_model_comparison.rst`:
- Around line 70-71: The "Model-Only Elements" / "Derived Sets" table is out of
sync: add a new entry for commodity_sink and update the commodity_carrier
description to match the earlier lines by defining commodity_sink as commodities
that exit the process network (C_d ∪ C_w) and commodity_carrier as physical
energy carriers and sinks (C_p ∪ C_k); locate the table entry for
commodity_carrier and replace its current "physical + demand + waste" text with
"physical and sink commodities (C_p ∪ C_k)" and insert a new row for
commodity_sink with "commodities that exit the process network (C_d ∪ C_w)" so
both definitions are consistent.
In `@temoa/components/technology.py`:
- Around line 347-357: Move the unmet-demand validation so it runs before the
carrier symmetry check: currently the symmetric_difference logic that computes
symdiff from model.commodity_carrier fires first and raises a generic
carrier-mismatch error (f_msg via logger.error and ValueError), which hides
missing producing processes because commodity_carrier now includes
commodity_demand; instead, run the explicit unmet-demand branch (the check that
inspects commodity_demand and missing demand outputs) prior to computing
symdiff/model.commodity_carrier so missing producers are reported directly.
Locate the symdiff block (uses symdiff, model.commodity_carrier, symdiff_str,
f_msg) and the unmet-demand check below it, and swap their order so the
unmet-demand validation executes first.
In `@temoa/model_checking/commodity_network_manager.py`:
- Around line 63-74: The loop currently only logs cn.unsupported_demands() but
does not record them in manager state, so analyze_network() still treats the
region-period as clean; modify the block to also add these unsupported demand
identifiers into a manager collection (e.g., self.demand_orphans or a new
self.unsupported_demands mapping keyed by (region, period)) by updating the set
with cn.unsupported_demands() (use set.update semantics similar to how
get_demand_side_orphans() and get_other_orphans() are added) so
analyze_network() will detect and fail regions with unsupported demands; update
any related initializations to ensure the mapping exists before update.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: caf59ff3-d22b-4b9e-9570-6068d835ce76
📒 Files selected for processing (13)
docs/source/db_model_comparison.rstdocs/source/mathematical_formulation.rsttemoa/components/commodities.pytemoa/components/flows.pytemoa/components/technology.pytemoa/core/model.pytemoa/data_io/component_manifest.pytemoa/model_checking/commodity_network_manager.pytests/testing_data/mediumville_sets.jsontests/testing_data/storageville.sqltests/testing_data/test_system.sqltests/testing_data/test_system_sets.jsontests/testing_data/utopia_sets.json
| # Log any demands that are not fully supported | ||
| for commodity in cn.unsupported_demands(): | ||
| logger.warning( | ||
| 'Demand %s is not supported back to a source in region %s, period %d', | ||
| commodity, | ||
| region, | ||
| pass_num - 1, | ||
| period, | ||
| ) | ||
| break # Exit the loop if the network is stable | ||
|
|
||
| logger.debug( | ||
| 'Pass %d for region %s: Found and removed %d orphan(s).', | ||
| pass_num, | ||
| region, | ||
| len(orphans_this_pass), | ||
| ) | ||
| for orphan in sorted(orphans_this_pass): | ||
| logger.warning('Removing orphan across all periods: %s', orphan) | ||
|
|
||
| # Remove all orphans found in this pass from all periods in the region | ||
| for period in self.periods: | ||
| data.available_techs[region, period] -= orphans_this_pass | ||
| else: | ||
| logger.error('Region %s analysis did not converge after 100 passes.', region) | ||
|
|
||
| # Add to the main collections, ensuring no duplicates | ||
| self.demand_orphans[region, period].update(cn.get_demand_side_orphans()) | ||
| self.other_orphans[region, period].update(cn.get_other_orphans()) |
There was a problem hiding this comment.
Record unsupported demands in manager state instead of only logging them.
Right now an unsupported demand with no orphan edge tuples only emits a warning here. analyze_network() still returns True at Lines 94-95 because it only inspects self.demand_orphans and self.other_orphans, so a clearly invalid region-period can be reported as clean.
Suggested fix
class CommodityNetworkManager:
def __init__(self, periods: Iterable[str | int], network_data: NetworkModelData) -> None:
self.analyzed: bool = False
self.periods: list[Period] = sorted([Period(int(p)) for p in periods])
self.orig_data: NetworkModelData = network_data
self.filtered_data: NetworkModelData | None = None
self.regions: set[Region] | None = None
@@
self.orig_tech = {k: v.copy() for k, v in network_data.available_techs.items()}
# Final collections of all orphans found, organized by (region, period)
self.demand_orphans: dict[RegionPeriodKey, set[EdgeTuple]] = defaultdict(set)
self.other_orphans: dict[RegionPeriodKey, set[EdgeTuple]] = defaultdict(set)
+ self.unsupported_demands_by_period: dict[RegionPeriodKey, set[str]] = defaultdict(set)
@@
for period in self.periods:
cn = CommodityNetwork(region=region, period=period, model_data=data)
cn.analyze_network()
# Log any demands that are not fully supported
- for commodity in cn.unsupported_demands():
+ unsupported = set(cn.unsupported_demands())
+ self.unsupported_demands_by_period[region, period].update(unsupported)
+ for commodity in unsupported:
logger.warning(
'Demand %s is not supported back to a source in region %s, period %d',
commodity,
region,
period,
)
@@
- orphans_found = any(self.demand_orphans.values()) or any(self.other_orphans.values())
- return not orphans_found
+ issues_found = (
+ any(self.demand_orphans.values())
+ or any(self.other_orphans.values())
+ or any(self.unsupported_demands_by_period.values())
+ )
+ return not issues_found📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # Log any demands that are not fully supported | |
| for commodity in cn.unsupported_demands(): | |
| logger.warning( | |
| 'Demand %s is not supported back to a source in region %s, period %d', | |
| commodity, | |
| region, | |
| pass_num - 1, | |
| period, | |
| ) | |
| break # Exit the loop if the network is stable | |
| logger.debug( | |
| 'Pass %d for region %s: Found and removed %d orphan(s).', | |
| pass_num, | |
| region, | |
| len(orphans_this_pass), | |
| ) | |
| for orphan in sorted(orphans_this_pass): | |
| logger.warning('Removing orphan across all periods: %s', orphan) | |
| # Remove all orphans found in this pass from all periods in the region | |
| for period in self.periods: | |
| data.available_techs[region, period] -= orphans_this_pass | |
| else: | |
| logger.error('Region %s analysis did not converge after 100 passes.', region) | |
| # Add to the main collections, ensuring no duplicates | |
| self.demand_orphans[region, period].update(cn.get_demand_side_orphans()) | |
| self.other_orphans[region, period].update(cn.get_other_orphans()) | |
| # Log any demands that are not fully supported | |
| unsupported = set(cn.unsupported_demands()) | |
| self.unsupported_demands_by_period[region, period].update(unsupported) | |
| for commodity in unsupported: | |
| logger.warning( | |
| 'Demand %s is not supported back to a source in region %s, period %d', | |
| commodity, | |
| region, | |
| period, | |
| ) | |
| # Add to the main collections, ensuring no duplicates | |
| self.demand_orphans[region, period].update(cn.get_demand_side_orphans()) | |
| self.other_orphans[region, period].update(cn.get_other_orphans()) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@temoa/model_checking/commodity_network_manager.py` around lines 63 - 74, The
loop currently only logs cn.unsupported_demands() but does not record them in
manager state, so analyze_network() still treats the region-period as clean;
modify the block to also add these unsupported demand identifiers into a manager
collection (e.g., self.demand_orphans or a new self.unsupported_demands mapping
keyed by (region, period)) by updating the set with cn.unsupported_demands()
(use set.update semantics similar to how get_demand_side_orphans() and
get_other_orphans() are added) so analyze_network() will detect and fail regions
with unsupported demands; update any related initializations to ensure the
mapping exists before update.
There was a problem hiding this comment.
That's okay. The purpose of that warning was to make sure orphans weren't going through unnoticed. An unmet demand causes infeasibility and cannot be missed the same way.
There was a problem hiding this comment.
@idelder, that's a solid distinction — unsupported demands will always surface as infeasibility during solve, making additional manager-level tracking redundant. Orphans, on the other hand, can silently shrink the model without causing infeasibility, which is precisely why they warrant explicit tracking and warnings. The current approach makes sense.
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/source/db_model_comparison.rst`:
- Around line 70-71: The entries for "commodity_sink" and "commodity_carrier"
use subscript notation (\text{C}_d, \text{C}_w, \text{C}_p, \text{C}_k) which is
inconsistent with the document's superscript form; update those to use
superscripts (\text{C}^d, \text{C}^w, \text{C}^p, \text{C}^k) so the strings for
commodity_sink and commodity_carrier match surrounding notation (and the earlier
uses of {}^*\text{C}^k / {}^*\text{C}^c).
In `@temoa/components/technology.py`:
- Around line 372-378: The comparison currently builds techs = {t for r, i, t,
v, o in model.efficiency.sparse_iterkeys()} and computes symdiff =
techs.symmetric_difference(model.tech_production); change that to compare
against model.tech_all instead (i.e., use model.tech_all in the
symmetric_difference) so efficiency-indexed technologies that are valid but not
in the production subset are not flagged; update any related message/context
that references technologies if needed (msg and the surrounding validation
block) to reflect that we're validating against tech_all.
- Around line 360-365: The error message built when symdiff is non-empty
currently says "Unused or unspecified physical carriers" but the check is
against model.commodity_carrier (which can include sink commodities and other
sets like commodity_demand/commodity_waste), so update the message string
constructed where symdiff, c_carrier and model.commodity_carrier are used to
refer generically to "carriers" (or to include the specific set name, e.g.,
commodity_carrier) instead of "physical carriers"; locate the message
construction around symdiff and c_carrier and change the wording to something
like "Unused or unspecified carriers. Either add or remove the following
elements to the Set commodity_carrier" so users are pointed to the correct
concept.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 109fa95d-9e02-44c5-a669-1d26b6116970
📒 Files selected for processing (2)
docs/source/db_model_comparison.rsttemoa/components/technology.py
| techs = {t for r, i, t, v, o in model.efficiency.sparse_iterkeys()} | ||
|
|
||
| symdiff = techs.symmetric_difference(model.tech_all) | ||
| symdiff = techs.symmetric_difference(model.tech_production) | ||
| if symdiff: | ||
| msg = ( | ||
| 'Unused or unspecified technologies. Either add or remove ' | ||
| 'the following technology(ies) to the tech_resource or ' | ||
| 'tech_production Sets.\n\n Technology(ies): {}' | ||
| 'the following technologies.\n\n Technologies: {}' |
There was a problem hiding this comment.
Compare efficiency technologies against tech_all, not tech_production.
Line 374 now rejects any technology that appears in efficiency but is not in model.tech_production. That is too narrow: efficiency is indexed by tech_all, and the model context explicitly allows specialized technologies to be added to tech_all outside the p% production subset. Valid storage/exchange-style technologies will be reported as “unused or unspecified” here.
Suggested fix
- symdiff = techs.symmetric_difference(model.tech_production)
+ symdiff = techs.symmetric_difference(model.tech_all)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| techs = {t for r, i, t, v, o in model.efficiency.sparse_iterkeys()} | |
| symdiff = techs.symmetric_difference(model.tech_all) | |
| symdiff = techs.symmetric_difference(model.tech_production) | |
| if symdiff: | |
| msg = ( | |
| 'Unused or unspecified technologies. Either add or remove ' | |
| 'the following technology(ies) to the tech_resource or ' | |
| 'tech_production Sets.\n\n Technology(ies): {}' | |
| 'the following technologies.\n\n Technologies: {}' | |
| techs = {t for r, i, t, v, o in model.efficiency.sparse_iterkeys()} | |
| symdiff = techs.symmetric_difference(model.tech_all) | |
| if symdiff: | |
| msg = ( | |
| 'Unused or unspecified technologies. Either add or remove ' | |
| 'the following technologies.\n\n Technologies: {}' |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@temoa/components/technology.py` around lines 372 - 378, The comparison
currently builds techs = {t for r, i, t, v, o in
model.efficiency.sparse_iterkeys()} and computes symdiff =
techs.symmetric_difference(model.tech_production); change that to compare
against model.tech_all instead (i.e., use model.tech_all in the
symmetric_difference) so efficiency-indexed technologies that are valid but not
in the production subset are not flagged; update any related message/context
that references technologies if needed (msg and the surrounding validation
block) to reflect that we're validating against tech_all.
There was a problem hiding this comment.
Incorrect. With the removal of the defunct technology_resource set, all technologies are now a subset of tech_production
There was a problem hiding this comment.
@idelder, thank you for the clarification — you're right, and my suggestion was incorrect. I wasn't aware that tech_resource had been removed, making tech_production the correct and complete superset for all technologies in the model. The comparison against model.tech_production is appropriate.
✏️ Learnings added
Learnt from: idelder
Repo: TemoaProject/temoa PR: 269
File: temoa/components/technology.py:372-378
Timestamp: 2026-03-09T13:24:27.000Z
Learning: In the TemoaProject/temoa codebase, the defunct `tech_resource` set has been removed. All technologies are now subsets of `tech_production`, making `tech_production` the correct set to validate against in `check_efficiency_indices` (temoa/components/technology.py). Do not suggest reverting efficiency technology validation back to `tech_all`; `tech_production` is the right reference set.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: idelder
Repo: TemoaProject/temoa PR: 180
File: temoa/components/limits.py:632-647
Timestamp: 2025-10-30T14:37:44.260Z
Learning: In the `limit_tech_input_split_average_constraint` function in `temoa/components/limits.py`, there was a bug where the `total_inp` summation incorrectly used the fixed variable `i` instead of the loop variable `S_i` when indexing `processOutputsByInput` and calling `get_variable_efficiency`. This pattern should be watched for in similar aggregation constraints: when summing over a loop variable like `S_i`, all references within that loop should use `S_i`, not a fixed outer variable.
bfb59d6 to
a1af15a
Compare
Currently, Temoa detects orphan commodities/processes (inputs that nothing produces or outputs that nothing consumes, source/demand/waste commodities excluded). It then removes these orphans and repeats until all orphan branches of the model are removed. This can lead to issues, especially in myopic, where large sections of the process network are removed and infeasibilities or other errors result from interacting model features.
This change instead "freezes" those orphans. The commodity balance constraint is constructed for the orphan commodity, which has a zero on one side of the equality, freezing connected flows at zero.
This has two key benefits:
One key downside is that, as these branches are frozen, it should in many cases be computationally equivalent to remove them, thereby saving all those variables and constraints from the LP. We lose here that computational efficiency for the sake of robustness and interpretability.
A resulting implication is that, even with network checking turned off, sources must now be properly tagged or they will be frozen.
test_systemandstoragevilleneeded fixes for this reason (tests are not network checked so this did not emerge earlier).Finally, this change also adds the
commodity_sinkset, a union ofcommodity_demandandcommodity_wasteto help clarify the common behaviour of those sets.Summary by CodeRabbit
New Features
Improvements
Tests