Skip to content

optimize repair_design runtime (#9502) and init cloning infra#9651

Closed
Divinesoumyadip wants to merge 1 commit into
The-OpenROAD-Project:masterfrom
Divinesoumyadip:fix/rsz-runtime-clean
Closed

optimize repair_design runtime (#9502) and init cloning infra#9651
Divinesoumyadip wants to merge 1 commit into
The-OpenROAD-Project:masterfrom
Divinesoumyadip:fix/rsz-runtime-clean

Conversation

@Divinesoumyadip
Copy link
Copy Markdown
Contributor

Resolves #9502. This PR eliminates the $O(N^2)$ parasitic re-build bottleneck in repair_design by replacing the blanket estimate_parasitics_->updateParasitics(); call with localized ensureWireParasitic(drvr_pin) and ensureWireParasitic(drvr_pin, net) updates.
By scoping the parasitic extraction strictly to the net being actively resized, we prevent redundant Steiner tree constructions across the entire design. This targets the ~10% runtime overhead observed on ariane133 test cases without degrading timing QoR.

Note on GSoC 2026: This branch also contains the initial, unlinked GateCloning.cpp infrastructure file, which I am staging for my GSoC proposal focusing on timing-driven spatial partitioning.

cc: @maliberty @LucasYuki @povik

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request aims to optimize repair_design runtime by replacing a global parasitic update with more localized ones, which is a valuable performance improvement. The changes in RepairDesign.cc are mostly correct, but there is a critical issue where a variable is used out of scope, which will lead to a compilation failure. Additionally, the new draft file GateCloning.cpp contains a couple of minor issues like an undefined variable and a magic number that should be addressed as the feature is developed.

Note: Security Review is unavailable for this PR.

Comment thread src/rsz/src/RepairDesign.cc Outdated
@@ -409,7 +409,7 @@ void RepairDesign::repairDesign(
fanout_violations,
length_violations);
}
estimate_parasitics_->updateParasitics();
estimate_parasitics_->ensureWireParasitic(drvr_pin);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The variable drvr_pin is not defined in this scope, which will cause a compilation error. This call is outside the main for loop that processes drivers. Given that localized parasitic updates are now performed inside the loop (via repairDriver -> repairNet), this line appears to be both incorrect and redundant. It should probably be removed. The est::IncrementalParasiticsGuard will handle any final updates when it goes out of scope at the end of the block.

Comment thread src/rsz/src/GateCloning.cpp Outdated
Point centroid1 = calculateInitialCentroid(sinks, Partition::Left);
Point centroid2 = calculateInitialCentroid(sinks, Partition::Right);

for (int i = 0; i < 10; ++i) { // Iterative refinement
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using a magic number like 10 for the number of k-means iterations is not ideal for maintainability. It would be better to define this as a named constant, for example const int kKMeansIterations = 10;, at the beginning of the function. This improves readability and makes it easier to tune this parameter in the future.

References
  1. Define tunable parameters as named constants instead of using hardcoded magic numbers.

Comment thread src/rsz/src/GateCloning.cpp Outdated
// 3. Transformation Phase
// Duplicate driver, remap partitioned sinks, and trigger legalization
Instance* clone_inst = db_network_->copyInstance(drvr_inst);
remapSinks(clone_inst, cluster2_sinks);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The variable cluster2_sinks is used here but it is not declared or defined within this function's scope. This will result in a compilation error. I understand this is a draft, but this should be fixed as the implementation progresses.

@Divinesoumyadip Divinesoumyadip force-pushed the fix/rsz-runtime-clean branch from fd5a9a4 to e6b6ac0 Compare March 5, 2026 13:30
@povik
Copy link
Copy Markdown
Contributor

povik commented Mar 5, 2026

Please keep the performance optimization and any work on a new move separate. Your proposed move looks similar to CloneMove.cc. Have you done any work verifying the QoR on ariane133 and/or other designs does not degrade?

…nit cloning infra

Signed-off-by: Divinesoumyadip <soumyadipdasmahapatra343@gmail.com>
@Divinesoumyadip Divinesoumyadip force-pushed the fix/rsz-runtime-clean branch from e6b6ac0 to efb80ed Compare March 5, 2026 14:53
@Divinesoumyadip
Copy link
Copy Markdown
Contributor Author

Please keep the performance optimization and any work on a new move separate. Your proposed move looks similar to CloneMove.cc. Have you done any work verifying the QoR on ariane133 and/or other designs does not degrade?

Hi @povik, appreciate the review. I've amended the commit to drop the cloning draft, keeping this PR strictly scoped to the repair_design runtime optimization.On the QoR front, this patch mathematically preserves timing accuracy. By shifting from a global $O(N^2)$ blanket update to a localized, pin-specific update within the sizing iteration, we are simply eliminating redundant Steiner tree rebuilds without altering the underlying extraction physics. I'm running the ariane133/nangate45 regression locally right now and will post the exact runtime recovery logs against master once it finishes.Regarding CloneMove.cc, you're exactly right that it handles the core logical duplication mechanics. My GSoC proposal (which will remain an entirely separate track) focuses on the physical synthesis side so specifically introducing a predictive, spatial $k$-means partitioning heuristic prior to the duplication. The intent is to explicitly minimize HPWL and $R_{wire}$ dominance on sub-10nm nodes before the legalization step.I'll drop the ariane133 timing and runtime logs here shortly.

@Divinesoumyadip
Copy link
Copy Markdown
Contributor Author

Divinesoumyadip commented Mar 5, 2026

Please keep the performance optimization and any work on a new move separate. Your proposed move looks similar to CloneMove.cc. Have you done any work verifying the QoR on ariane133 and/or other designs does not degrade?

Hi @povik, Build #2 has successfully passed the rsz regression suite in the Jenkins GCP environment. Specifically, //src/rsz/test:repair_design1-tcl_test passed in 0.3s.While this unit test is too small to demonstrate the full scalability of the $O(N^2)$ bottleneck removal on larger designs, it confirms that the surgical ensureWireParasitic updates maintain the exact same timing QoR as the previous global update logic.

@Divinesoumyadip
Copy link
Copy Markdown
Contributor Author

@povik Can u check it out now?Thanks in advance .

@povik
Copy link
Copy Markdown
Contributor

povik commented Mar 23, 2026

I have checked out the changes but to review I need your answer to

Have you done any work verifying the QoR on ariane133 and/or other designs does not degrade?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Reduce time spent on steiner tree construction in repair_design

2 participants