JIT: Add TSPLIB dump capabilities to block layout #111005

Open
amanasifkhalid wants to merge 4 commits into dotnet:main from amanasifkhalid:tsplib-dump
@amanasifkhalid
Contributor

@amanasifkhalid amanasifkhalid commented Dec 30, 2024

Adds functionality for dumping the flowgraph to files that can be consumed by TSPLIB-based optimizers. This enables us to compare the JIT's 3-opt implementation to state-of-the-art optimizers.

TSPLIB consumes a parameter file specifying various options for the solver to leverage, as well as a problem file describing the flowgraph. The JIT communicates the flowgraph to TSPLIB via a full matrix of "distances" between blocks, which is computed with `Compiler::ThreeOptLayout::GetCost()`. Unfortunately, TSPLIB only accepts integral values for distances, so the layout costs are truncated. This imprecision sometimes leads the external optimizer to different optima that aren't truly better than the JIT's answer.
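For readers unfamiliar with the format, here is a minimal sketch of what a full-matrix problem file looks like when written out. It is not the code in this PR: `WriteTsplibProblem` and the `costs` matrix are hypothetical stand-ins for the JIT's block order and `GetCost()`, and the `ATSP` problem type is an assumption based on the cost of placing one block after another being directional.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper, not the JIT's implementation.
// costs[i][j] stands in for GetCost(block i, block j).
void WriteTsplibProblem(std::ostream&                           out,
                        const std::string&                      name,
                        const std::vector<std::vector<double>>& costs)
{
    const std::size_t n = costs.size();

    // Specification part of a TSPLIB problem file.
    out << "NAME: " << name << "\n";
    out << "TYPE: ATSP\n"; // assumption: cost(i, j) need not equal cost(j, i)
    out << "DIMENSION: " << n << "\n";
    out << "EDGE_WEIGHT_TYPE: EXPLICIT\n";
    out << "EDGE_WEIGHT_FORMAT: FULL_MATRIX\n";
    out << "EDGE_WEIGHT_SECTION\n";

    // Data part: one row per block, one integral distance per column.
    for (std::size_t i = 0; i < n; i++)
    {
        assert(costs[i].size() == n); // the matrix must be square
        for (std::size_t j = 0; j < n; j++)
        {
            const double cost = (i == j) ? 0.0 : costs[i][j];
            // Truncation happens here: fractional costs lose precision.
            out << static_cast<unsigned long long>(cost) << ' ';
        }
        out << '\n';
    }
    out << "EOF\n";
}
```

Row `i` of the matrix is block `i + 1` in TSPLIB's 1-indexed numbering. Note how a cost such as 0.4 is written as 0, which is the imprecision described above.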

TSPLIB gives each block a 1-indexed ordinal based on the order in which the blocks' distances are reported in the problem file. From the JIT side, I decided to use the lexical order of the optimized layout for this. This way, if we tell the optimizer to report its best traversal (via the `JitOutputTSPTourFile` switch), comparing it to the JIT's layout is simple: linearly increasing traversals (1 2 3, etc.) show agreement between the two.
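As an illustration of that comparison, a minimal sketch: because ordinals follow the JIT's layout order, agreement reduces to checking that the reported tour is 1, 2, ..., n. `TourMatchesJitLayout` is a hypothetical name, and the sketch assumes the tour file has already been parsed into a vector and that the optimizer reports the tour starting from ordinal 1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: tour holds the 1-indexed block ordinals in the
// order the external optimizer visits them.
bool TourMatchesJitLayout(const std::vector<int>& tour)
{
    assert(!tour.empty()); // a tour visits at least one block

    for (std::size_t i = 0; i < tour.size(); i++)
    {
        // Position i should hold ordinal i + 1 if the optimizer kept
        // the JIT's layout order.
        if (tour[i] != static_cast<int>(i) + 1)
        {
            return false;
        }
    }
    return true;
}
```

A tour of `1 2 3 4` agrees with the JIT, while `1 3 2 4` means the optimizer swapped the second and third blocks.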

This functionality is best used with SPMI: Just set `JitDumpFlowGraphToTSPFile=1` when compiling a method, and then pass the resultant parameter file `<spmi index>.par` to the external optimizer.

I've used this functionality to analyze 3-opt's efficacy on benchmarks.run_pgo; I'll report my findings in a separate issue. I'll wait for #110922 to go in first before opening this up for review.

@ghost added the area-CodeGen-coreclr label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) Dec 30, 2024
@dotnet-policy-service
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@amanasifkhalid amanasifkhalid marked this pull request as ready for review January 6, 2025 19:39
return;
}

if (numCandidateBlocks > 150)
Contributor Author

This proved to be a reasonable upper bound when running benchmarks.run_pgo: very few methods had more than 150 blocks, and for those that did, the jump in block count was dramatic. I suppose we can adjust this locally as needed for other analyses.

@amanasifkhalid
Contributor Author

cc @dotnet/jit-contrib, @AndyAyersMS PTAL. Thanks!

Member

@AndyAyersMS AndyAyersMS left a comment

We should think about what we want to do with this long term. How hard would it be to set up a weekend run that actually compares the results vs TSPLIB?

Or if that's not feasible and you have manual scripting to orchestrate the comparison, think about adding it to jitutils or somewhere discoverable.

{
BasicBlock* const next = blockOrder[j];
const weight_t cost = (block == next) ? BB_ZERO_WEIGHT : GetCost(block, next);
fprintf(problemFile, "%llu ", (unsigned long long)cost);
Member

Given the integer restriction, seems like we ought to scale up cost by 1000 or something?

Contributor Author

That seems to be the consensus for handling this restriction. I'll see how it changes my results once I rerun the analysis.
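To make the effect of scaling concrete, here is a small sketch, not the code in this PR. `ToTsplibDistance` is a hypothetical helper, and the factor of 1000 is just the value floated in the review comment, not a settled choice.

```cpp
#include <cassert>

// Hypothetical helper: converts a floating-point layout cost into the
// integral distance TSPLIB requires, scaling first so that fractional
// differences survive truncation.
unsigned long long ToTsplibDistance(double cost, double scale)
{
    // Negative values cannot be converted to an unsigned type safely.
    assert(cost >= 0.0 && scale >= 1.0);
    return static_cast<unsigned long long>(cost * scale);
}
```

With a scale of 1, costs of 0.25 and 0.75 both become 0, so the external optimizer cannot tell them apart. With a scale of 1000 they become 250 and 750, and the ordering is preserved. Since every distance is multiplied by the same factor, the relative cost of any two tours is unchanged apart from the remaining truncation error.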

@amanasifkhalid
Contributor Author

> We should think about what we want to do with this long term. How hard would it be to set up a weekend run that actually compares the results vs TSPLIB?

That sounds feasible, though I imagine we can't run an arbitrary executable (the TSP optimizer, in this case) in CI. I'll probably go the route of cleaning up my scripts, and putting them in jitutils. I'm hoping to get profiles fed into block layout consistent soon, and rerun my analysis before checking this in.

Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI