Skip to content

Experiment/paralllel unit tests#14104

Closed
nojaf wants to merge 4 commits intodotnet:mainfrom
nojaf:experiment/paralllel-unit-tests
Closed

Experiment/paralllel unit tests#14104
nojaf wants to merge 4 commits intodotnet:mainfrom
nojaf:experiment/paralllel-unit-tests

Conversation

@nojaf
Copy link
Copy Markdown
Contributor

@nojaf nojaf commented Oct 13, 2022

Prototype to type check files that are unrelated in parallel.

)

return
(fun tcState ->
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this how TcState is merged? Seems surprisingly easy

|> fun results ->
((sequentialTcState, sequentialPriorErrors, ImmutableQueue.Empty), results)
||> Array.fold (fun (tcState, priorErrors, partialResults) result ->
let partialResult, logger, nextTcState = result tcState
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@safesparrow here, I fold the callback over here and everything plays out as it would in sequence.

@T-Gro
Copy link
Copy Markdown
Member

T-Gro commented Oct 17, 2022

I really like this idea @nojaf .

I spent some time thinking about doable and opt-in way into this.
I know there is a longer thread about heuristics of deriving this automatically, but I have my doubts about the efforts (both human of computational) of doing this really correctly.

What would you think about checking a module-level attribute at the very first definition of the file. That is only a single check of the parsed tree, without any deep traversal.
I know that convention-based features are not good for the "Simple F#" theme, but would be a neat way of opting into this experiment.

[<Microsoft.FSharp.Core.ParalellCompilation(nameOfGroup:string)>] module MyUnitTests

Files that are within a linear chain of files and at the same time have the same nameOfGroup would be type-checked in parallel.

@safesparrow
Copy link
Copy Markdown
Contributor

safesparrow commented Oct 17, 2022

I'm in process of developing the heuristic with @nojaf's help and insights.

It already kinda works and is very quick.

Needs some polishing before I can share it.
Should be able to share it later this week, hopefully.

The way it works is for each file it checks what top-level modules/namespaces are reachable in that file based on module and type references in that file, then marks all files whose top-level modules/namespaces are in the set.

It obviously gives false positives, but that's expected and fine as long as it gives enough true positives. Atm it runs in 230ms for .fs files in FSharp.Compiler.Service project. For comparison parsing takes ~4s and compilation as a whole something around/under a minute.

@vzarytovskii
Copy link
Copy Markdown
Member

I'm in process of developing the heuristic with @nojaf's help and insights.

It already kinda works and is very quick.

Needs some polishing before I can share it.
Should be able to share it later this week, hopefully.

Sweet! Would you folks be open to meeting and chatting about it?

@dsyme
Copy link
Copy Markdown
Contributor

dsyme commented Oct 17, 2022

I don't really know how we'd know if things are independent but I'm looking forward to seeing the proposal! Make sure it's sound please :)

@vzarytovskii
Copy link
Copy Markdown
Member

I don't really know how we'd know if things are independent but I'm looking forward to seeing the proposal! Make sure it's sound please :)

We can set up a meeting and chat about it when it's ready.

I'm sure @auduchinok @baronfel @abelbraaksma @gusty and the team would love to talk about it!

We can have an open meetup even, for anyone interested, if folks won't mind.

@abelbraaksma
Copy link
Copy Markdown
Contributor

@vzarytovskii sure, would love that! Btw, talking of parallelism (which really has three l’s, not four, @nojaf 😆), I was hoping to chat with @dsyme, if you have time, on TaskSeq, not really parallelism, but related and which I’m turning into a full blown library. I’m having a gnarly bug with the resumable code, and perhaps a second pair of eyes would help (tldr, multiple iterations fail hard). Or maybe I can pair with you, @vzarytovskii, if you have time?

Sorry for the brief hijack. An open meeting sounds excellent!

@T-Gro
Copy link
Copy Markdown
Member

T-Gro commented Oct 17, 2022

I'm in process of developing the heuristic with @nojaf's help and insights.

It already kinda works and is very quick.

Needs some polishing before I can share it. Should be able to share it later this week, hopefully.

The way it works is for each file it checks what top-level modules/namespaces are reachable in that file based on module and type references in that file, then marks all files whose top-level modules/namespaces are in the set.

It obviously gives false positives, but that's expected and fine as long as it gives enough true positives. Atm it runs in 230ms for .fs files in FSharp.Compiler.Service project. For comparison parsing takes ~4s and compilation as a whole something around/under a minute.

That is really good.
If you need a test case, I would expect all/most of FSharp.Compiler.ComponentTests to be independent.
Even if this helps in the straightforward cases (and rest keeps working as is), this could be a big benefit for test projects, web controllers, general Utils projects etc.

@safesparrow
Copy link
Copy Markdown
Contributor

safesparrow commented Oct 17, 2022

Happy to see I managed to sell the heuristic idea 😅

While we (mostly me) are optimistic about it working soon, we don't have it working end-to-end atm.

Specifically the piece we're missing atm (but think is relatively easy, although we/I might be wrong) is the branching/merging of TcState when going through separate type-checking paths in the graph, and then merging them further up on next levels.

@vzarytovskii I'd be happy to take part in a chat about this and possibly related ideas/PoCs mentioned in this discussion.
Perhaps early next week, or this weekend.
At worst, if the idea proves infeasible after all, a chat will save me some useless effort 😅
Or we can wait until we think it's complete - up to you; whatever is more efficient from others' PoV.

Also, I send my apologies to @nojaf for hijacking this PR's discussion - I blame @T-Gro 's comment 😝

@nojaf
Copy link
Copy Markdown
Contributor Author

nojaf commented Oct 18, 2022

I know there is a longer thread about heuristics of deriving this automatically, but I have my doubts about the efforts (both human of computational) of doing this really correctly.

Yes, I share this concern about that approach.

[<Microsoft.FSharp.Core.ParalellCompilation(nameOfGroup:string)>] would make things explicit and leads fewer surprises.

I was also thinking along the lines of having something in the fsproj.

<ItemGroup>
    <Compile Include="FsUnit.fs" />
    <Compile Include="TestHelpers.fs" />
    <ParallelCompilationGroup>
      <Compile Include="UtilsTests.fs" />
      <Compile Include="QueueTests.fs" />
      <Compile Include="StringTests.fs" />
      <Compile Include="CommentTests.fs" />
      <Compile Include="ShebangTests.fs" />
      <Compile Include="OperatorTests.fs" />
      <Compile Include="ComparisonTests.fs" />
      <Compile Include="ControlStructureTests.fs" />
      <Compile Include="PipingTests.fs" />
      <Compile Include="AppTests.fs" />
      <Compile Include="ModuleTests.fs" />
      <Compile Include="UnionTests.fs" />
      <Compile Include="RecordTests.fs" />
    </ParallelCompilationGroup>
    <Compile Include="SomeOtherFile.fs" />
</ItemGroup>

But passing that information to the compiler might be tricky.

Would you folks be open to meeting and chatting about it?

Works for me. I do prefer to wait until it is ready.

Sorry for the brief hijack.
Also, I send my apologies to @nojaf for hijacking this PR's discussion

Apologies accepted, but please all do stop now.

@T-Gro
Copy link
Copy Markdown
Member

T-Gro commented Oct 18, 2022

I will put this on my list of things to check.
Kevin also suggested a pragma instead of an attribute - a pragma, by definition a signal to the compiler, might be semantically more appropriate.

Changes in the .fsproj might delay initial testing of this feature - integration of tooling, msbuild, FCS api a FSC command line api might make the list of changes and decisions to do a lot bigger. => My preference to be to have a experimental version for experts for testing and evaluation before those decisions have to be done.

@vzarytovskii
Copy link
Copy Markdown
Member

@vzarytovskii I'd be happy to take part in a chat about this and possibly related ideas/PoCs mentioned in this discussion.
Perhaps early next week, or this weekend.

It'd be great! Next week works for me, pretty sure for other folks too. Let me know what works for you and I can set something up.

@T-Gro
Copy link
Copy Markdown
Member

T-Gro commented Jan 10, 2023

Can I close? This was experimental anyway, right?

@nojaf nojaf closed this Jan 10, 2023
@nojaf
Copy link
Copy Markdown
Contributor Author

nojaf commented Jan 10, 2023

Yup, can be closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

6 participants