Conversation
kotlinx-coroutines-core/jvm/test/scheduling/BlockingCoroutineDispatcherThreadLimitStressTest.kt
Show resolved
Hide resolved
kotlinx-coroutines-core/jvm/src/scheduling/CoroutineScheduler.kt
Outdated
Show resolved
Hide resolved
kotlinx-coroutines-core/jvm/src/scheduling/CoroutineScheduler.kt
Outdated
Show resolved
Hide resolved
kotlinx-coroutines-core/jvm/src/scheduling/CoroutineScheduler.kt
Outdated
Show resolved
Hide resolved
kotlinx-coroutines-core/jvm/src/scheduling/CoroutineScheduler.kt
Outdated
Show resolved
Hide resolved
kotlinx-coroutines-core/jvm/src/scheduling/CoroutineScheduler.kt
Outdated
Show resolved
Hide resolved
kotlinx-coroutines-core/jvm/src/scheduling/CoroutineScheduler.kt
Outdated
Show resolved
Hide resolved
kotlinx-coroutines-core/jvm/src/scheduling/CoroutineScheduler.kt
Outdated
Show resolved
Hide resolved
dd7ab91 to
dd30af0
Compare
|
Does not seem that we need It is only read in other threads after Thread.join and in toString()
|
elizarov
left a comment
There was a problem hiding this comment.
Looks good. All the feedback I can give is mostly cosmetic.
kotlinx-coroutines-core/jvm/src/scheduling/CoroutineScheduler.kt
Outdated
Show resolved
Hide resolved
kotlinx-coroutines-core/jvm/src/scheduling/CoroutineScheduler.kt
Outdated
Show resolved
Hide resolved
kotlinx-coroutines-core/jvm/src/scheduling/CoroutineScheduler.kt
Outdated
Show resolved
Hide resolved
kotlinx-coroutines-core/jvm/src/scheduling/CoroutineScheduler.kt
Outdated
Show resolved
Hide resolved
|
Note, a bug was introduced somewhere in shutdown sequence. It can hang on shutdown now: https://teamcity.jetbrains.com/buildConfiguration/KotlinTools_KotlinxCoroutines_NightlyStressWindows/2652563 |
|
I was avoiding adding long-scanning (dd30af0) because it could have masked all the bugs on the rendezvous of parking/unparking. Ironically, it has exposed existing problems :) It seems like it is impossible to ensure strict liveness in loads like
|
|
@qwwdfsad
Hm. If the thread is executing CPU task right now, then it should not be unparked by Combining both |
kotlinx-coroutines-core/jvm/src/scheduling/CoroutineScheduler.kt
Outdated
Show resolved
Hide resolved
kotlinx-coroutines-core/jvm/src/scheduling/CoroutineScheduler.kt
Outdated
Show resolved
Hide resolved
…CoroutineScheduler
258227b to
203580f
Compare
* WorkQueue.trySteal reports not only whether the steal was successful, but also a waiting time unless task becomes stealable
* CoroutineScheduler.trySteal attempts to steal from all the workers (starting from the random position) per iteration to have deterministic stealing
* Parking mechanics rework. After unsuccessful findTask, worker immediately adds itself to parking stack, then rescans all the queues to avoid missing tryUnparks and only then parks itself (parking duration depends on WorkQueue.trySteal result), terminating later
* Excessive spinning and parking is completely eliminated, significantly (x3) reducing CPU-consumption and making CoroutineScheduler on-par with FJP and FTP on Ktor-like workloads
* Downside of aggressive parking is a cost of slow-path unpark payed by external submitters that can be shown in degraded DispatchersContextSwitchBenchmark. Follow-up commits will fix that problem
* Retry on tryStealLastScheduled failures to avoid potential starvation
* Merge available CPU permits with controlState to simplify reasoning about pool state and make all state transitions atomic
* Get rid of synthetic accessors
* Work stealing: get rid of global queue for offloading during stealing because it never happens in fact
* Guard all critical invariants related to work-stealing with asserts
* New work signalling strategy that guarantees complete liveness in the face of "accidentally-blocking" CPU tasks
* Advanced double-phase unparking mechanism that mitigates the most expensive part of signalling an additional work
* Current limitation: blocking tasks are not yet properly signalled
203580f to
cab8deb
Compare
Invariants:
* Steal only one task per attempt to avoid missing steals that potentially may block the progress (check-park-check may miss tasks that are being stolen)
* New WorkQueue.add invariant: bufferSize < capacity => add is always successful
* Re-visited tests that expose a lot of problems
* Ability to steal from the middle of work queue in order to steal blocking tasks with ABA prevention
Changes:
* Instead of "blocking workers" use "blocking tasks" state that is incremented on each blocking submit and decrement only when task is completed
* On each work signalling try to compensate blocking tasks by enforcinf invariant "created threads == blocking tasks + up to core size workers"
* Now if worker was not spuriously woken up, it has a task dedicated for him that should be found. For that reason attempt to steal blocking tasks
(that may be in the middle of the work queue). Additionally, instead of scanning the whole global queue, just split it in two (one for blocking, one for regular tasks)
* Get rid of conditional remove from the global queue
* Avoid excessive unparks for threads that are not yet able to steal the task due to workstealing resolution: do not add such workers to the stack
cab8deb to
d6e7470
Compare
* Do not push worker to the stack during second pass of "min duration scanning"
* Locally cache whether local queue has any work to execute to save atomic getAndSet and a bunch of atomic loads
* More precise rendezvous for parking
* Long-scanning stealing to emulate spinning
* Properly handle interference of termination sequence and protection against spurious wakeups
* Documentation improvements, naming, proper spurious wakeup check
d6e7470 to
cff57ef
Compare
cff57ef to
7b614a7
Compare
…et rid of long-scanning sequence
7b614a7 to
4224e01
Compare
Fixes #840
Fixes #1046
Fixes #1286