Fix dropped kernel messages and stale state between driver loads#413
Fix dropped kernel messages and stale state between driver loads#413wjhwjhn wants to merge 1 commit into
Conversation
…apture state on start/stop Raised the KernelReader poll rate from 1Hz to 1kHz, drained the driver ring per poll instead of reading one buffer and returning, and reset verbose/pass-through plus issued DBGV_CLEAR_DISPLAY around capture so state doesn't leak between driver loads.
|
First of all, thank you for looking at this. The kernel-mode messages is a feature that I do not use myself and also don't really know how to stress-test, so I'm very happy that someone that actually uses it is looking into the implementation. I took a glance at your changes and I think it already looks good, I will review them and do some testing. I'm curious if you were losing exactly one message before, and if this change actually solved that? There was a recent bug resolved that was never displaying 1 out of every 5000 messages, see #407 . |
Thanks for the review! It was not exactly one message being dropped, nor a regular 1-in-N pattern. The drops were bursty: when kernel output was produced faster than the 1 Hz polling loop could drain it, several consecutive messages could be lost. So this seems different from #407. This PR does not fully eliminate drops. I checked Dbgv.sys: the driver uses a fixed NonPagedPool ring buffer, and when it is full, it overwrites old entries in place. There is no backpressure to DbgPrint, so a fast enough kernel producer can still overrun it. What this PR fixes is the reader-side gap: polling is increased from 1 Hz to ~1 kHz, and each tick drains the ring until empty. In normal workloads I no longer see drops, but a tight kernel-side DbgPrint loop can still overflow the driver buffer. |
Hi! First of all, thanks for keeping DebugView++ alive — it's been a daily tool for me for a long time.
On my machine I was seeing kernel-mode messages arrive with visible latency, and the occasional message going missing when the driver produced a burst. I spent some time looking at what
KernelReaderdoes today and comparing it with Sysinternals DbgView to understand what was going on, and I wanted to share what I found in case it's useful.The driver doesn't expose an event handle and
DBGV_READ_LOGis a synchronous IOCTL, so both viewers end up polling. A couple of small things inKernelReader's current shape looked like they could probably be tightened up:PolledLogSource::pollFrequencyis in Hz, so the current value1means one poll per second, which seems to explain most of the latency I was seeing. Bumping it to1000(≈1 ms interval) helps a lot, and I addedtimeBeginPeriod(1)/timeEndPeriod(1)so the 1 ms sleep isn't rounded up to the default scheduler tick. That's whywinmmis linked in CMake now — happy to split that out if you'd prefer.Poll()issued a singleDBGV_READ_LOGper tick, so if the driver produced more than one buffer-worth between polls, the rest got dropped. I changed it to drain the ring until the driver reports 0 bytes, which mirrors what DbgView's read loop does (it tight-loops the IOCTL until empty, bounded by a 300 ms budget, before yielding).DBGV_CLEAR_DISPLAYis now sent beforeDBGV_CAPTURE_KERNELso stale events from a previous session aren't replayed into the new one. DbgView does the same on startup.DBGV_UNCAPTURE_KERNEL, andm_pBufis nulled afterfreeto avoid a dangling pointer if start/stop is repeated.I'm very open to changing any of this — I've only been in this code briefly, so if any of the above is wrong about the design intent, or there's a different direction you'd prefer, I'd be happy to revise. Thanks again for taking a look!