Skip to content

flock(): stack-use-after-return when coroutine is cancelled mid-flock (real ASAN finding) #146

@EdmondDantes

Description

@EdmondDantes

Bug

main/streams/plain_wrapper.c:1192 allocates the flock task struct on
the caller's stack, queues it to a libuv worker thread, then suspends:

```c
php_stdiop_flock_task_data_t flock_data = { .fd = fd, .operation = value };
zend_async_task_t *task = ZEND_ASYNC_NEW_TASK(php_stdiop_flock_task_run, &flock_data);

if (UNEXPECTED(!ZEND_ASYNC_SUSPEND())) { … } // ← can throw AsyncCancellation

if (flock_data.result == 0) { … }
```

If the coroutine is cancelled at ZEND_ASYNC_SUSPEND(), the function
unwinds and flock_data is gone. The worker thread that's running
php_stdiop_flock_task_run (in libuv's thread pool) keeps writing into
the freed stack slot:

```c
static void php_stdiop_flock_task_run(zend_async_task_t *task) {
php_stdiop_flock_task_data_t *flock_data = (...) task->data;
flock_data->result = flock(flock_data->fd, flock_data->operation); // ← UAF

}
```

ASAN trace (chaos repro, scheduler-fifo / ASAN-ZTS)

==…==ERROR: AddressSanitizer: stack-use-after-return on address 0x… thread T4
WRITE of size 4 at 0x… thread T4
    #0 php_stdiop_flock_task_run main/streams/plain_wrapper.c:1097
    #1 libuv_task_work_cb        ext/async/libuv_reactor.c:2474
    #2 uv__queue_work            (/usr/local/lib/libuv.so.1)
    #3 worker                    (/usr/local/lib/libuv.so.1)
SUMMARY: AddressSanitizer: stack-use-after-return

Suggested fix

Heap-allocate the task data and let the task callback / dispose chain own
its lifetime, mirroring how other thread-pool tasks in ext/async do it.
Two viable shapes:

  1. `emalloc(sizeof(flock_data))` + `efree` in a task->dispose hook —
    the dispose runs after the worker callback completes regardless of
    whether the originating coroutine was cancelled.
  2. Reuse `zend_async_io_req_t`-style two-phase release (the "CALLBACK_DONE
    / DISPOSE_PENDING flags" pattern referenced in async I/O: per-request completion events instead of broadcasting on the handle event #130). Same idea.

The caller can then check `flock_data->result` only after the suspend
returned not via cancellation; on cancellation, the task is detached
and dispose-when-done.

Found by

Drafting `fuzzy-tests/io/flock_chaos.feature` (#143). The two cancel-
mid-flock scenarios SEGV instantly on ASAN-ZTS; the two non-cancel
scenarios (lock contention, holder + waiter) pass clean. Filed before
shipping; the cancel scenarios stay commented under `# Blocked: #146`
in the feature.

Acceptance

Cancelling a coroutine parked in `flock()` does not corrupt memory; ASAN
runs clean. The cancel scenarios in `fuzzy-tests/io/flock_chaos.feature`
can be reinstated and run green on the fuzzy CI matrix.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions