Code audit & refactoring
A systematic, step-by-step audit of the entire codebase to find and fix technical debt. This is not a single PR — work proceeds incrementally, one area at a time, with each step verified independently (full phpt run, no behavior change, no regression).
Scope note: HTTP/3 is out of scope for now — src/http3/ excluded.
Baseline: 195/197 phpt pass (034 cosmetic curl-deprecation, 044 pre-existing SEGV in php-async closure transfer — both unrelated). Bar for every step: no new failures.
Audit dimensions
1. Unnecessary memory allocations · 2. Suboptimal syscall usage · 3. Structural cleanliness · 4. Code duplication · 5. SOLID violations · 6. Data structure optimality · 7. Algorithmic optimality · 8. Comment quality (final pass).
Findings & implementation plan
Ordered: correctness → dead code → low-risk perf → duplication → big-function splits → data structures → comments. Dedup precedes splits (big functions are often big because of duplication).
Phase 1 — Correctness ✅ (commit 6d470a6)
Phase 2 — Dead code / trivial removals ✅ (commit cb57c23)
Phase 3 — Low-risk performance ✅ (commit 7b7de1e) — P1/P2/P6 done; P3/P7/P8 deferred, P4 dropped, P5 done in Phase 4
Phase 4 — Duplication ✅ (4.1 4b35fc8 X2/X9 · 4.2 c90fd38 X1 · 4.3 3ba412b X3/X4 · 4.4 d075b51 X14 · 4.5 223ee83/bf1351a/a67e5d6/289a583 P5/X8/X6/X7 · 4.6 1e851f3 X10/X11/D8 · 4.7 e859186 X13)
Phase 5 — Big-function splits (SOLID) — ❌ won't-do (S1–S6)
Reassessed. Blanket "split a function for SRP" in C systems code is cargo-cult here. S1–S6 are linear, single-caller pipelines already organised with section comments; cutting them into one-call helpers threaded through a god-struct reads worse, not better, and adds churn with no correctness or perf gain. S1–S6 dropped. Genuine duplication inside them was already removed in Phase 4 (X12 closed the last item). S7 is a real build-hygiene win but a separately-scoped change → split out as its own issue.
Phase 6 — Data structures
Phase 7 — Comment quality
Other (low priority / measure first) — separate-issue candidates
These four are out of scope for this refactor branch — candidates for their own issues if later benchmarking justifies them.
- O1: http1_stream.c — 3 sends per streaming chunk (medium risk; vectored write).
- O2: http_body_stream.c — per-chunk node emalloc.
- O3: http_connection_tls.c — read_buffer never shrinks after a large burst.
- O4: http_parser.c — strncasecmp prefix-match for TE/Connection tokens (token-aware compare).
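To illustrate the O4 concern: a prefix comparison such as strncasecmp(value, "chunked", 7) also accepts "chunkedfoo", and it misses a token that is not first in a comma-separated list. A token-aware compare could look roughly like the sketch below. The helper name header_has_token is hypothetical (it is not taken from http_parser.c), and the sketch deliberately ignores parameters such as ";q=0.5" that TE values may carry.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Hypothetical helper, not the project's API: returns true when `token`
 * appears as a complete, case-insensitive element of the comma-separated
 * header value `value` (length `value_len`, not necessarily NUL-terminated). */
static bool header_has_token(const char *value, size_t value_len, const char *token)
{
    const size_t token_len = strlen(token);
    size_t i = 0;

    if (token_len == 0) {
        return false; /* an empty token never matches */
    }

    while (i < value_len) {
        /* Skip list separators and optional whitespace before the element. */
        while (i < value_len &&
               (value[i] == ',' || value[i] == ' ' || value[i] == '\t')) {
            i++;
        }
        const size_t start = i;

        /* The element runs up to the next comma or the end of the value. */
        while (i < value_len && value[i] != ',') {
            i++;
        }

        /* Trim trailing whitespace from the element. */
        size_t end = i;
        while (end > start && (value[end - 1] == ' ' || value[end - 1] == '\t')) {
            end--;
        }

        /* Match only when the lengths are equal, so "chunkedfoo" is rejected. */
        if (end - start == token_len &&
            strncasecmp(value + start, token, token_len) == 0) {
            return true;
        }
    }
    return false;
}
```

The exact-length check is what separates this from the prefix match: the element must be the token, not merely begin with it. Whether the extra scan is worth it on the parser's hot path is the "measure first" question this item leaves open.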
Acceptance
- No behavior change; phpt suite green (current baseline 196/198; 034/044 pre-existing, unrelated — no new failures).
- No benchmark regression against the release baseline (perf-affecting phases benchmarked).
- Comment quality enforced per-commit (CM1 dedicated pass dropped — see Phase 7).
Status — branch ready to merge
Phases 1–4 plus the remaining actionable items (X12, DS2, DS4) are done and pushed to 37-code-refactoring-and-technical-improvements. Everything else is resolved as won't-do (S1–S6, P8, DS1, DS3, CM1 — each with rationale above) or deferred to a separate issue (S7, P3, P7, O1–O4). No open work remains on this branch.
Phase 1 — Correctness ✅ (commit 6d470a6)
- http_parser.c — paused not reset in http_parser_reset_for_reuse. Verified: not an active bug — the sole consumer (http_parser_execute) clears it and llhttp_resume no-ops on a freshly-init'd parser. Latent inconsistency only → folded into X2 (teardown dedup makes divergence impossible).
- send_file.c — engine open lacked O_NOFOLLOW; on an open-file-cache hit (lstat pre-flight skipped) a symlink swapped in within the TTL was followed, leaking files outside the docroot. Fixed in 6d470a6 + regression test static/021.
Phase 2 — Dead code / trivial removals ✅ (commit cb57c23)
- http2_static_response.c:296 — dead n counter (computed, never read).
- http_static.c:300/310 — http_static_cache_acquire called twice; hoist to one.
- http_static.c:347-353 — dead cache != NULL re-checks (implied by prefer_inline).
- http_compression_response.c:48-49 — dead struct fields underlying_ops/underlying_ctx.
- http_compression_negotiate.h:43 — stale doc comment (preference order mismatch).
- http_server.c:284 — dead can_reuse_string_buffer (static inline, never called).
- tls_layer.c:352 — dead pre-branch mac_params[1] write.
- http_response_internal.h:41 — duplicate accessor declarations. Done in 4.6 (commit 1e851f3): dropped the get_status_code/get_headers_table aliases, callers renamed to the public get_status/get_headers, removed internal-header re-declarations.
Phase 3 — Low-risk performance ✅ (commit 7b7de1e) — P1/P2/P6 done; P3/P7/P8 deferred, P4 dropped, P5 done in Phase 4
- http2_session.c:1403 — h2_stream_pending_bytes rescans the chunk ring O(N) per data-provider tick; use the already-maintained chunk_queue_bytes.
- http_compression_response.c:407 — gzip buffered path undersizes output (+64); use +body_len/1000+....
- … stats), needs its own test; out of scope for this refactor branch. http_static.c:227 — directory-index resolution does N blocking stats bypassing the negative cache.
- http_parser.c:552 — redundant connection hash lookup in on_headers_complete.
- … snprintf → shared-digit-writer dedup. http2_session.c:1654/1751 — snprintf for :status; reuse the digit writer.
- http_request.c:380/396 — getContentType/getContentLength alloc a zend_string for a literal header name.
- http_request.c:704 / uploaded_file.c:402 — emalloc(sizeof(zval)) wrapper alloc/free per request/upload.
- … start()-only (one-time, not a hot path); the ~25 VM re-entries cost nothing measurable. A critical-function refactor for zero gain. http_server.c start() — ~25 PHP-VM re-entries to read scalar config already present in the C struct.
Phase 4 — Duplication ✅ (4.1 4b35fc8 X2/X9 · 4.2 c90fd38 X1 · 4.3 3ba412b X3/X4 · 4.4 d075b51 X14 · 4.5 223ee83/bf1351a/a67e5d6/289a583 P5/X8/X6/X7 · 4.6 1e851f3 X10/X11/D8 · 4.7 e859186 X13)
- … encoder_drain_write/finish.
- http_parser.c — reset/destroy/reset_for_reuse teardown ×3 → parser_teardown_common.
- … out_pending_append. (4.3 3ba412b)
- … http_send_batched_finish. (4.3 3ba412b)
- core — tls_/http_absorb_io_submission_exception identical → one helper.
- h2_flatten_response_headers (two two-pass blocks; the audit's third was a miscount). (4.5 a67e5d6)
- h2_emit_streaming_body vs h2_dp_streaming_copy ring-walk → h2_chunk_queue_walk + slice sink. (4.5 289a583)
- h2_stream_dispose_tail repeated ×3. (4.5 bf1351a)
- http_parser.c — extract_boundary two alloc tails.
- http_response.c — emit_headers_only gained a skip_framing flag; the chunked formatter reuses it. (4.6 1e851f3)
- http_response.c — 3 reset-body paths → response_set_body_bytes. (4.6 1e851f3)
- … http_static_try_serve, via in-TU static_emit_not_modified/static_emit_ok_headers/static_emit_validators helpers. The original deferral feared a 9-param cross-TU helper; in practice both paths live in the same TU and share a clean data model once the content-type is resolved locally. (commit d9d6580)
- … cache_disabled predicate. (4.7 e859186)
- tls_layer.c — ticket_key_cb MAC-params construction ×2 → tls_ticket_mac_set_key. (4.4 d075b51)
Phase 5 — Big-function splits (SOLID) — ❌ won't-do (S1–S6)
- engine_handle_stat (~275 lines).
- http_static_try_serve (~500 lines).
- http2_feed (~185 lines).
- h2_stream_send_static_response (~215 lines).
- http_connection_dispatch_request (~160 lines).
- http_connection_destroy (~175 lines).
- http_response.c (2194 lines): split static-handler C setters + formatters into separate TUs. Genuine build-hygiene win, separately scoped.
Phase 6 — Data structures
- … http_request_t fields, body_h2_session + body_h2_stream_id are now live (H2 backpressure nghttp2_session_consume wired in http_body_stream.c). The remainder (body_h2_consume_pending, body_h3_stream, body_paused, macro HTTP_BODY_QUEUE_WATERMARK) are documented placeholders for the in-flight backpressure follow-up PR — removing them is churn the follow-up re-adds. The "~28 bytes" estimate was already stale.
- http2_emit_record_t.body.len is uint32_t — ZEND_ASSERT(len <= UINT32_MAX) added in h2_emit_state_append_body. H2 DATA slices are frame-size bounded so the cast never truncates today; the assert guards a future caller. (commit 972eb4c)
- engine_state_t — mutually-exclusive error-defer/range fields could share a union.
- chunk_queue — compacting array; 3 comments that called it a "ring" corrected. (commit 972eb4c)
Phase 7 — Comment quality
- … CODING_STANDARDS short 1-2 line comments enforced per-edit).