Skip to content

Fix UBSAN issue by switching to smart_str#83

Merged
kjdev merged 2 commits into
kjdev:masterfrom
ndossche:use-smart-str
May 25, 2026
Merged

Fix UBSAN issue by switching to smart_str#83
kjdev merged 2 commits into
kjdev:masterfrom
ndossche:use-smart-str

Conversation

@ndossche

@ndossche ndossche commented May 22, 2026

Copy link
Copy Markdown
Contributor

When a smart_string is never updated, it's c field remains NULL. This is possible with a brotli library failure or with an empty output. This is triggerable via e.g. uncompress_add().
When that happens, a NULL pointer is passed to zend_string_init() which is not allowed and will cause undefined behaviour in the subsequent memcpy() call that zend_string_init() performs.

One way to fix this would be to guard the invocation of RETVAL_STRINGL(). However, a cleaner way is to switch to the smart_str API which handles everything already nicely for us.
As a bonus, smart_str will avoid reallocation of the string, so the new code is not only cleaner but also faster.

Example UBSAN report:

/usr/local/include/php/Zend/zend_string.h:191:2: runtime error: null pointer passed as argument 2, which is declared to never be null
    #0 0x7c8d9852a291 in zend_string_init /usr/local/include/php/Zend/zend_string.h:191
    #1 0x7c8d98535adf in zif_brotli_uncompress_add /work/php-ext-brotli/brotli.c:1830
    #2 0x623731f73e26 in zend_test_execute_internal /work/php-src/ext/zend_test/observer.c:306
    #3 0x6237323a22bd in ZEND_DO_FCALL_SPEC_RETVAL_USED_HANDLER /work/php-src/Zend/zend_vm_execute.h:2154
    #4 0x62373259894d in execute_ex /work/php-src/Zend/zend_vm_execute.h:116519
    #5 0x6237325a9126 in zend_execute /work/php-src/Zend/zend_vm_execute.h:121962
    #6 0x623732779b56 in zend_execute_script /work/php-src/Zend/zend.c:1980
    #7 0x623731fa8cce in php_execute_script_ex /work/php-src/main/main.c:2645
    #8 0x623731fa8fd8 in php_execute_script /work/php-src/main/main.c:2685
    #9 0x623732791599 in main /work/php-src/sapi/cgi/cgi_main.c:2529
    #10 0x7c8d9a8381c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #11 0x7c8d9a83828a in __libc_start_main_impl ../csu/libc-start.c:360
    #12 0x623730c09ad4 in _start (/work/php-src/build-dbg-ubsan/sapi/cgi/php-cgi+0x1409ad4) (BuildId: 4ad96c24d1e97297ff4db76de2af34c057b26433)

Note: this was found by a hybrid static-dynamic analyzer I'm developing.

Summary by CodeRabbit

  • Refactor
    • Internal overhaul of Brotli compression/decompression buffering for more robust, maintainable, and slightly more efficient output handling. No changes to public APIs, function signatures, or behavior visible to callers.

Review Change Stack

When a `smart_string` is never updated, it's `c` field remains NULL.
This is possible with a brotli library failure or with an empty output.
This is triggerable via e.g. `uncompress_add()`.
When that happens, a NULL pointer is passed to `zend_string_init()`
which is not allowed and will cause undefined behaviour in the
subsequent `memcpy()` call that `zend_string_init()` performs.

One way to fix this would be to guard the invocation of `RETVAL_STRINGL()`.
However, a cleaner way is to switch to the `smart_str` API which handles
everything already nicely for us.
As a bonus, `smart_str` will avoid reallocation of the string, so the new
code is not only cleaner but also faster.

Example UBSAN report:
```
/usr/local/include/php/Zend/zend_string.h:191:2: runtime error: null pointer passed as argument 2, which is declared to never be null
    #0 0x7c8d9852a291 in zend_string_init /usr/local/include/php/Zend/zend_string.h:191
    kjdev#1 0x7c8d98535adf in zif_brotli_uncompress_add /work/php-ext-brotli/brotli.c:1830
    kjdev#2 0x623731f73e26 in zend_test_execute_internal /work/php-src/ext/zend_test/observer.c:306
    kjdev#3 0x6237323a22bd in ZEND_DO_FCALL_SPEC_RETVAL_USED_HANDLER /work/php-src/Zend/zend_vm_execute.h:2154
    kjdev#4 0x62373259894d in execute_ex /work/php-src/Zend/zend_vm_execute.h:116519
    kjdev#5 0x6237325a9126 in zend_execute /work/php-src/Zend/zend_vm_execute.h:121962
    kjdev#6 0x623732779b56 in zend_execute_script /work/php-src/Zend/zend.c:1980
    kjdev#7 0x623731fa8cce in php_execute_script_ex /work/php-src/main/main.c:2645
    kjdev#8 0x623731fa8fd8 in php_execute_script /work/php-src/main/main.c:2685
    kjdev#9 0x623732791599 in main /work/php-src/sapi/cgi/cgi_main.c:2529
    kjdev#10 0x7c8d9a8381c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    kjdev#11 0x7c8d9a83828a in __libc_start_main_impl ../csu/libc-start.c:360
    kjdev#12 0x623730c09ad4 in _start (/work/php-src/build-dbg-ubsan/sapi/cgi/php-cgi+0x1409ad4) (BuildId: 4ad96c24d1e97297ff4db76de2af34c057b26433)
```
@coderabbitai

coderabbitai Bot commented May 22, 2026

Copy link
Copy Markdown

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7f5cee1b-ba63-40c2-a256-94a8bd88a0c1

📥 Commits

Reviewing files that changed from the base of the PR and between a4828f5 and ac13412.

📒 Files selected for processing (1)
  • brotli.c

📝 Walkthrough

Walkthrough

This PR modernizes the Brotli extension's buffer handling by replacing the legacy smart_string API with the Zend Engine's smart_str API. The change updates module includes, adds a compatibility extractor for older PHP, and refactors the four main compression/decompression functions (sync and incremental) to use smart_str for accumulation, cleanup, and result extraction.

Changes

Buffer API Modernization

Layer / File(s) Summary
Header and include updates
brotli.c
Adds zend_smart_str.h and removes prior conditional php_smart_string.h/zend_smart_string.h handling.
Compatibility helper
brotli.c
Adds inline smart_str_extract(smart_str *str) for PHP_VERSION_ID < 70200 to produce zend_string* and null internal storage.
Compression buffering modernization
brotli.c
Updates brotli_compress and brotli_compress_add to use smart_str accumulators, smart_str_appendl for chunk appends, smart_str_free on failure, and RETVAL_STR(smart_str_extract(&out)) on success.
Decompression buffering modernization
brotli.c
Updates brotli_uncompress and brotli_uncompress_add to use smart_str accumulators, smart_str_appendl for chunk appends, smart_str_free on failure, and RETVAL_STR(smart_str_extract(&out)) on success.

Possibly related PRs

  • kjdev/php-ext-brotli#63: Refactors brotli_compress streaming loop and incremental buffering; touches similar compression return/accumulation code paths.
  • kjdev/php-ext-brotli#76: Adjusts smart string include/usage paths between ext/standard and Zend smart string implementations.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰
From old smart_string I hop anew,
smart_str now collects each compressed chew,
Append, extract, tidy — neat and spry,
Brotli bytes dance as I bounce by,
A little rabbit clap — hop, high, bye!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title clearly and accurately summarizes the main change: switching from smart_string to smart_str API to fix a UBSAN issue.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
brotli.c (1)

1806-1822: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle decoder error/finish state before returning output.

brotli_uncompress_add() returns smart_str_extract(&out) regardless of BrotliDecoderResult. If decompression ends with BROTLI_DECODER_RESULT_ERROR (or mode == BROTLI_OPERATION_FINISH but not SUCCESS), this can return partial/empty data as a successful call.

Proposed fix
     BrotliDecoderResult result = BROTLI_DECODER_RESULT_NEEDS_MORE_OUTPUT;
     while (result == BROTLI_DECODER_RESULT_NEEDS_MORE_OUTPUT) {
         ctx->available_out = buffer_size;
         ctx->next_out = buffer;
         result = BrotliDecoderDecompressStream(ctx->decoder,
                                                &ctx->available_in,
                                                &ctx->next_in,
                                                &ctx->available_out,
                                                &ctx->next_out,
                                                0);
         size_t buffer_used = buffer_size - ctx->available_out;
         if (buffer_used) {
             smart_str_appendl(&out, buffer, buffer_used);
         }
     }

+    if (result == BROTLI_DECODER_RESULT_ERROR ||
+        (mode == BROTLI_OPERATION_FINISH && result != BROTLI_DECODER_RESULT_SUCCESS)) {
+        efree(buffer);
+        smart_str_free(&out);
+        php_error_docref(NULL, E_WARNING, "failed to incremental uncompress");
+        RETURN_FALSE;
+    }
+
     RETVAL_STR(smart_str_extract(&out));

     efree(buffer);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@brotli.c` around lines 1806 - 1822, The function brotli_uncompress_add
currently returns RETVAL_STR(smart_str_extract(&out)) without checking the final
BrotliDecoderResult; after the loop that calls BrotliDecoderDecompressStream,
inspect the final result and handle non-success cases: if result !=
BROTLI_DECODER_RESULT_SUCCESS (including BROTLI_DECODER_RESULT_ERROR or
incomplete/NEEDS_MORE_INPUT states), release/clear the smart_str out buffer and
return an error/false (or raise an appropriate PHP error) instead of returning
partial data; only call RETVAL_STR(smart_str_extract(&out)) when the result is
BROTLI_DECODER_RESULT_SUCCESS. Ensure you reference ctx, out,
BrotliDecoderDecompressStream, and RETVAL_STR in the fix so the check is
implemented right after the decompression loop.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@brotli.c`:
- Around line 1806-1822: The function brotli_uncompress_add currently returns
RETVAL_STR(smart_str_extract(&out)) without checking the final
BrotliDecoderResult; after the loop that calls BrotliDecoderDecompressStream,
inspect the final result and handle non-success cases: if result !=
BROTLI_DECODER_RESULT_SUCCESS (including BROTLI_DECODER_RESULT_ERROR or
incomplete/NEEDS_MORE_INPUT states), release/clear the smart_str out buffer and
return an error/false (or raise an appropriate PHP error) instead of returning
partial data; only call RETVAL_STR(smart_str_extract(&out)) when the result is
BROTLI_DECODER_RESULT_SUCCESS. Ensure you reference ctx, out,
BrotliDecoderDecompressStream, and RETVAL_STR in the fix so the check is
implemented right after the decompression loop.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d997d699-9e43-458e-b234-b8926f5c697c

📥 Commits

Reviewing files that changed from the base of the PR and between 4924acc and a4828f5.

📒 Files selected for processing (1)
  • brotli.c

@kjdev

kjdev commented May 24, 2026

Copy link
Copy Markdown
Owner

Since version 7.0 is currently supported, this appears to be an error.

PHP Startup: Unable to load dynamic library '/path/to/brotli.so' - Error relocating /path/to/brotli.so: smart_str_extract: symbol not found in Unknown on line 0

@ndossche

Copy link
Copy Markdown
Contributor Author

Alright, I've added a compatibility ifdef with an equivalent implementation to fix this.

@kjdev kjdev merged commit 40c7526 into kjdev:master May 25, 2026
90 checks passed
@kjdev

kjdev commented May 25, 2026

Copy link
Copy Markdown
Owner

Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants