
Add native batch text URL rewrite API#282

Open: adamziel wants to merge 4 commits into trunk from codex-native-structured-rewrite


@adamziel adamziel commented May 18, 2026

What it does

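Adds wp_native_apis_rewrite_text_url_bases() to the wp_native_apis extension. The function rewrites URL bases across a whole text value in one native call. Replacements are passed as a compact from\x1fto\x1e... mapping string: \x1f (ASCII unit separator) splits each from/to pair, and \x1e (ASCII record separator) follows each pair.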
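
For illustration, here is a minimal Rust sketch of how such a mapping string could be built and split again. The helper names `encode_mapping` and `decode_mapping` are hypothetical and not part of this PR. Whether the real format expects a trailing `\x1e` is an assumption, so the decoder tolerates both.

```rust
/// Separates `from` and `to` inside one pair (ASCII unit separator).
const PAIR_SEP: char = '\x1f';
/// Separates consecutive pairs (ASCII record separator).
const RECORD_SEP: char = '\x1e';

/// Hypothetical helper: join (from, to) base URL pairs into one mapping string.
fn encode_mapping(pairs: &[(&str, &str)]) -> String {
    pairs
        .iter()
        .map(|(from, to)| format!("{}{}{}", from, PAIR_SEP, to))
        .collect::<Vec<_>>()
        .join(&RECORD_SEP.to_string())
}

/// Hypothetical helper: split a mapping string back into (from, to) pairs.
/// Records without a pair separator or with an empty `from` are skipped,
/// so a trailing record separator is harmless.
fn decode_mapping(mapping: &str) -> Vec<(&str, &str)> {
    mapping
        .split(RECORD_SEP)
        .filter_map(|record| record.split_once(PAIR_SEP))
        .filter(|(from, _)| !from.is_empty())
        .collect()
}
```

Packing all pairs into one string means the whole mapping crosses the PHP/native boundary as a single argument.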

Rationale

Reprint currently has to drive URL detection and replacement from PHP one URL at a time. In PHP.wasm, that keeps the hot URL rewrite path split across many PHP/native boundary crossings. This adds a coarse native primitive that can rewrite one decoded value in a single call.

Implementation

The host PHP extension exposes the function through the Rust-backed ext-php-rs module and reuses the existing Rust URL-in-text candidate scanner and validation logic. It applies base URL replacements in one pass over the original text and preserves bare-domain and protocol-relative URL forms.
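
The single-pass idea can be sketched roughly as follows. This is a simplified stand-in, not the extension's actual scanner or validation logic. It assumes each base includes a scheme and has no trailing slash, matches case-sensitively, and leaves URLs with a different scheme untouched. All names here are hypothetical.

```rust
#[derive(Clone, Copy)]
enum Kind { Full, ProtocolRelative, Bare }

/// "https://old.example" -> "old.example" (assumes a scheme is present).
fn host_part(base: &str) -> &str {
    base.split_once("://").map_or(base, |(_, rest)| rest)
}

/// True when the text after a match would extend the host name,
/// e.g. ".com" after "old.example". A lone trailing "." does not count.
fn continues_host(after: &str) -> bool {
    let mut chars = after.chars();
    match chars.next() {
        Some(c) if c.is_ascii_alphanumeric() || c == '-' => true,
        Some('.') => chars.next().map_or(false, |c| c.is_ascii_alphanumeric()),
        _ => false,
    }
}

/// Reject matches that start in the middle of another token or URL.
fn prev_allows(kind: Kind, prev: Option<char>) -> bool {
    match prev {
        None => true,
        Some(c) => match kind {
            Kind::Full => !c.is_ascii_alphanumeric(),
            Kind::ProtocolRelative => c != ':' && c != '/',
            Kind::Bare => !(c.is_ascii_alphanumeric() || matches!(c, '.' | '-' | '/' | '@')),
        },
    }
}

/// Rewrite URL bases in one left-to-right pass over the original text.
/// Output is never re-scanned, so chained mappings do not cascade.
fn rewrite_url_bases(text: &str, pairs: &[(&str, &str)]) -> String {
    // Precompute the three search forms per pair, longest form first.
    let mut forms: Vec<(Kind, String, String)> = Vec::new();
    for (from, to) in pairs {
        let (from_host, to_host) = (host_part(from), host_part(to));
        if from_host.is_empty() {
            continue; // an empty needle would match everywhere
        }
        forms.push((Kind::Full, from.to_string(), to.to_string()));
        forms.push((Kind::ProtocolRelative, format!("//{}", from_host), format!("//{}", to_host)));
        forms.push((Kind::Bare, from_host.to_string(), to_host.to_string()));
    }

    let mut out = String::with_capacity(text.len());
    let mut i = 0;
    'scan: while i < text.len() {
        let rest = &text[i..];
        let prev = text[..i].chars().next_back();
        for (kind, needle, replacement) in &forms {
            if rest.starts_with(needle.as_str())
                && prev_allows(*kind, prev)
                && !continues_host(&rest[needle.len()..])
            {
                out.push_str(replacement);
                i += needle.len();
                continue 'scan;
            }
        }
        // No form matched here: copy one character and advance past it.
        if let Some(c) = rest.chars().next() {
            out.push(c);
            i += c.len_utf8();
        }
    }
    out
}
```

With the pair `("https://old.example", "https://new.example")`, this sketch rewrites `https://old.example/x`, `//old.example/y` and `old.example/z` to their `new.example` equivalents in the same form, and leaves `old.example.com` alone.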

The PHP.wasm extension still builds through native_apis_shim.c, so this PR also adds the same public function to that C shim. That gives Playground/Reprint a real PHP.wasm fast path instead of only smoke-test classes.

The verifier and Playground smoke blueprint now check that the extension registers the function and that a simple base replacement works.

Benchmarks

Measured from Reprint PR adamziel/reprint#216, CI run https://github.com/adamziel/reprint/actions/runs/26067523070/attempts/3.

Focused case: StructuredDataUrlRewriter rewrites 10,000 URLs across 556,690 scanned bytes.

| Runtime | Userland | Native extension | Delta |
| --- | --- | --- | --- |
| Host PHP | 2968.918 ms | 6.743 ms | -99.8%, about 440x faster |
| PHP.wasm 8.4 | 7616.311 ms | 7.632 ms | -99.9%, about 998x faster |

These are single-run CI measurements, so treat them as directional.
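
The Delta figures can be rechecked from the raw timings with a small sketch like this one. The function names are illustrative only.

```rust
/// How many times faster the native path is than the userland path.
fn speedup_factor(userland_ms: f64, native_ms: f64) -> f64 {
    userland_ms / native_ms
}

/// Relative change in percent. Negative means the native path took less time.
fn delta_percent(userland_ms: f64, native_ms: f64) -> f64 {
    (native_ms - userland_ms) / userland_ms * 100.0
}
```

For the Host PHP timings, `speedup_factor(2968.918, 6.743)` is roughly 440 and `delta_percent(2968.918, 6.743)` is roughly -99.8.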

Testing instructions

Run locally:

cd extensions/native-apis
cargo fmt --check
cargo test
cd ../..
php -l extensions/native-apis/tests/verify-native-apis.php
php -r '$json=json_decode(file_get_contents("extensions/native-apis/playground/blueprint.json"), true); if (!is_array($json)) { fwrite(STDERR, json_last_error_msg()."\n"); exit(1); } $tmp=tempnam(sys_get_temp_dir(), "native-blueprint-"); file_put_contents($tmp, $json["steps"][0]["data"]); passthru("php -l " . escapeshellarg($tmp), $code); unlink($tmp); exit($code);'
git diff --check

Local results from this branch:

cargo test: 89 passed
php -l extensions/native-apis/tests/verify-native-apis.php: no syntax errors
playground blueprint embedded PHP: no syntax errors
git diff --check: passed

I could not build the PHP extension locally because this environment has no php-config and no package manager to install PHP development headers. GitHub Actions builds and verifies both the host PHP extension and the PHP.wasm extension.

@adamziel adamziel force-pushed the codex-native-structured-rewrite branch from 4903777 to 684a29c May 19, 2026 20:17
@adamziel

Rebased this stack on current trunk after #283 landed and re-ran the same post-content sample benchmark.

Sample: 5,000 post_content values, 16.85 MB decoded, PHP.wasm 8.4, wp_native_apis loaded, output hash 44c9750cb310ad822ce73840290c647e609b78ee047365b60f9f123aeed5d9bc for both branches.

Branch Runs Median Delta
| Branch | Runs | Median | Delta |
| --- | --- | --- | --- |
| #282 rebased on trunk/#283 | 3.773 s, 3.655 s, 3.642 s | 3.655 s | baseline |
| #284 stacked on rebased #282 | 2.632 s, 2.532 s, 2.448 s | 2.532 s | 30.7% faster |
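
For reference, the Median and Delta columns can be reproduced from the three runs per branch with a sketch like the following. The helper names are hypothetical.

```rust
/// Median of a non-empty slice of timings (panics on an empty slice).
fn median(runs: &[f64]) -> f64 {
    let mut sorted = runs.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    if sorted.len() % 2 == 1 {
        sorted[mid]
    } else {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    }
}

/// Percent reduction in wall time relative to the baseline.
fn percent_faster(baseline_s: f64, candidate_s: f64) -> f64 {
    (baseline_s - candidate_s) / baseline_s * 100.0
}
```

With medians of 3.655 s and 2.532 s, `percent_faster(3.655, 2.532)` comes out at about 30.7.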

This does not support the hypothesis that the stacked native batch text rewrite yields no meaningful improvement after #283: it is still materially faster on this content sample.

Caveat: after the rebase, the native-extension CI job is failing the URL text parity tests, so this stack is not merge-ready as-is.
