| out.push(`if (${ctx.endPosArg()} - ${ctx.posArg()} >= 16) {`); | ||
| out.push(' __m128i ranges;'); | ||
| out.push(' __m128i input;'); | ||
| out.push(' int avail;'); |
| "mocha": "^9.2.2", | ||
| "ts-node": "^9.0.0", | ||
| "typescript": "^4.0.3" | ||
| "typescript": "^5.0.3" |
The package wasn't compiling without the TypeScript update.
| const hex: string[] = []; | ||
| for (let j = i; j < limit; j++) { | ||
| const value = buffer[j]; | ||
| assert(value !== undefined); |
With the TypeScript version change, new errors popped up.
ShogunPanda
left a comment
LGTM!
That's just amazing!
|
@indutny go ahead and release! :D |
|
Let me take a deeper look at assembly before merging, but if I don't come back let's merge in two days! |
|
For posterity SIMD assembly on ARM64: llvm-mca -iterations=1 No SIMD llvm-mca -iterations=16 |
This makes the wasm version of llhttp around 5-6% faster on real-world
payloads. The produced code looks roughly like this:
/* Load input */
input = wasm_v128_load(p);
/* Find first character that does not match `ranges` */
single = wasm_i8x16_eq(input, wasm_u8x16_const_splat(0x9));
mask = single;
single = wasm_v128_and(
  wasm_i8x16_ge(input, wasm_u8x16_const_splat(' ')),
  wasm_i8x16_le(input, wasm_u8x16_const_splat('~'))
);
mask = wasm_v128_or(mask, single);
/* Both bounds must hold for a lane to be in range, hence `and` */
single = wasm_v128_and(
  wasm_i8x16_ge(input, wasm_u8x16_const_splat(0x80)),
  wasm_i8x16_le(input, wasm_u8x16_const_splat(0xff))
);
mask = wasm_v128_or(mask, single);
match_len = __builtin_ctz(
  ~wasm_i8x16_bitmask(mask)
);
It is conceptually similar to the SSE vectorization that we already support,
except that we can't perform multiple comparisons at once and have to check
each range individually.
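For readers without a wasm toolchain at hand, the per-range check can be modelled in scalar C. This is only a sketch of the idea, not the code llparse emits: `byte_range` and `match_len16` are hypothetical names, and `__builtin_ctz` assumes GCC or Clang. Each loop iteration stands in for one SIMD lane, and the final step mirrors the `ctz(~bitmask)` trick from the snippet above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: an inclusive byte range [lo, hi]. */
typedef struct {
  uint8_t lo;
  uint8_t hi;
} byte_range;

/*
 * Scalar model of the 16-lane check: returns how many leading bytes of
 * `block` (exactly 16 bytes) fall into at least one of the ranges.
 */
static int match_len16(const uint8_t *block, const byte_range *ranges,
                       size_t nranges) {
  uint32_t mask = 0;

  for (int lane = 0; lane < 16; lane++) {
    /* Ranges are checked one at a time and OR-ed into the mask. */
    for (size_t r = 0; r < nranges; r++) {
      if (block[lane] >= ranges[r].lo && block[lane] <= ranges[r].hi) {
        mask |= (uint32_t)1 << lane;
        break;
      }
    }
  }

  /*
   * Only the low 16 bits of `mask` are ever set, so `~mask` is never zero
   * and ctz is well defined; it yields 16 when every lane matched.
   */
  return __builtin_ctz(~mask);
}
```

With the three ranges from the description (`0x9`, `' '..'~'`, `0x80..0xff`), a block of sixteen `'a'` bytes gives 16, and a block whose sixth byte is `'\n'` gives 5.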
|
Found a slightly more optimal code path. |
|
I'm suddenly feeling a bit apprehensive about merging this, since I currently have no way of running the llhttp test suite on the wasm file. Might need to look into it first. |
|
Okay, with nodejs/llparse-test-fixture#22 I managed to test this PR and confirm that it passes all existing llhttp tests. Merging. |
|
@mcollina llparse@7.2.0 was released. I believe |
See previous attempt #72
Benchmark: https://gist.github.com/indutny/d18d56fe3c7254d888a79eca98f39145
cc @nodejs/llhttp @mcollina