[WIP] Optimize RESP protocol parser and encoder for performance by Copilot · Pull Request #2 · 1jmdev/BetterKV

Copilot · 2026-03-10T17:52:52Z

Thanks for asking me to work on this. I will get started on it and keep this PR's description up to date as I form a plan and make progress.

Original prompt

Problem

The RESP protocol parser and encoder in crates/protocol/ are too slow and need to be optimized for maximum performance (target: 5-10x faster across parsing and encoding).

Current Benchmark Numbers

parse_command_into/inline:       70-77 ns   (~498 MiB/s)
parse_command_into/array/3x8:    37 ns       (~1.15 GiB/s)
parse_command_into/array/8x16:   158-160 ns  (~1.10 GiB/s)
parse_command_into/array/32x24:  561-563 ns  (~1.65 GiB/s)
parse_frame/simple:              22-23 ns    (~293 MiB/s)
parse_frame/bulk:                16-17 ns    (~1.35 GiB/s)
parse_frame/nested:              155-156 ns  (~330 MiB/s)
encode/simple:                   19-20 ns    (~344 MiB/s)
encode/bulk_values_8x16:         109-114 ns  (~1.58 GiB/s)
encode/array_16x24:              247-248 ns  (~1.88 GiB/s)
encode/map:                      54-56 ns    (~695 MiB/s)

Files to Optimize

crates/protocol/src/encoder.rs - RESP frame encoder
crates/protocol/src/parser.rs - RESP frame and command parser
crates/protocol/src/types.rs - Protocol types (BulkData, RespFrame)

Key Optimization Strategies to Apply

Encoder (`encoder.rs`)

Replace multiple extend_from_slice calls with single raw pointer writes - Each extend_from_slice call carries overhead (a capacity check and a possible reallocation). Instead, pre-calculate the total size needed, reserve once, then write directly through raw pointers in an unsafe block.
Use itoa crate for integer formatting - The itoa crate (already in workspace dependencies) is highly optimized for integer-to-string conversion and is faster than the manual write_int/write_uint implementations.
Batch small writes into a single contiguous write - For encoding patterns like "$" + len + "\r\n" + data + "\r\n", build the entire thing in a stack buffer (for small lengths) and write once.
Eliminate the Encoder struct entirely - It has no state. Make encode a free function or use #[inline(always)] on the method. The &mut self parameter adds unnecessary indirection.
Add more pre-encoded static byte constants - For common patterns like "$0\r\n\r\n", ":0\r\n", ":1\r\n", "*0\r\n", "*1\r\n", "*2\r\n", etc.
Use BufMut trait's put_slice or direct chunk_mut() + advance_mut() - Write directly into BytesMut's spare capacity without intermediate copies.
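
As a rough illustration of the "reserve once, batch the header" idea from the encoder list above, here is a minimal safe sketch. It is not the crate's encoder: `encode_bulk` and `EMPTY_BULK` are hypothetical names, `Vec<u8>` stands in for `BytesMut`, and a hand-rolled digit loop stands in for `itoa` so the example stays dependency-free.

```rust
/// Pre-encoded reply for the empty bulk string (hypothetical constant name).
const EMPTY_BULK: &[u8] = b"$0\r\n\r\n";

/// Encodes `data` as a RESP bulk string into `out`.
/// Hypothetical helper: `Vec<u8>` is used here in place of `BytesMut`.
pub fn encode_bulk(out: &mut Vec<u8>, data: &[u8]) {
    if data.is_empty() {
        // Common case served by a static constant: one copy, no formatting.
        out.extend_from_slice(EMPTY_BULK);
        return;
    }

    // Build "$<len>\r\n" back to front in a stack buffer.
    // Worst case: '$' + 20 decimal digits (64-bit usize) + CRLF = 23 bytes.
    let mut header = [0u8; 23];
    let mut pos = header.len();
    header[pos - 2] = b'\r';
    header[pos - 1] = b'\n';
    pos -= 2;

    let mut v = data.len();
    loop {
        pos -= 1;
        header[pos] = b'0' + (v % 10) as u8;
        v /= 10;
        if v == 0 {
            break;
        }
    }
    pos -= 1;
    header[pos] = b'$';
    let header = &header[pos..];

    // Reserve once for header + payload + trailing CRLF, so none of the
    // three writes below can trigger a reallocation.
    out.reserve(header.len() + data.len() + 2);
    out.extend_from_slice(header);
    out.extend_from_slice(data);
    out.extend_from_slice(b"\r\n");
}
```

This version still goes through `extend_from_slice` three times; the raw-pointer variant described above would replace those calls with direct writes into the reserved space, at the cost of `unsafe`.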

Parser (`parser.rs`)

Use memchr for CRLF scanning - The memchr crate (already in workspace dependencies) uses SIMD instructions to find \n bytes much faster than a byte-by-byte loop, especially for longer lines.
Remove checked_mul/checked_add from parse_uint/parse_int - In RESP, integer values are small (lengths, counts), so overflow is practically impossible for valid input. Use wrapping_mul/wrapping_add, optionally with a single post-check, instead of checking every step.
Inline parse_uint aggressively with a lookup-table approach - For 1-digit and 2-digit numbers (which are the vast majority in RESP), use a fast path.
Use src.advance() instead of src.split_to() - split_to does extra work to create a new BytesMut; advance just moves the read cursor forward (already partially done in your code but some places still use split_to).
Specialize read_line with memchr::memchr - Use SIMD-accelerated \n scanning from the memchr crate.
Avoid Option<u8> wrapping in peek() - Use a direct unsafe access pattern or return the byte with a separate bounds check.
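
The parser items above could look roughly like the following sketch. All names here (`find_crlf`, `parse_uint`, `parse_uint_slow`) are illustrative rather than taken from `parser.rs`, and `iter().position` is only a dependency-free stand-in for `memchr::memchr(b'\n', buf)`. The slow path assumes a cap of 19 digits, which is what makes wrapping arithmetic safe: a 19-digit number always fits in a `u64`.

```rust
/// Returns the index of the `\r` in the first CRLF of `buf`.
/// `position` stands in for `memchr::memchr(b'\n', buf)` in this sketch.
/// Simplification: a bare `\n` without a preceding `\r` also yields `None`,
/// so "incomplete" and "malformed" are not distinguished here.
pub fn find_crlf(buf: &[u8]) -> Option<usize> {
    let nl = buf.iter().position(|&b| b == b'\n')?;
    if nl > 0 && buf[nl - 1] == b'\r' {
        Some(nl - 1)
    } else {
        None
    }
}

/// Parses an unsigned decimal with fast paths for 1- and 2-digit inputs.
pub fn parse_uint(digits: &[u8]) -> Option<u64> {
    match digits {
        &[a] if a.is_ascii_digit() => Some((a - b'0') as u64),
        &[a, b] if a.is_ascii_digit() && b.is_ascii_digit() => {
            // At most 99, so the u8 arithmetic cannot overflow.
            Some(((a - b'0') * 10 + (b - b'0')) as u64)
        }
        _ => parse_uint_slow(digits),
    }
}

/// General path: wrapping arithmetic guarded by one up-front length check.
/// Assumption: inputs longer than 19 digits are rejected outright, which
/// also rejects the few 20-digit values that would still fit in a u64.
fn parse_uint_slow(digits: &[u8]) -> Option<u64> {
    if digits.is_empty() || digits.len() > 19 {
        return None;
    }
    let mut value: u64 = 0;
    for &d in digits {
        if !d.is_ascii_digit() {
            return None;
        }
        // Cannot actually wrap because of the 19-digit bound above.
        value = value.wrapping_mul(10).wrapping_add((d - b'0') as u64);
    }
    Some(value)
}
```

The digit-count bound plays the role of the "post-check" mentioned above: it is evaluated once per number instead of once per digit.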

Types (`types.rs`)

Add #[inline(always)] to BulkData::as_slice() - This is called in every tight encode loop and should be fully inlined.
Consider removing the profiler scope calls from hot-path methods - The profiler::scope() calls in as_slice(), from_vec(), etc. can add overhead even when profiling is disabled, because the call is still made and may or may not be inlined away.

General

Make sure all hot-path functions are #[inline(always)] not just #[inline].
Ensure the benchmarks remain working and passing after optimizations.
All existing tests must continue to pass.
Keep using existing workspace dependencies (itoa, memchr, bytes, smallvec) rather than adding new ones.

Current Source Code

`crates/protocol/src/encoder.rs`

use crate::types::RespFrame;
use bytes::BytesMut;

#[inline]
fn write_int(buf: &mut BytesMut, val: i64) {
    let mut tmp = [0u8; 20];
    let mut pos = 20usize;

    if val == 0 {
        buf.extend_from_slice(b"0");
        return;
    }

    let neg = val < 0;
    let mut v: u64 = if neg {
        (!(val as u64)).wrapping_add(1)
    } else {
        val as u64
    };

    while v > 0 {
        pos -= 1;
        unsafe { *tmp.get_unchecked_mut(pos) = (v % 10) as u8 + b'0' };
        v /= 10;
    }
    if neg {
        pos -= 1;
        unsafe { *tmp.get_unchecked_mut(pos) = b'-' };
    }
    buf.extend_from_slice(unsafe...

</details>



*This pull request was created from Copilot chat.*

Initial plan

cfb4971

Copilot AI assigned Copilot and 1jmdev Mar 10, 2026

Copilot started work on behalf of 1jmdev March 10, 2026 17:53 View session

Copilot stopped work on behalf of 1jmdev due to an error March 10, 2026 17:55
The session was cancelled by the user.

1jmdev closed this Mar 10, 2026

Copilot wants to merge 1 commit into main from copilot/optimize-resp-protocol-parser

2 participants