JSONC parser fails to correctly parse non-BMP escape sequences #31

@KiloJuliett

According to RFC 8259 § 7, a character outside the Basic Multilingual Plane, such as 𝄞 (U+1D11E), is escaped as a 12-character sequence encoding its UTF-16 surrogate pair: \uD834\uDD1E. I therefore expect the following Rust code to compile and run successfully:

use jsonc_parser::JsonValue;
use jsonc_parser::parse_to_value;

fn main() {
    let src = r#""\uD834\uDD1E""#;
    let v = parse_to_value(src, &Default::default()).unwrap().unwrap();
    if let JsonValue::String(s) = v {
        assert_eq!("\u{1D11E}", s);
    } else {
        panic!();
    }
}

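However, on the latest version of jsonc-parser (0.21.0 at the time of writing), this code panics at the first unwrap on line 6 with the message "Invalid unicode escape sequence. 'D834' is not a valid UTF8 character". The message suggests that the parser decodes each \uXXXX escape on its own, so the high surrogate D834 is rejected before it can be paired with the low surrogate DD1E.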
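For reference, here is a minimal sketch of how a surrogate pair maps to a single code point. It is not jsonc-parser's code; `combine_surrogates` is a hypothetical helper name, and only the standard library is used. It shows why the two escapes have to be decoded together: neither half is a valid `char` on its own.

```rust
/// Hypothetical helper (not part of jsonc-parser): combine a UTF-16
/// surrogate pair into one Unicode scalar value.
/// Returns `None` if either unit is outside its surrogate range.
fn combine_surrogates(high: u16, low: u16) -> Option<char> {
    // High surrogates occupy D800..=DBFF, low surrogates DC00..=DFFF.
    if !(0xD800..=0xDBFF).contains(&high) || !(0xDC00..=0xDFFF).contains(&low) {
        return None;
    }
    // Each surrogate carries 10 bits of the code point, offset by 0x10000.
    let code_point =
        0x10000 + (((high as u32) - 0xD800) << 10) + ((low as u32) - 0xDC00);
    char::from_u32(code_point)
}
```

With this helper, `combine_surrogates(0xD834, 0xDD1E)` yields `Some('\u{1D11E}')`, whereas `char::from_u32(0xD834)` is `None`, which matches the failure mode described above.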
