Escape quality string characters in JSON output#49
Escape quality string characters in JSON output#49a113n wants to merge 2 commits intoOpenGene:masterfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR ensures generated JSON reports remain valid by escaping special characters in string fields written by the JSON reporter and read serialization helpers.
Changes:
- Added
escapeJsonString(const std::string&)utility incommon.*to JSON-escape strings. - Updated
JsonReporter::run()to escape command/time and various fusion string fields before writing JSON. - Updated
Match::printReadToJson()to escape read sequence and quality strings before writing JSON.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
src/match.cpp |
Escapes read seq and qual fields before writing JSON. |
src/jsonreporter.cpp |
Escapes command/time and fusion-related string fields (including object keys) in JSON output. |
src/common.h |
Declares escapeJsonString and includes <string> for the declaration. |
src/common.cpp |
Implements JSON string escaping helper. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| std::string output; | ||
| output.reserve(input.size()); | ||
|
|
||
| for(size_t i = 0; i < input.size(); i++) { | ||
| const unsigned char c = static_cast<unsigned char>(input[i]); | ||
| switch(c) { | ||
| case '"': | ||
| output += "\\\""; | ||
| break; | ||
| case '\\': | ||
| output += "\\\\"; | ||
| break; | ||
| case '\b': | ||
| output += "\\b"; | ||
| break; | ||
| case '\f': | ||
| output += "\\f"; | ||
| break; | ||
| case '\n': | ||
| output += "\\n"; | ||
| break; | ||
| case '\r': | ||
| output += "\\r"; | ||
| break; | ||
| case '\t': | ||
| output += "\\t"; | ||
| break; | ||
| default: | ||
| if(c < 0x20) { | ||
| const char* hex = "0123456789abcdef"; | ||
| output += "\\u00"; | ||
| output += hex[(c >> 4) & 0x0F]; | ||
| output += hex[c & 0x0F]; | ||
| } else { | ||
| output += static_cast<char>(c); | ||
| } | ||
| break; | ||
| } | ||
| } | ||
|
|
||
| return output; |
There was a problem hiding this comment.
escapeJsonString is formatted with tab indentation, while most of the codebase uses spaces (e.g., other .cpp files in src/). Please reformat this function to avoid mixed whitespace and keep style consistent.
| std::string output; | |
| output.reserve(input.size()); | |
| for(size_t i = 0; i < input.size(); i++) { | |
| const unsigned char c = static_cast<unsigned char>(input[i]); | |
| switch(c) { | |
| case '"': | |
| output += "\\\""; | |
| break; | |
| case '\\': | |
| output += "\\\\"; | |
| break; | |
| case '\b': | |
| output += "\\b"; | |
| break; | |
| case '\f': | |
| output += "\\f"; | |
| break; | |
| case '\n': | |
| output += "\\n"; | |
| break; | |
| case '\r': | |
| output += "\\r"; | |
| break; | |
| case '\t': | |
| output += "\\t"; | |
| break; | |
| default: | |
| if(c < 0x20) { | |
| const char* hex = "0123456789abcdef"; | |
| output += "\\u00"; | |
| output += hex[(c >> 4) & 0x0F]; | |
| output += hex[c & 0x0F]; | |
| } else { | |
| output += static_cast<char>(c); | |
| } | |
| break; | |
| } | |
| } | |
| return output; | |
| std::string output; | |
| output.reserve(input.size()); | |
| for(size_t i = 0; i < input.size(); i++) { | |
| const unsigned char c = static_cast<unsigned char>(input[i]); | |
| switch(c) { | |
| case '"': | |
| output += "\\\""; | |
| break; | |
| case '\\': | |
| output += "\\\\"; | |
| break; | |
| case '\b': | |
| output += "\\b"; | |
| break; | |
| case '\f': | |
| output += "\\f"; | |
| break; | |
| case '\n': | |
| output += "\\n"; | |
| break; | |
| case '\r': | |
| output += "\\r"; | |
| break; | |
| case '\t': | |
| output += "\\t"; | |
| break; | |
| default: | |
| if(c < 0x20) { | |
| const char* hex = "0123456789abcdef"; | |
| output += "\\u00"; | |
| output += hex[(c >> 4) & 0x0F]; | |
| output += hex[c & 0x0F]; | |
| } else { | |
| output += static_cast<char>(c); | |
| } | |
| break; | |
| } | |
| } | |
| return output; |
| mFile << "\t\"command\":\"" << escapeJsonString(command) << "\"," << endl; | ||
| mFile << "\t\"version\":\"" << FUSIONSCAN_VER << "\"," << endl; | ||
| mFile << "\t\"time\":\"" << getCurrentSystemTime() << "\"," << endl; | ||
| mFile << "\t\"time\":\"" << escapeJsonString(getCurrentSystemTime()) << "\"," << endl; |
There was a problem hiding this comment.
escapeJsonString(...) allocates a new std::string for every call. Since JsonReporter::run() calls it inside nested loops (per fusion and then per read via printReadToJson), this can add overhead for large JSON reports. Consider adding a helper that writes escaped content directly to the stream (or reuses a buffer) to avoid repeated allocations/copies.
Escape special characters properly in the jsonreporter.cpp output.