Added Unicode Support via utf-8 specification #60

Open
KennethNero wants to merge 2 commits into maxuser0:main from KennethNero:unicode-fix

@KennethNero

Fix Unicode encoding issues in NBT data transmission to Python

Summary

Fixes incorrect encoding when transmitting NBT data containing Unicode characters from Java to the Python subprocess, which caused mojibake (e.g., ⚎ appearing as âšŽ).

Problem

Fixes #59

When SubprocessTask writes JSON data containing Unicode characters to Python's stdin, it uses OutputStreamWriter without specifying a charset, so the JVM's default charset is used. On Windows, this is typically Windows-1252 (CP1252) rather than UTF-8, so Unicode characters in book NBT data are corrupted.

Example:

  • Expected: ⚎⚎⚎ (U+268E)
  • Actual: âšŽâšŽâšŽ (UTF-8 bytes E2 9A 8E misinterpreted as Windows-1252)
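
The corruption can be reproduced in isolation. The following is a minimal sketch, not code from this PR: it encodes U+268E (⚎, the character shown in the example) as UTF-8 and decodes the resulting bytes as Windows-1252. The class and method names are hypothetical, and the characters are written as \u escapes so that the result does not depend on the source file's encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of SubprocessTask.
public class MojibakeDemo {
    // Encode text as UTF-8, then decode those bytes with a different charset.
    static String misdecode(String text, Charset wrong) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, wrong);
    }

    public static void main(String[] args) {
        Charset cp1252 = Charset.forName("windows-1252");
        String original = "\u268E"; // the character from the example
        String garbled = misdecode(original, cp1252);
        // Print code points rather than glyphs to avoid console encoding issues.
        garbled.chars().forEach(c -> System.out.printf("U+%04X%n", c));
    }
}
```

The single character becomes three (U+00E2, U+0161, U+017D), one per UTF-8 byte, which is the pattern reported above.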

Solution

Explicitly specify StandardCharsets.UTF_8 when creating:

  1. OutputStreamWriter for subprocess stdin (writing to Python)
  2. InputStreamReader for subprocess stdout (reading from Python)
  3. InputStreamReader for subprocess stderr (reading from Python)
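
As a rough illustration of the three points above, the sketch below wraps streams with an explicit charset. It is an assumption-laden stand-in rather than the actual SubprocessTask code: the helper names (utf8Writer, utf8Reader, roundTrip) are hypothetical, the buffering wrappers are a guess at typical usage, and in-memory byte streams take the place of process.getOutputStream() and process.getInputStream() so that the example runs without spawning Python.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; not part of the Minescript code base.
public class Utf8Streams {
    // Writer for a subprocess's stdin, always encoding as UTF-8.
    static BufferedWriter utf8Writer(OutputStream out) {
        return new BufferedWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8));
    }

    // Reader for a subprocess's stdout or stderr, always decoding as UTF-8.
    static BufferedReader utf8Reader(InputStream in) {
        return new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    }

    // Write one line through utf8Writer and read it back through utf8Reader.
    static String roundTrip(String line) {
        try {
            ByteArrayOutputStream pipe = new ByteArrayOutputStream();
            try (BufferedWriter writer = utf8Writer(pipe)) {
                writer.write(line);
                writer.newLine();
            }
            try (BufferedReader reader =
                    utf8Reader(new ByteArrayInputStream(pipe.toByteArray()))) {
                return reader.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String json = "{\"title\":\"\u268E\u268E\u268E\"}";
        System.out.println(roundTrip(json).equals(json));
    }
}
```

Because both ends name the charset explicitly, the result no longer depends on the JVM's default charset.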

Changes

  • common/src/main/java/net/minescript/common/SubprocessTask.java:
    • Added import java.nio.charset.StandardCharsets;
    • Modified line 48: new OutputStreamWriter(process.getOutputStream(), StandardCharsets.UTF_8)
    • Modified line 54: new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8)
    • Modified line 87: new InputStreamReader(process.getErrorStream(), StandardCharsets.UTF_8)

With these changes, Unicode characters in NBT data are transmitted correctly to Python scripts.

Additional Notes

This is a platform-specific bug that primarily affects Windows users, where the JVM's default charset is often not UTF-8. Linux and macOS users were typically unaffected because their default is usually already UTF-8.
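
To check whether a given setup is exposed to this bug, one option is to print the charset the JVM falls back to when none is specified. This is a small diagnostic sketch with a hypothetical class name; the value it reports depends on the JVM version, the OS locale, and the file.encoding system property.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical diagnostic; not part of the Minescript code base.
public class DefaultCharsetCheck {
    // True when readers and writers created without a charset would use UTF-8.
    static boolean defaultIsUtf8() {
        return Charset.defaultCharset().equals(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println("Default charset: " + Charset.defaultCharset());
        System.out.println("file.encoding:   " + System.getProperty("file.encoding"));
        System.out.println("UTF-8 by default: " + defaultIsUtf8());
    }
}
```

If the last line prints false, streams created without an explicit charset would use something other than UTF-8 on that machine.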

@KennethNero KennethNero marked this pull request as draft February 23, 2026 03:10
@KennethNero
Author

KennethNero commented Feb 23, 2026

After thorough testing, I also modified minescript_runtime.py so that the UTF-8 encoding set on the Java side is not undone by Python's own interpretation of the streams.

@KennethNero KennethNero marked this pull request as ready for review February 23, 2026 03:37

Development

Successfully merging this pull request may close these issues.

NBT Encoding Issue: Unicode Characters in Written Books