Can not get rid of BOM in powershell piping to external program #780

@lGSMl

Hi, we use the SDK image for some CI-related tasks and need to pipe the output of a pwsh cmdlet to our internal tool. For some reason the text arriving through the pipe is encoded with a prepended BOM, and I cannot get rid of it in the image's pwsh no matter what I try.

Example scenario (PowerShell host information, the test program that dumps stdin, and the console session):

Name             : ConsoleHost
Version          : 5.1.19041.1023
InstanceId       : 4d5f918b-c5d0-432b-a880-2a9b5a0f1028
UI               : System.Management.Automation.Internal.Host.InternalHostUserInterface
CurrentCulture   : en-US
CurrentUICulture : en-US
PrivateData      : Microsoft.PowerShell.ConsoleHost+ConsoleColorProxy
DebuggerEnabled  : True
IsRunspacePushed : False
Runspace         : System.Management.Automation.Runspaces.LocalRunspace
using System;
using System.Text;
using System.Threading.Tasks;

class Program
{
    static async Task Main(string[] args)
    {
        byte[] nextLineBuffer = new byte[2048];

        await using var stdin = Console.OpenStandardInput();
        int bytes;
        // Read raw bytes from stdin and dump each chunk as hex and as text in three encodings.
        while ((bytes = await stdin.ReadAsync(nextLineBuffer.AsMemory(0, nextLineBuffer.Length))) > 0)
        {
            // Only the bytes actually read in this iteration.
            var slicedBuffer = nextLineBuffer[..bytes];
            var hexedString = BitConverter.ToString(slicedBuffer);

            await Console.Out.WriteLineAsync($"Stdin hex: {hexedString}");

            Console.OutputEncoding = Encoding.ASCII;
            await Console.Out.WriteLineAsync($"Stdin parsed as ASCII: {Encoding.ASCII.GetString(slicedBuffer)}");

            Console.OutputEncoding = Encoding.UTF8;
            await Console.Out.WriteLineAsync($"Stdin parsed as UTF-8: {Encoding.UTF8.GetString(slicedBuffer)}");

            // Encoding.Unicode is UTF-16 little-endian.
            Console.OutputEncoding = Encoding.Unicode;
            await Console.Out.WriteLineAsync($"Stdin parsed as Unicode: {Encoding.Unicode.GetString(slicedBuffer)}");
        }
    }
}
PS C:\tmp> echo "test" | .\bin\Debug\net5.0\pwsh5-bom-tests.exe
Stdin hex: EF-BB-BF-74-65-73-74-0D-0A
Stdin parsed as ASCII: ???test

Stdin parsed as UTF-8: test

Stdin parsed as Unicode: 믯璿獥൴�
PS C:\tmp>  & { $OutputEncoding = [Text.Utf8Encoding]::new($false); echo "test" | .\bin\Debug\net5.0\pwsh5-bom-tests.exe }
Stdin hex: EF-BB-BF-74-65-73-74-0D-0A
Stdin parsed as ASCII: ???test

Stdin parsed as UTF-8: test

Stdin parsed as Unicode: 믯璿獥൴�

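I tried modifying $OutputEncoding, as in the second command above, but it had no effect inside the image, although the same change works in my local Windows pwsh. In both runs the hex dump still starts with EF-BB-BF, the UTF-8 BOM.

Any idea how I can avoid the BOM when piping?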

image: mcr.microsoft.com/dotnet/framework/sdk:4.8
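
As a stopgap on the consumer side, and assuming the internal tool's source can be changed, the leading BOM could be dropped after reading stdin. This is a minimal sketch and not a fix for the PowerShell behavior itself; `BomFilter` and `StripUtf8Bom` are hypothetical names that are not part of the test program above:

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration; not part of the original test program.
public static class BomFilter
{
    // Returns a copy of the data without a leading UTF-8 BOM (EF BB BF).
    // If the data does not start with the BOM, it is returned unchanged.
    public static byte[] StripUtf8Bom(byte[] data)
    {
        // Encoding.UTF8.GetPreamble() yields the three BOM bytes EF BB BF.
        byte[] bom = Encoding.UTF8.GetPreamble();

        if (data.Length < bom.Length)
        {
            return data;
        }

        for (int i = 0; i < bom.Length; i++)
        {
            if (data[i] != bom[i])
            {
                return data;
            }
        }

        // Copy everything after the BOM into a new array.
        return data.AsSpan(bom.Length).ToArray();
    }
}
```

In the test program this would be applied only to the first chunk read from stdin, for example `BomFilter.StripUtf8Bom(nextLineBuffer[..bytes])`. The sketch assumes that the first chunk contains all three BOM bytes, which matches the hex dumps above but is not guaranteed for arbitrary streams.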
