Skip to content

Use multibyte character path for bundle_probe#44466

Merged
janvorli merged 1 commit into
dotnet:masterfrom
am11:feature/cleanups/purge-codecvt
Nov 19, 2020
Merged

Use multibyte character path for bundle_probe#44466
janvorli merged 1 commit into
dotnet:masterfrom
am11:feature/cleanups/purge-codecvt

Conversation

@am11
Copy link
Copy Markdown
Member

@am11 am11 commented Nov 10, 2020

Fixes #44098.

@am11
Copy link
Copy Markdown
Member Author

am11 commented Nov 10, 2020

cc @janvorli

Waiting on a clean CI run. Checkout leg is failing: #44472.

Comment thread src/coreclr/src/dlls/mscoree/unixinterface.cpp Outdated
@ghost
Copy link
Copy Markdown

ghost commented Nov 10, 2020

Tagging subscribers to this area: @vitek-karas, @agocke
See info in area-owners.md if you want to be subscribed.

Details


Issue meta data
Issue content: Fixes #44098.
Issue author: am11
Assignees: -
Milestone: -

@am11 am11 marked this pull request as ready for review November 11, 2020 06:24
@am11 am11 requested a review from janvorli November 11, 2020 06:24
Comment thread src/coreclr/src/vm/bundle.cpp Outdated
Comment thread src/coreclr/src/vm/bundle.cpp Outdated
Comment thread src/coreclr/src/vm/bundle.cpp Outdated
@dotnet dotnet deleted a comment from janvorli Nov 11, 2020
Comment thread src/coreclr/src/vm/bundle.cpp Outdated
Copy link
Copy Markdown
Member

@vitek-karas vitek-karas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - let's wait for @janvorli to take a second look

@am11
Copy link
Copy Markdown
Member Author

am11 commented Nov 11, 2020

Internal to CoreCLR; SString has a concept of internal representation, so if the string was originally UTF8, it will be preserved and GetUTF8 will return it as is (without conversion). I was trying to keep the diff smaller, as it is a deep rabbit hole to update all instances; but there seems to be few places where we have both wide and narrow variants of the same string available at the call sites of bundle probes. If it deemed profitable, we can update those in a separate PR by using narrow variant when constructing SString for probe call.

Comment thread src/coreclr/src/vm/bundle.cpp Outdated
Comment thread src/coreclr/src/vm/bundle.cpp Outdated
Comment thread src/coreclr/src/vm/bundle.cpp Outdated
Comment thread src/coreclr/src/vm/bundle.cpp
Copy link
Copy Markdown
Member

@janvorli janvorli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you!

Comment thread src/coreclr/src/vm/bundle.cpp
@janvorli
Copy link
Copy Markdown
Member

The win x86 leg was failing due to something that looks like a bug in xunit (xunit code modifying a collection while it was being iterated). I am re-running this leg.

Copy link
Copy Markdown
Member

@elinor-fung elinor-fung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also make sure the single-file doc in dotnet/designs reflects this change.

Edit: Actually, apparently the design doc does have char, but the implementation didn't match.

@am11
Copy link
Copy Markdown
Member Author

am11 commented Nov 13, 2020

When runtime repo itself is cloned in a path containing multibyte character, I found two issues.

Steps to repro (using three-bytes characters):

$ git clone https://github.com/dotnet/runtime runtime-ⒾⓇⓇⒺⒼⓊⓁⒶⓇ␣ⓉⒺⓍⓉ
$ cd runtime-ⒾⓇⓇⒺⒼⓊⓁⒶⓇ␣ⓉⒺⓍⓉ
$ ./build.sh -c Release
$ ./build.sh Host.Tests -c Release -test
  1. First the build errors, that were trivial to fix:
    --- a/src/coreclr/src/scripts/genDummyProvider.py
    +++ b/src/coreclr/src/scripts/genDummyProvider.py
    @@ -111,7 +111,7 @@ def generateDummyFiles(etwmanifest, out_dirname, extern, dryRun):
            providerName = trimProvName(providerNode.getAttribute('name'))
            providerName_File = escapeProvFilename(providerName)
    
    -        dummyevntprov = os.path.join(out_dirname, dummyevntprovPre + providerName_File + ".cpp")
    +        dummyevntprov = os.path.join(out_dirname.decode('utf-8'), dummyevntprovPre + providerName_File + ".cpp").encode('utf-8')
    
            if dryRun:
                print(dummyevntprov)
    diff --git a/src/coreclr/src/scripts/genEventPipe.py b/src/coreclr/src/scripts/genEventPipe.py
    index d8c6cdca24f..54aa704148d 100644
    --- a/src/coreclr/src/scripts/genEventPipe.py
    +++ b/src/coreclr/src/scripts/genEventPipe.py
    @@ -384,7 +384,7 @@ def generateEventPipeImplFiles(
            providerName_File = providerPrettyName.replace('-', '')
            providerName_File = providerName_File.lower()
            providerPrettyName = providerPrettyName.replace('-', '_')
    -        eventpipefile = os.path.join(eventpipe_directory, providerName_File + ".cpp")
    +        eventpipefile = os.path.join(eventpipe_directory.decode('utf-8'), providerName_File + ".cpp").encode('utf-8')
            if dryRun:
                print(eventpipefile)
            else:
  2. ToLower() in C# is using ICU transofrmation, so HostFxrPath.ToLower() in this case becomes: /users/am11/projects/regular_text-ⓘⓡⓡⓔⓖⓤⓛⓐⓡ␣ⓣⓔⓧⓣ/runtime_pr/artifacts/tests/release/ha/nativehosting/0/valid/dotnet/host/fxr/2.3.0/libhostfxr.dylib, while C++ transformation is ASCII only, so it becomes /users/am11/projects/regular_text-ⒾⓇⓇⒺⒼⓊⓁⒶⓇ␣ⓉⒺⓍⓉ/runtime_pr/artifacts/tests/release/ha/nativehosting/0/valid/dotnet/host/fxr/2.3.0/libhostfxr.dylib and fails the Assert.Contains instances in Nethost.cs. The fix for this was to match C++, ASCII only tolower transformation: http://sprunge.us/UPR16s.

With these two patches, full runtime repo build succeeded and all installer tests passed on macOS. It seems like there are other ToLower() cases in MSBuild scripts (which also uses .NET's ICU-backed transformation), but those are unrelated to PR changes and can be repro'd with the current master branch (and probably ToLower() transformation for hostfxr). Perhaps we should try to get rid of ToLower transformation entirely and rely on underlying filesystem's case-sensitivity choices; i.e. don't care about the casing explicitly in code like the other components of runtime repo.

@janvorli janvorli merged commit 0897d4a into dotnet:master Nov 19, 2020
@am11 am11 deleted the feature/cleanups/purge-codecvt branch November 19, 2020 21:05
@vitek-karas
Copy link
Copy Markdown
Member

Thanks a lot @am11 !

ThadHouse pushed a commit to ThadHouse/runtime that referenced this pull request Nov 20, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 28, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Remove use of codecvt in PAL

5 participants