Add support for escaping OsStr#9
Conversation
Signed-off-by: Aalekh Patel <aalekh.gwpeck.7998@icloud.com>
Signed-off-by: Aalekh Patel <aalekh.gwpeck.7998@icloud.com>
Signed-off-by: Aalekh Patel <aalekh.gwpeck.7998@icloud.com>
Signed-off-by: Aalekh Patel <aalekh.gwpeck.7998@icloud.com>
| if cfg!(unix) || env::var("MSYSTEM").is_ok() { | ||
| unix::escape_os_str(s) | ||
| } else { | ||
| unimplemented!("windows::escape_os_str") |
There was a problem hiding this comment.
I don't think this can land without an implementation here.
There was a problem hiding this comment.
Yes, I think you can use std::os::windows::ffi::OsStrExt::encode_wide here to do the escaping then convert it back using std::os::windows::ffi::OsStringExt::from_wide.
There was a problem hiding this comment.
@NobodyXu I've added a first draft of an implementation of escape_os_str for Windows. Whenever you can find the time, I'd really appreciate a review and some assistance with a test case for test_escape_os_str_from_bytes.
Signed-off-by: Aalekh Patel <aalekh.gwpeck.7998@icloud.com>
Signed-off-by: Aalekh Patel <aalekh.gwpeck.7998@icloud.com>
Signed-off-by: Aalekh Patel <aalekh.gwpeck.7998@icloud.com>
| /// assert!(!needs_escape(b'\\' as u16)); | ||
| /// ``` | ||
| pub fn needs_escape(wide_byte: u16) -> bool { | ||
| let (high, low) = ((wide_byte >> 8) as u8, (wide_byte & 0xFF) as u8); |
There was a problem hiding this comment.
Can we use char::from_u16 here instead?
| /// ``` | ||
| /// [msdn]: http://blogs.msdn.com/b/twistylittlepassagesallalike/archive/2011/04/23/everyone-quotes-arguments-the-wrong-way.aspx | ||
| pub fn escape_os_str(s: &OsStr) -> Cow<'_, OsStr> { | ||
| let encoded: Vec<u16> = s.encode_wide().collect(); |
There was a problem hiding this comment.
| let encoded: Vec<u16> = s.encode_wide().collect(); | |
| let encoded: Vec<u16> = s.encode_wide(); |
Let's avoid this collect and instead clone the iterator.
| let mut chars = encoded.into_iter().peekable(); | ||
| loop { | ||
| let mut nslashes = 0; | ||
| while let Some(&c) = chars.peek() { |
There was a problem hiding this comment.
You could use Peekable::next_if_eq here
| break; | ||
| } | ||
| } | ||
| match chars.next() { |
There was a problem hiding this comment.
You could rewrite this into a for loop and execute the None branch after the for loop.
There was a problem hiding this comment.
hmm. Do you think this refactoring improves readability significantly?
Also, aren't we iterating at a dynamic rate (i.e. we skip through runs of \\s but process non-backslashes one character at a time? I could be wrong but I feel like this "run-length encoding" kind of implementation is probably more readable as it stands.
There was a problem hiding this comment.
Oh I just realize that you cannot refactor into a for loop due to use of peekable.
|
Thanks for the quick review. I've applied some suggested changes and finished the tests. lmk what you think. |
| /// [msdn]: http://blogs.msdn.com/b/twistylittlepassagesallalike/archive/2011/04/23/everyone-quotes-arguments-the-wrong-way.aspx | ||
| pub fn escape(s: Cow<str>) -> Cow<str> { | ||
| let needs_escape = s.chars().any(|c| matches!(c, '"' | '\t' | '\n' | ' ')); | ||
| if s.is_empty() || !needs_escape { |
There was a problem hiding this comment.
I'm a little bit confused on why we should for s.is_empty() || !needs_escape here in windows, but check for !s.is_empty() && needs_escape on unix.
There was a problem hiding this comment.
I think they should be functionally equivalent but I've refactored them to look pretty similar now. lmk what you think.
|
|
||
| use std::borrow::Cow; | ||
| use std::env; | ||
| #[cfg(unix)] |
There was a problem hiding this comment.
Making these modules conditionally compiled is a breaking change.
| let encoded = s.encode_wide(); | ||
| let needs_escaping = encoded.clone().any(needs_escape); | ||
|
|
||
| if s.is_empty() || !needs_escaping { |
There was a problem hiding this comment.
There's a typo
| if s.is_empty() || !needs_escaping { | |
| if !s.is_empty() && !needs_escaping { |
Hi, thanks for this project!
A recent PR in
openssh-rust/opensshprovides an implementation forescape(&OsStr) -> Cow<'_, OsStr>that I think could be a good fit here.It was suggested in a code review that it is preferable to use
shell_escapefor all the escaping needs rather than rolling out a custom implementation. However, currenlyshell_escapeonly works withCow<str>. This PR is an attempt to extend the escaping toOsStrs as well.iiuc #5 is a relevant issue that we may be able to resolve with this PR.
Let me know if you have any suggestions/comments about these changes and I'll be happy to improve them accordingly.