printf: Consistently handle negative widths/precision#7620
printf: Consistently handle negative widths/precision#7620sylvestre merged 3 commits intouutils:mainfrom
Conversation
528d1a2 to
f56ad61
Compare
|
GNU testsuite comparison: |
f56ad61 to
43310fe
Compare
|
GNU testsuite comparison: |
| }; | ||
| match next { | ||
| FormatArgument::SignedInt(n) if !n.is_negative() => *n, | ||
| FormatArgument::Unparsed(s) if !s.starts_with('-') => { |
There was a problem hiding this comment.
!s.starts_with('-') Can the string start with spaces? I think you should probably parse first...
Actually, you might be better off moving that logic to resolve_asterisk_precision, I feel it'd be easier to handle there.
There was a problem hiding this comment.
Oh I see, you even want to handle negative values < i64::MIN... Still need to figure out how to handle leading white spaces though (and please add a test for that ,-P)
There was a problem hiding this comment.
Well, GNU coreutils doesn't handle such large values, so maybe we don't have to either:
env printf "|%.*d|" " -7777777777777777777777777" "10"
|printf: ‘ -7777777777777777777777777’: Numerical result out of range
10|
There was a problem hiding this comment.
Interesting, I didn't realize GNU printf gave that error for significantly negative precision's - actually, it seems like I can remove get_i64_non_negative and still handle the case in the GNU tests with a precision of -2147483649.
Ignoring whitespace is unexpected for me, I don't think that's universal or documented in man printf? Either way, this is now tested.
There was a problem hiding this comment.
Ignoring whitespace is unexpected for me, I don't think that's universal or documented in
man printf? Either way, this is now tested.
Yeah, there are a lot of undocumented corner cases, especially when it comes to parsing/formatting. I'm still new here, but I think the idea is to be as compatible as possible with GNU coreutils (unless GNU behavior cannot really be justified as "correct"), that means reading the manual, but also feeding creative inputs to the GNU implementation and seeing how it behaves.
Thanks for fixing and testing that.
| .succeeds() | ||
| .stdout_only("|0|"); | ||
|
|
||
| new_ucmd!() |
There was a problem hiding this comment.
Mind adding a comment here that explains that negative widths get aligned to the left? It's a bit tricky to see.
There was a problem hiding this comment.
Sure, just pushed (also bumped the width to 3 to make it a little more visible
|
GNU testsuite comparison: |
| let s = args.get_str(); | ||
| let mut parsed = Vec::new(); | ||
| for c in parse_escape_only(s.as_bytes(), OctalParsing::default()) { | ||
| for c in parse_escape_only(s.as_bytes(), OctalParsing::ThreeDigits) { |
There was a problem hiding this comment.
Just to make sure, this doesn't break stuff like this right?
env printf "%b" "\\05llo" | hexdump -C
00000000 05 6c 6c 6f |.llo|
env printf "%b" "\\5llo" | hexdump -C
00000000 05 6c 6c 6f |.llo|
There was a problem hiding this comment.
It doesn't break it (this is with GNU coreutils 9.6 in /bin):
$ ./target/debug/printf '%b' "\\05llo"|od -c
0000000 005 l l o
0000004
$ ./target/debug/printf '%b' "\\5llo"|od -c
0000000 005 l l o
0000004
$ /bin/printf '%b' "\\05llo"|od -c
0000000 005 l l o
0000004
$ /bin/printf '%b' "\\5llo"|od -c
0000000 005 l l o
0000004I added some tests to show that octal characters are still accepted at different lengths (1-3 numbers normally and 1-4 with leading zero for %b)
There was a problem hiding this comment.
please avoid screenshot.
They are terrible for accessibility and search
There was a problem hiding this comment.
Thank you for the reminder, I replaced the screenshot
|
I think this is mostly ok (just one small question on |
|
GNU testsuite comparison: |
|
LGTM thanks! I don't think the cargo clippy issues are your fault... (#7640) |
|
Thanks for the reviews! Yeah, although I guess the clippy warnings are a good way to find out there's new rust version :) |
|
GNU testsuite comparison: |
impressive, bravo! |
| if let Some((0, _)) = chars.peek() { | ||
| if integral_only { | ||
| return Err(ExtendedParserError::NotNumeric); | ||
| return if integral_only { |
There was a problem hiding this comment.
Sweet
this might be a new clippy warning
cc @samueltardieu ;)
There was a problem hiding this comment.
Didn't redundant_else_block trigger on the original code (this covers if { …; return …; } else { … })?
Concerning the lint you're proposing, this could be a new lint indeed, see rust-lang/rust-clippy#14545 (I haven't found an existing proposal, maybe this is a duplicate).
There was a problem hiding this comment.
My IDE actually warned about it, I didn't know it wasn't already in clippy: https://www.jetbrains.com/help/inspectopedia/RsLift.html#locating-this-inspection
| } | ||
| } | ||
|
|
||
| fn resolve_asterisk_precision<'a>( |
There was a problem hiding this comment.
would be nice to add a comment explaining what this function is doing
and write unit tests (if not too complex)
There was a problem hiding this comment.
Sure thing, added doc and unit tests. Thanks for the review!
Also allows character constants with " instead of ', and for interpolated values with %b to use \0XXX notation for octal bytes
|
GNU testsuite comparison: |
This pull request fixes the tests/printf/printf.sh tests. The main discrepancies between the uutil's implementation and GNU's appear to be the handling of negative widths or precisions when provided for
%*.*- this was implemented for strings, but this PR brings other format specifications to also handle negative widths/precisions.This PR also allows character constants with " instead of '. I had trouble finding any documentation about this; as far as I can tell GNU printf treats " and ' the same in the situations tested.
There is also a bug fix for interpolated values with %b to use \0XXX notation for octal bytes - this comes in for something like
printf '\0001', which should be the null byte and an ASCII '1', vsprintf '%b' '\0001', which is the single 0x01 byte because interpolated values use \0XXX formatted octal bytes.