Rewrite integer formatting by brendanzab · Pull Request #12382 · rust-lang/rust

brendanzab · 2014-02-19T00:37:27Z

This is PR is the beginning of a complete rewrite and ultimate removal of the std::num::strconv module (see #6220), and the removal of the ToStrRadix trait in favour of using the std::fmt functionality directly. This should make for a cleaner API, encourage less allocation, and make the implementation more comprehensible .

The Formatter::{pad_integral, with_padding} methods have also been refactored make things easier to understand.

The formatting tests for integers have been moved out of run-pass/ifmt.rs in order to provide more immediate feedback when building using make check-stage2-std NO_REBUILD=1.

Arbitrary radixes are now easier to use in format strings. For example:

assert_eq!(format!("{:04}", radix(3, 2)), ~"0011");

The benchmarks have been standardised between std::num::strconv and std::num::fmt to make it easier to compare the performance of the different implementations.

 type | radix | std::num::strconv      | std::num::fmt
======|=======|========================|======================
 int  | bin   | 1748 ns/iter (+/- 150) | 321 ns/iter (+/- 25)
 int  | oct   |  706 ns/iter (+/- 53)  | 179 ns/iter (+/- 22)
 int  | dec   |  640 ns/iter (+/- 59)  | 207 ns/iter (+/- 10)
 int  | hex   |  637 ns/iter (+/- 77)  | 205 ns/iter (+/- 19)
 int  | 36    |  446 ns/iter (+/- 30)  | 309 ns/iter (+/- 20)
------|-------|------------------------|----------------------
 uint | bin   | 1724 ns/iter (+/- 159) | 322 ns/iter (+/- 13)
 uint | oct   |  663 ns/iter (+/- 25)  | 175 ns/iter (+/- 7)
 uint | dec   |  613 ns/iter (+/- 30)  | 186 ns/iter (+/- 6)
 uint | hex   |  519 ns/iter (+/- 44)  | 207 ns/iter (+/- 20)
 uint | 36    |  418 ns/iter (+/- 16)  | 308 ns/iter (+/- 32)

huonw · 2014-02-19T00:39:38Z

src/doc/complement-cheatsheet.md

brendanzab · 2014-02-19T00:43:27Z

r? @alexcrichton

huonw · 2014-02-19T00:44:02Z

src/libstd/num/fmt.rs

Neat trick!

However, it doesn't seem to support arbitrary bases. What about these being normal methods fn base(&self) -> u8 (etc.) and having

pub struct Binary; impl Radix for Binary { fn base(&self) -> u8 { 2 } /* ... */ } pub struct Octal; // ... pub struct GenericRadix { priv base: u8 } impl Radix for GenerixRadix { fn radix(&self) { self.base } /* ... */ }

which would give the best of both worlds: static dispatch for the common cases but still enough flexibility to handle arbitrary bases. (And also allows for reducing code-bloat via choosing to use the GenericRadix version over monomorphising for the others, similar to a &Radix trait object, but without the extra cost of virtual calls.)

Hey that's a cool idea. Will do.

brson · 2014-02-19T02:09:01Z

Sweet perf wins. 🚀

zkamsler · 2014-02-19T03:27:45Z

src/libstd/num/fmt.rs

It may be a bit faster if you accumulate the digits backwards starting at the end of the buffer. That way you only need to make one write call to the underlying Writer.

WOAH. Mega improvement. Writing an int with a decimal radix now takes 204 ns, down from 640 ns with strconv.

Awesome! (Remember to update the PR/commit messages before it lands :) )

zkamsler · 2014-02-19T14:24:17Z

Cool stuff. It reminds me that I ought to dust off the floating point formatting code that I had written a few weeks ago.

brendanzab · 2014-02-19T16:39:14Z

Yeah, I'm looking at the Dragon4 (Steele) and Grisu3 (Loitsch) algorithms when it comes to formatting floating point. But you're welcome to look at it too. It's much more complicated than the integer stuff :)

brendanzab · 2014-02-19T16:41:04Z

@alexcrichton Ok, I think I'm done.

alexcrichton · 2014-02-19T17:34:17Z

src/libstd/fmt/mod.rs

This comment seems copied from below, there seems to be no check against the "maximum width" which I think in the case of integers is precision.

alexcrichton · 2014-02-19T18:03:50Z

This is super awesome. Perf wins, cleanups, hurrah!

I'd like to bikeshed the interface for a brief amount of time if that's ok. Some thoughts I have:

num::fmt => fmt::num?
These Radix structs have a bit of an odd api. It may be nicer to have clearly defined top-level functions inside of the module which use the Radix structures/traits internally. Basically it would be nice to avoid exposing one more trait.

What do you think?

brendanzab · 2014-02-19T22:02:05Z

Ok, updated the API and moved it to std::fmt::num.

You can now do:

assert_eq!(format!("{:04}", radix(3, 2)), ~"0011");

I haven't made the other things private, because I still think its useful to be able to write the numbers via a Formatter directly.

alexcrichton · 2014-02-19T23:51:37Z

src/libstd/fmt/mod.rs

This use may need to be updated. I think it'd be useful to show it in the documentation as well.

Could you also expand the documentation a bit about how the value can be passed to format! to format the specified integer with the specified base?

huonw reviewed Feb 19, 2014
View reviewed changes

src/doc/complement-cheatsheet.md Outdated

Copy link

Contributor

huonw Feb 19, 2014

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This text?

huonw reviewed Feb 19, 2014
View reviewed changes

zkamsler reviewed Feb 19, 2014
View reviewed changes

alexcrichton reviewed Feb 19, 2014
View reviewed changes

Uh oh!

Conversation

brendanzab commented Feb 19, 2014

Uh oh!

Choose a reason for hiding this comment

Uh oh!

brendanzab commented Feb 19, 2014

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

brson commented Feb 19, 2014

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

zkamsler commented Feb 19, 2014

Uh oh!

brendanzab commented Feb 19, 2014

Uh oh!

brendanzab commented Feb 19, 2014

Uh oh!

Choose a reason for hiding this comment

Uh oh!

alexcrichton commented Feb 19, 2014

Uh oh!

brendanzab commented Feb 19, 2014

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants