Skip to content

Fix handling of escape codes in paths#118

Merged
ahl merged 4 commits into
mainfrom
escapism
Jun 23, 2021
Merged

Fix handling of escape codes in paths#118
ahl merged 4 commits into
mainfrom
escapism

Conversation

@ahl

@ahl ahl commented Jun 22, 2021

Copy link
Copy Markdown
Collaborator

Previously we haven't had any special handling for HTTP path escape codes. This implicitly required consumers to do all their own decoding... which they surely were not. This PR adds proper handling for input paths i.e. paths as the results of a query, and it enforces stricter requirements for paths specified by consumers for endpoints. For example, we strip out "//" for a query, but panic!() if a consumer were to specify a path that contained "//" (note we also panic if a consumer tries to register two handlers for the same path).

@smklein smklein left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of those patches where I really appreciate having all the tests - thank you!

Looks good, just a handful of comments.

Comment thread dropshot/src/api_description.rs Outdated
Comment thread dropshot/src/api_description.rs Outdated
Comment thread dropshot/src/router.rs Outdated
Comment thread dropshot/src/router.rs Outdated
Comment thread dropshot/src/router.rs
* We use this type to avoid confusion with paths used to define routes.
*/
#[derive(Debug)]
pub struct InputPath<'a>(&'a str);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no issue with this current use of typing, but I feel like I've usually seen the strong types used as an "output" of parsing functions, rather than input. IMO that seems more useful, since it propagates the notion that "this &str has been validated", whereas I think InputPath is currently truly indistinguishable from an arbitrary &str.

(I'm kinda eyeing route_path_to_segments and input_path_to_segments below - it does seem like both could take an arbitrary &str as input, but the results kinda have different implications)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong feeling about what would be most idiomatic here. I wanted to "quarantine" the input type to avoid accidental use of it. I agree that it's all a little meh. I'd love any input regarding how to improve it.

@smklein smklein Jun 23, 2021

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanna give the caveat - I think this patch is decidedly a net improvement, so this comment shouldn't be a blocker!

The reason I'm a little skeptical of the current usage is that I don't think the quarantining is actually happening very clearly (aside from requiring a bunch of .into() calls). Any function that operates on an InputPath may as well operate on a &str, since there is (seemingly?) no check or consideration to distinguish them.

IMO, the path that's the result of a "user input" actually is a str. If we want to distinguish that from a route-defining path, which has a higher burden of validation, perhaps that is the type we should abstract more cautiously.

As an example, route_path_to_segments returns a Vec<&str> - but it implicitly carries the implication that "this is actually a non-empty &str type, without / characters.

We could make that more explicit by returning a Vec<Segment>, where Segment is a &str, but it can be only constructed post-validation.

Similarly, input_path_to_segments could return a Vec<Segment>, to indicate that it represents a percent-decoded, non "./.." str.

(This actually raises the question - should these functions actually be returning distinct Segment types? They appear to be performing slightly different validations)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW: I think this pattern is generally the one Cliff calls "the typestate pattern".

@ahl

ahl commented Jun 23, 2021

Copy link
Copy Markdown
Collaborator Author

@smklein thanks for the thorough review. I've pushed a new commit that I think addresses your feedback.

@smklein

smklein commented Jun 23, 2021

Copy link
Copy Markdown
Contributor

@smklein thanks for the thorough review. I've pushed a new commit that I think addresses your feedback.

One last comment on the InputPath type, but LGTM!

@ahl ahl merged commit feea258 into main Jun 23, 2021
@ahl

ahl commented Jun 23, 2021

Copy link
Copy Markdown
Collaborator Author

I'm going to attempt to revisit the discussion of InputPath in #110

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants