Skip to content

Fix non-exact match of pattern with trailing slash. #3923

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Oct 6, 2016
Merged

Fix non-exact match of pattern with trailing slash. #3923

merged 5 commits into from
Oct 6, 2016

Conversation

aaugustin
Copy link
Contributor

Currently, in the v4 branch, <Match pattern="/foo/" component={Foo} /> fails to match at /foo/bar/, because the URL is truncated to /foo/bar instead of /foo/ prior to being matched.

@aaugustin
Copy link
Contributor Author

For better or worse, some communities (notably Django) have chosen to include trailing slashes in URLs.

@aaugustin
Copy link
Contributor Author

I didn't include any documentation changes because the v4 branch doesn't appear to contain a changelog yet (until it's deemed stable, I guess).

Copy link
Member

@timdorr timdorr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@ryanflorence
Copy link
Member

in a browser foo and foo/ are different, I want to be like the browser, not django, I need to look deeper at this commit, on a phone right now.

@ryanflorence
Copy link
Member

With this code will a location of "foo" match pattern of "foo/"?

@timdorr
Copy link
Member

timdorr commented Sep 21, 2016

In a browser foo and foo/ are different.

Just ran into this yesterday. More specifically, in any RFC 3986 compliant URL parser will distinguish with the trailing slash.

But in terms of what this API is attempting to do, which is match a URL pattern, I can confirm if you go into something like the Basic example and change the /about Match pattern to /about/ and the Link to /about/foo/ (or /about/foo) , it no longer matches for either URL. Here's something more complete:

<Match pattern="/about/" component={About} />

// These match
<Link to="/about">About</Link> // Wrong!
<Link to="/about/">About</Link>
// These don't match :(
<Link to="/about/foo">About</Link>
<Link to="/about/foo/">About</Link>

This looks to be fixing a specific pattern condition.

The spec violation is another issue entirely, but should be fixed. Could that be coming from path-to-regexp?

@ryanflorence
Copy link
Member

ryanflorence commented Sep 21, 2016

<Match pattern="/foo/bar/"/>
// SHOULD match
"/foo/bar/"
// SHOULD NOT match
"/foo/bar"

<Match pattern="/foo/bar"/>
// SHOULD match
"/foo/bar/"
"/foo/bar"

@ryanflorence
Copy link
Member

@timdorr I fixed my message and deleted our other ones to keep this thread less crazy than it was heading :)

@timdorr
Copy link
Member

timdorr commented Sep 21, 2016

No worries. Again, that's sort of a separate issue from this PR. This is fixing matching for any non-exact pattern that has a trailing slash (E.g., /foo/ pattern doesn't match /foo/bar URL).

I say we merge this in as a start and then look at the issue of exact-ish non-trailing URLs matching a trailing slash pattern. (E.g., /foo/ pattern matches /foo URL)

@aaugustin
Copy link
Contributor Author

aaugustin commented Sep 21, 2016

Indeed browsers don' t give slashes any special treatment. Web servers serving static files off a filesystem might have filesystem-like semantics (e.g. collapse duplicate slashes, handle ../ for navigating to a parent directory, etc.) but that's unrelated to HTTP.

path-to-regex has an option related to trailing slashes:

strict When false the trailing slash is optional. (default: false)

I'm not sure if or how it comes into play here, but it's something to keep in mind.

I agree with the behavior @ryanflorence specified two comments above. Basically react-router should do a prefix match when exactly isn't specified.

I'm happy to investigate further and add more tests, either in this PR or in a follow-up PR. Whatever works best for you :-)

@aaugustin
Copy link
Contributor Author

@timdorr There's a good chance that setting the strict option to true in path-to-regex will prevent /foo/ pattern from matching /foo URL. Probably best to do that in another PR since it looks related to a different piece of code.

@timdorr
Copy link
Member

timdorr commented Sep 21, 2016

@aaugustin Yep, that's definitely what we need. Good find!

@aaugustin
Copy link
Contributor Author

Should I add this fix to this PR or start a new PR?

@timdorr
Copy link
Member

timdorr commented Sep 21, 2016

If it's that simple, I say add it here. Might as well tackle all the trailing slash stuff in one go.

@@ -1,21 +1,30 @@
import pathToRegexp from 'path-to-regexp'

const cache = {}
const pathToRegexpOptions = {strict: true}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just inline this. No need to extract it out. We're not using it anywhere else and there are only a few potential options anyways.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, a small nitpick: Can you put spaces inside of the brackets for an object? It's not officially enforced, but it's kind of the standard around the code and looks good inline with everything else.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do. Pulling the constant out was a nano-optimization. Sorry about not checking the coding style.

Copy link
Member

@timdorr timdorr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes noted

@aaugustin
Copy link
Contributor Author

I just added another commit to prevent URL /foo from incorrectly matching pattern /foo/.

FYI while we're looking at path-to-regex options:

  • If you set {end: false}, it makes a prefix match instead of an exact match. That would allow react-router to get rid of the whole truncatePathnameToPattern method. Now that we have a good test coverage, I'd like to try using that option. That should be slightly more efficient as it's implemented purely as a regexp.
> pathToRegexp('/foo', keys, {strict: true})
{ /^\/foo$/i keys: [] }
> pathToRegexp('/foo', keys, {end: false, strict: true})
{ /^\/foo(?=\/|$)/i keys: [] }
  • Unless you set {sensitive: true}, it creates case-insensitive regexps. I don't have a strong opinion about that. Most projects use lower-case URLs. Then case sensitivity doesn't matter. I suppose case-insensitive matching could prevent some projects from implementing fully backwards compatible routing with pre-existing URL schemes that require telling apart upper case and lower case paths, but this sounds rather esoteric.

@aaugustin
Copy link
Contributor Author

Let's keep this PR on hold for today, I'll try {end: false} tomorrow and see if that allows simplifying the code.

@aaugustin
Copy link
Contributor Author

I just added a commit that removes the truncatePathnameToPattern function entirely and relies on features of pathToRegexp instead. I think that's much better than complicating truncatePathnameToPattern like my previous commit did. In fact, the PR simplifies matchPattern.js (-15, +11).

As far as I can tell, I have addressed all comments in the thread. This should be ready to merge. Thanks!

@aaugustin
Copy link
Contributor Author

While this PR still carries @timdorr's "Changes requested" mark, I made the requested changes, and then some.

It's ready for a new round of review and hopefully a merge. Please let me know if there's any way I can help.

@timdorr timdorr added this to the v4.0.0 milestone Sep 28, 2016
Copy link
Member

@timdorr timdorr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It reads a little oddly, but I'm fine with it as-is.

@@ -1,22 +1,20 @@
import pathToRegexp from 'path-to-regexp'

const cache = {}
// cache[exactly][pattern] contains getMatcher(pattern, exactly)
const cache = {true: {}, false: {}}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this is the only weird part that bothers me. But implicit type conversion always does that for me...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! As a Pythonista, I failed to notice that these keys are strings / symbols, not booleans. I agree that it feels sloppy...

That said, adding explicit type conversions doesn't make the code more readable. Perhaps it's best to live with the implicit conversion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can add a small ternary to use proper string keys. To help with that, you can store that as a separate variable, cache_type for instance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I added a commit with that change.

@ryanflorence ryanflorence merged commit e5be03a into remix-run:v4 Oct 6, 2016
@lock lock bot locked as resolved and limited conversation to collaborators Jan 19, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants