bpo-16285: Update urllib quoting to RFC 3986 #173
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -704,7 +704,7 @@ def unquote_plus(string, encoding='utf-8', errors='replace'): | |
_ALWAYS_SAFE = frozenset(b'ABCDEFGHIJKLMNOPQRSTUVWXYZ' | ||
b'abcdefghijklmnopqrstuvwxyz' | ||
b'0123456789' | ||
b'_.-') | ||
b'_.-~') | ||
_ALWAYS_SAFE_BYTES = bytes(_ALWAYS_SAFE) | ||
_safe_quoters = {} | ||
|
||
|
@@ -736,15 +736,18 @@ def quote(string, safe='/', encoding=None, errors=None): | |
Each part of a URL, e.g. the path info, the query, etc., has a | ||
different set of reserved characters that must be quoted. | ||
|
||
RFC 2396 Uniform Resource Identifiers (URI): Generic Syntax lists | ||
RFC 3986 Uniform Resource Identifiers (URI): Generic Syntax lists | ||
the following reserved characters. | ||
|
||
reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" | | ||
"$" | "," | ||
"$" | "," | "~" | ||
This is wrong: "~" is in the set of _UN_reserved chars in RFC 3986; please see https://bugs.python.org/issue12910 and its PR #2568. |
||
|
||
Each of these characters is reserved in some component of a URL, | ||
but not necessarily in all of them. | ||
|
||
Python 3.7 updates from using RFC 2396 to RFC 3986 to quote URL strings. | ||
Now, "~" is included in the set of reserved characters. | ||
|
||
Sorry I missed this one earlier: there's no need to have the version change info in the docstring, so the change in the RFC reference and the addition of
It's also wrong as stated above. |
||
By default, the quote function is intended for quoting the path | ||
section of a URL. Thus, it will not encode '/'. This character | ||
is reserved, but in typical usage the quote function is being | ||
|
And this is the test case update I missed in my earlier review :)
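To make the reserved/unreserved distinction raised in the review concrete, here is a minimal sketch. The set names (`UNRESERVED`, `GEN_DELIMS`, `SUB_DELIMS`, `RESERVED`) are hypothetical and written out from RFC 3986 sections 2.2 and 2.3; they are not identifiers from `urllib.parse`. The `quote()` calls assume Python 3.7 or later, where "~" is treated as always safe.

```python
import string
from urllib.parse import quote

# Hypothetical names for illustration; not part of urllib.parse.
# RFC 3986, section 2.3: unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = frozenset(string.ascii_letters + string.digits + "-._~")

# RFC 3986, section 2.2: reserved = gen-delims / sub-delims
GEN_DELIMS = frozenset(":/?#[]@")
SUB_DELIMS = frozenset("!$&'()*+,;=")
RESERVED = GEN_DELIMS | SUB_DELIMS

# "~" is unreserved, so it never needs percent-encoding.
print("~" in UNRESERVED)  # True
print("~" in RESERVED)    # False

# Assuming Python 3.7+: quote() leaves "~" alone, keeps "/" by default,
# and percent-encodes the space.
print(quote("~user/my file.txt"))  # ~user/my%20file.txt

# Passing safe="" makes quote() encode "/" as well.
print(quote("a/b", safe=""))  # a%2Fb
```

On interpreters older than 3.7, `quote("~")` would return `%7E` instead, which is the behaviour this change set out to update.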