Skip to content

bpo-12910: update and correct quote docstring #2568

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 10, 2019
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
33 changes: 20 additions & 13 deletions Lib/urllib/parse.py
Original file line number Diff line number Diff line change
Expand Up @@ -746,25 +746,32 @@ def quote(string, safe='/', encoding=None, errors=None):
"""quote('abc def') -> 'abc%20def'

Each part of a URL, e.g. the path info, the query, etc., has a
different set of reserved characters that must be quoted.
different set of reserved characters that must be quoted. The
quote function offers a cautious (not minimal) way to quote a
string for most of these parts.

RFC 3986 Uniform Resource Identifiers (URI): Generic Syntax lists
the following reserved characters.
RFC 3986 Uniform Resource Identifier (URI): Generic Syntax lists
the following (un)reserved characters.

reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" |
"$" | "," | "~"
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
reserved = gen-delims / sub-delims
gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please change the "/" character to "|" first. Even in the comments, if we preserved the previous doc, it won't intuitively break our reading. At other places, "|" is used for or condition.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it was used as per the RFC example:
https://tools.ietf.org/html/rfc3986#appendix-A

This is okay!

sub-delims = "!" / "$" / "&" / "'" / "(" / ")"
/ "*" / "+" / "," / ";" / "="

Each of these characters is reserved in some component of a URL,
Each of the reserved characters is reserved in some component of a URL,
but not necessarily in all of them.

Python 3.7 updates from using RFC 2396 to RFC 3986 to quote URL strings.
Now, "~" is included in the set of reserved characters.
The quote function %-escapes all characters that are neither in the
unreserved chars ("always safe") nor the additional chars set via the
safe arg.

The default for the safe arg is '/'. The character is reserved, but in
typical usage the quote function is being called on a path where the
existing slash characters are to be preserved.

By default, the quote function is intended for quoting the path
section of a URL. Thus, it will not encode '/'. This character
is reserved, but in typical usage the quote function is being
called on a path where the existing slash characters are used as
reserved characters.
Python 3.7 updates from using RFC 2396 to RFC 3986 to quote URL strings.
Now, "~" is included in the set of unreserved characters.

string and safe may be either str or bytes objects. encoding and errors
must not be specified if string is a bytes object.
Expand Down