bpo-35975: Support parsing earlier minor versions of Python 3 #12086

Merged (14 commits) on Mar 7, 2019
10 changes: 8 additions & 2 deletions Doc/library/ast.rst
@@ -126,7 +126,7 @@ The abstract grammar is currently defined as follows:
 Apart from the node classes, the :mod:`ast` module defines these utility functions
 and classes for traversing abstract syntax trees:

-.. function:: parse(source, filename='<unknown>', mode='exec', *, type_comments=False)
+.. function:: parse(source, filename='<unknown>', mode='exec', *, type_comments=False, feature_version=-1)

    Parse the source into an AST node. Equivalent to ``compile(source,
    filename, mode, ast.PyCF_ONLY_AST)``.
@@ -145,13 +145,19 @@ and classes for traversing abstract syntax trees:
    modified to correspond to :pep:`484` "signature type comments",
    e.g. ``(str, int) -> List[str]``.

+   Also, setting ``feature_version`` to the minor version of an
+   earlier Python 3 version will attempt to parse using that version's
+   grammar. For example, setting ``feature_version=4`` will allow
+   the use of ``async`` and ``await`` as variable names. The lowest
+   supported value is 4; the highest is ``sys.version_info[1]``.
+
    .. warning::
       It is possible to crash the Python interpreter with a
       sufficiently large/complex string due to stack depth limitations
       in Python's AST compiler.

    .. versionchanged:: 3.8
-      Added ``type_comments=True`` and ``mode='func_type'``.
+      Added ``type_comments``, ``mode='func_type'`` and ``feature_version``.


 .. function:: literal_eval(node_or_string)

Review comment (Member), on the ``feature_version=4`` example: Maybe add that 4 is the lowest supported value?
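Note: as a rough illustration of the documented behaviour, the sketch below parses the PR's `asyncvar` test snippet twice, once with the lowest documented value and once with the running interpreter's minor version. The helper name `try_parse` is hypothetical and not part of this PR. It assumes Python 3.8 or later and passes `feature_version` as an integer minor version, as documented here. How faithfully an old grammar is emulated depends on the interpreter running the code, so no output is shown for the older version.

```python
import ast
import sys


def try_parse(source, minor):
    """Parse `source` with the given minor version's grammar.

    Hypothetical helper: returns the tree on success, or the
    SyntaxError instance if the chosen grammar rejects the source.
    """
    try:
        return ast.parse(source, feature_version=minor)
    except SyntaxError as err:
        return err


# The `asyncvar` snippet from the tests in this PR.
source = "async = 12\nawait = 13\n"

current = sys.version_info[1]
result_old = try_parse(source, 4)        # lowest value named in the docs above
result_new = try_parse(source, current)  # grammar of the running interpreter

print("feature_version=4:", type(result_old).__name__)
print(f"feature_version={current}:", type(result_new).__name__)
```

With the current grammar the snippet is rejected, because `async` and `await` are reserved words there.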
4 changes: 4 additions & 0 deletions Doc/library/token-list.inc

Some generated files are not rendered by default.

3 changes: 3 additions & 0 deletions Doc/library/token.rst
@@ -88,3 +88,6 @@ the :mod:`tokenize` module.

 .. versionchanged:: 3.8
    Added :data:`TYPE_COMMENT`.
+   Added :data:`AWAIT` and :data:`ASYNC` tokens back (they're needed
+   to support parsing older Python versions for :func:`ast.parse` with
+   ``feature_version`` set to 6 or lower).
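Note: whether `token.ASYNC` and `token.AWAIT` exist depends on the Python version in use (this PR adds them back, and other releases may not define them). A defensive lookup might look like the sketch below, where `async_token_ids` is a hypothetical helper name rather than anything in the `token` module.

```python
import token


def async_token_ids():
    """Map 'ASYNC' and 'AWAIT' to their numeric token ids.

    Hypothetical helper: a value is None when the running
    interpreter's token module does not define that token.
    """
    return {name: getattr(token, name, None) for name in ("ASYNC", "AWAIT")}


print(async_token_ids())
```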
8 changes: 4 additions & 4 deletions Grammar/Grammar
@@ -18,7 +18,7 @@ decorator: '@' dotted_name [ '(' [arglist] ')' ] NEWLINE
 decorators: decorator+
 decorated: decorators (classdef | funcdef | async_funcdef)

-async_funcdef: 'async' funcdef
+async_funcdef: ASYNC funcdef
 funcdef: 'def' NAME parameters ['->' test] ':' [TYPE_COMMENT] func_body_suite

 parameters: '(' [typedargslist] ')'
@@ -70,7 +70,7 @@ nonlocal_stmt: 'nonlocal' NAME (',' NAME)*
 assert_stmt: 'assert' test [',' test]

 compound_stmt: if_stmt | while_stmt | for_stmt | try_stmt | with_stmt | funcdef | classdef | decorated | async_stmt
-async_stmt: 'async' (funcdef | with_stmt | for_stmt)
+async_stmt: ASYNC (funcdef | with_stmt | for_stmt)
 if_stmt: 'if' namedexpr_test ':' suite ('elif' namedexpr_test ':' suite)* ['else' ':' suite]
 while_stmt: 'while' namedexpr_test ':' suite ['else' ':' suite]
 for_stmt: 'for' exprlist 'in' testlist ':' [TYPE_COMMENT] suite ['else' ':' suite]
@@ -106,7 +106,7 @@ arith_expr: term (('+'|'-') term)*
 term: factor (('*'|'@'|'/'|'%'|'//') factor)*
 factor: ('+'|'-'|'~') factor | power
 power: atom_expr ['**' factor]
-atom_expr: ['await'] atom trailer*
+atom_expr: [AWAIT] atom trailer*
 atom: ('(' [yield_expr|testlist_comp] ')' |
        '[' [testlist_comp] ']' |
        '{' [dictorsetmaker] '}' |
@@ -144,7 +144,7 @@ argument: ( test [comp_for] |

 comp_iter: comp_for | comp_if
 sync_comp_for: 'for' exprlist 'in' or_test [comp_iter]
-comp_for: ['async'] sync_comp_for
+comp_for: [ASYNC] sync_comp_for
 comp_if: 'if' test_nocond [comp_iter]

 # not used in grammar, but may appear in "node" passed from Parser to Compiler
2 changes: 2 additions & 0 deletions Grammar/Tokens
@@ -55,6 +55,8 @@ ELLIPSIS '...'
 COLONEQUAL ':='

 OP
+AWAIT
+ASYNC
 TYPE_IGNORE
 TYPE_COMMENT
 ERRORTOKEN
1 change: 1 addition & 0 deletions Include/Python-ast.h

Some generated files are not rendered by default.

1 change: 1 addition & 0 deletions Include/compile.h
@@ -27,6 +27,7 @@ PyAPI_FUNC(PyCodeObject *) PyNode_Compile(struct _node *, const char *);
 #ifndef Py_LIMITED_API
 typedef struct {
     int cf_flags; /* bitmask of CO_xxx flags relevant to future */
+    int cf_feature_version; /* minor Python version (PyCF_ONLY_AST) */
 } PyCompilerFlags;
 #endif

1 change: 1 addition & 0 deletions Include/parsetok.h
@@ -35,6 +35,7 @@ typedef struct {
 #define PyPARSE_IGNORE_COOKIE 0x0010
 #define PyPARSE_BARRY_AS_BDFL 0x0020
 #define PyPARSE_TYPE_COMMENTS 0x0040
+#define PyPARSE_ASYNC_HACKS 0x0080

 PyAPI_FUNC(node *) PyParser_ParseString(const char *, grammar *, int,
                                         perrdetail *);
10 changes: 6 additions & 4 deletions Include/token.h

Some generated files are not rendered by default.

6 changes: 4 additions & 2 deletions Lib/ast.py
@@ -27,7 +27,8 @@
 from _ast import *


-def parse(source, filename='<unknown>', mode='exec', *, type_comments=False):
+def parse(source, filename='<unknown>', mode='exec', *,
+          type_comments=False, feature_version=-1):
     """
     Parse the source into an AST node.
     Equivalent to compile(source, filename, mode, PyCF_ONLY_AST).
@@ -36,7 +37,8 @@ def parse(source, filename='<unknown>', mode='exec', *, type_comments=False):
     flags = PyCF_ONLY_AST
     if type_comments:
         flags |= PyCF_TYPE_COMMENTS
-    return compile(source, filename, mode, flags)
+    return compile(source, filename, mode, flags,
+                   feature_version=feature_version)


 def literal_eval(node_or_string):
6 changes: 4 additions & 2 deletions Lib/keyword.py
@@ -20,8 +20,6 @@
         'and',
         'as',
         'assert',
-        'async',
-        'await',
         'break',
         'class',
         'continue',
@@ -52,6 +50,10 @@
 #--end keywords--
         ]

+kwlist.append('async')
+kwlist.append('await')
+kwlist.sort()
+
 iskeyword = frozenset(kwlist).__contains__

 def main():
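Note: the change above moves `'async'` and `'await'` out of the generated keyword section and appends them by hand, presumably because the grammar now refers to them as `ASYNC`/`AWAIT` tokens rather than quoted keywords. The pattern can be sketched in isolation as follows; the abbreviated `generated` list is a stand-in invented for this example, not the real contents of `Lib/keyword.py`.

```python
import keyword

# Hypothetical, abbreviated stand-in for the generated keyword section.
generated = ["and", "as", "assert", "break", "class"]

kwlist = list(generated)
kwlist.append("async")
kwlist.append("await")
kwlist.sort()  # keep the public list in alphabetical order

# Same membership test as in the diff: a bound method of a frozenset.
iskeyword = frozenset(kwlist).__contains__

print(kwlist)
print(iskeyword("async"), keyword.iskeyword("await"))
```

Sorting after the appends keeps `kwlist` in the same alphabetical order callers saw before the change.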
2 changes: 1 addition & 1 deletion Lib/test/test_parser.py
@@ -916,7 +916,7 @@ def XXXROUNDUP(n):
                 return (n + 3) & ~3
             return 1 << (n - 1).bit_length()

-        basesize = support.calcobjsize('Pii')
+        basesize = support.calcobjsize('Piii')
         nodesize = struct.calcsize('hP3iP0h2i')
         def sizeofchildren(node):
             if node is None:
143 changes: 102 additions & 41 deletions Lib/test/test_type_comments.py
@@ -1,4 +1,5 @@
 import ast
+import sys
 import unittest


@@ -20,6 +21,29 @@ async def bar(): # type: () -> int
     return await bar()
 """

+asyncvar = """\
+async = 12
+await = 13
+"""
+
+asynccomp = """\
+async def foo(xs):
+    [x async for x in xs]
+"""
+
+matmul = """\
+a = b @ c
+"""
+
+fstring = """\
+a = 42
+f"{a}"
+"""
+
+underscorednumber = """\
+a = 42_42_42
+"""
+
 redundantdef = """\
 def foo(): # type: () -> int
     # type: () -> str

Review comment (Member), on the ``fstring`` snippet: Maybe add underscores in numeric literals for completeness?
@@ -155,80 +179,117 @@ def favk(

 class TypeCommentTests(unittest.TestCase):

-    def parse(self, source):
-        return ast.parse(source, type_comments=True)
+    lowest = 4 # Lowest minor version supported
+    highest = sys.version_info[1] # Highest minor version
+
+    def parse(self, source, feature_version=highest):
+        return ast.parse(source, type_comments=True,
+                         feature_version=feature_version)
+
+    def parse_all(self, source, minver=lowest, maxver=highest, expected_regex=""):
+        for feature_version in range(self.lowest, self.highest + 1):
+            if minver <= feature_version <= maxver:
+                try:
+                    yield self.parse(source, feature_version)
+                except SyntaxError as err:
+                    raise SyntaxError(str(err) + f" feature_version={feature_version}")
+            else:
+                with self.assertRaisesRegex(SyntaxError, expected_regex,
+                                            msg=f"feature_version={feature_version}"):
+                    self.parse(source, feature_version)

     def classic_parse(self, source):
         return ast.parse(source)

     def test_funcdef(self):
-        tree = self.parse(funcdef)
-        self.assertEqual(tree.body[0].type_comment, "() -> int")
-        self.assertEqual(tree.body[1].type_comment, "() -> None")
+        for tree in self.parse_all(funcdef):
+            self.assertEqual(tree.body[0].type_comment, "() -> int")
+            self.assertEqual(tree.body[1].type_comment, "() -> None")
         tree = self.classic_parse(funcdef)
         self.assertEqual(tree.body[0].type_comment, None)
         self.assertEqual(tree.body[1].type_comment, None)

     def test_asyncdef(self):
-        tree = self.parse(asyncdef)
-        self.assertEqual(tree.body[0].type_comment, "() -> int")
-        self.assertEqual(tree.body[1].type_comment, "() -> int")
+        for tree in self.parse_all(asyncdef, minver=5):
+            self.assertEqual(tree.body[0].type_comment, "() -> int")
+            self.assertEqual(tree.body[1].type_comment, "() -> int")
         tree = self.classic_parse(asyncdef)
         self.assertEqual(tree.body[0].type_comment, None)
         self.assertEqual(tree.body[1].type_comment, None)

+    def test_asyncvar(self):
+        for tree in self.parse_all(asyncvar, maxver=6):
+            pass
+
+    def test_asynccomp(self):
+        for tree in self.parse_all(asynccomp, minver=6):
+            pass
+
+    def test_matmul(self):
+        for tree in self.parse_all(matmul, minver=5):
+            pass
+
+    def test_fstring(self):
+        for tree in self.parse_all(fstring, minver=6):
+            pass
+
+    def test_underscorednumber(self):
+        for tree in self.parse_all(underscorednumber, minver=6):
+            pass
+
     def test_redundantdef(self):
-        with self.assertRaisesRegex(SyntaxError, "^Cannot have two type comments on def"):
-            tree = self.parse(redundantdef)
+        for tree in self.parse_all(redundantdef, maxver=0,
+                                   expected_regex="^Cannot have two type comments on def"):
+            pass

     def test_nonasciidef(self):
-        tree = self.parse(nonasciidef)
-        self.assertEqual(tree.body[0].type_comment, "() -> àçčéñt")
+        for tree in self.parse_all(nonasciidef):
+            self.assertEqual(tree.body[0].type_comment, "() -> àçčéñt")

     def test_forstmt(self):
-        tree = self.parse(forstmt)
-        self.assertEqual(tree.body[0].type_comment, "int")
+        for tree in self.parse_all(forstmt):
+            self.assertEqual(tree.body[0].type_comment, "int")
         tree = self.classic_parse(forstmt)
         self.assertEqual(tree.body[0].type_comment, None)

     def test_withstmt(self):
-        tree = self.parse(withstmt)
-        self.assertEqual(tree.body[0].type_comment, "int")
+        for tree in self.parse_all(withstmt):
+            self.assertEqual(tree.body[0].type_comment, "int")
         tree = self.classic_parse(withstmt)
         self.assertEqual(tree.body[0].type_comment, None)

     def test_vardecl(self):
-        tree = self.parse(vardecl)
-        self.assertEqual(tree.body[0].type_comment, "int")
+        for tree in self.parse_all(vardecl):
+            self.assertEqual(tree.body[0].type_comment, "int")
         tree = self.classic_parse(vardecl)
         self.assertEqual(tree.body[0].type_comment, None)

     def test_ignores(self):
-        tree = self.parse(ignores)
-        self.assertEqual([ti.lineno for ti in tree.type_ignores], [2, 5])
+        for tree in self.parse_all(ignores):
+            self.assertEqual([ti.lineno for ti in tree.type_ignores], [2, 5])
         tree = self.classic_parse(ignores)
         self.assertEqual(tree.type_ignores, [])

     def test_longargs(self):
-        tree = self.parse(longargs)
-        for t in tree.body:
-            # The expected args are encoded in the function name
-            todo = set(t.name[1:])
-            self.assertEqual(len(t.args.args),
-                             len(todo) - bool(t.args.vararg) - bool(t.args.kwarg))
-            self.assertTrue(t.name.startswith('f'), t.name)
-            for c in t.name[1:]:
-                todo.remove(c)
-                if c == 'v':
-                    arg = t.args.vararg
-                elif c == 'k':
-                    arg = t.args.kwarg
-                else:
-                    assert 0 <= ord(c) - ord('a') < len(t.args.args)
-                    arg = t.args.args[ord(c) - ord('a')]
-                self.assertEqual(arg.arg, c) # That's the argument name
-                self.assertEqual(arg.type_comment, arg.arg.upper())
-            assert not todo
+        for tree in self.parse_all(longargs):
+            for t in tree.body:
+                # The expected args are encoded in the function name
+                todo = set(t.name[1:])
+                self.assertEqual(len(t.args.args),
+                                 len(todo) - bool(t.args.vararg) - bool(t.args.kwarg))
+                self.assertTrue(t.name.startswith('f'), t.name)
+                for c in t.name[1:]:
+                    todo.remove(c)
+                    if c == 'v':
+                        arg = t.args.vararg
+                    elif c == 'k':
+                        arg = t.args.kwarg
+                    else:
+                        assert 0 <= ord(c) - ord('a') < len(t.args.args)
+                        arg = t.args.args[ord(c) - ord('a')]
+                    self.assertEqual(arg.arg, c) # That's the argument name
+                    self.assertEqual(arg.type_comment, arg.arg.upper())
+                assert not todo
         tree = self.classic_parse(longargs)
         for t in tree.body:
             for arg in t.args.args + [t.args.vararg, t.args.kwarg]:
@@ -247,8 +308,8 @@ def test_inappropriate_type_comments(self):

         def check_both_ways(source):
             ast.parse(source, type_comments=False)
-            with self.assertRaises(SyntaxError):
-                ast.parse(source, type_comments=True)
+            for tree in self.parse_all(source, maxver=0):
+                pass

         check_both_ways("pass # type: int\n")
         check_both_ways("foo() # type: int\n")
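Note: `parse_all` in the test class is a generator, so none of its checks run until the caller iterates over it. That is why several tests consist only of `for tree in self.parse_all(...): pass`. Passing `maxver=0` puts every version outside the accepted range, so each one is expected to raise. The standalone sketch below mirrors that control flow without `unittest`. It is a simplification written for this note: a plain `AssertionError` stands in for `assertRaisesRegex`, and the names `LOWEST` and `HIGHEST` are assumptions that echo the class attributes.

```python
import ast
import sys

LOWEST = 4                     # lowest minor version, as in the test class
HIGHEST = sys.version_info[1]  # minor version of the running interpreter


def parse_all(source, minver=LOWEST, maxver=HIGHEST):
    """Yield (feature_version, tree) for each version expected to accept `source`.

    Versions outside [minver, maxver] are expected to raise SyntaxError;
    if one of them parses successfully, an AssertionError is raised.
    """
    for feature_version in range(LOWEST, HIGHEST + 1):
        if minver <= feature_version <= maxver:
            yield feature_version, ast.parse(source, feature_version=feature_version)
        else:
            try:
                ast.parse(source, feature_version=feature_version)
            except SyntaxError:
                continue  # rejected, as expected
            raise AssertionError(
                f"feature_version={feature_version} accepted {source!r}")


# Nothing is parsed until the generator is consumed.
for minor, tree in parse_all("a = 1\n"):
    print(minor, type(tree).__name__)
```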