Skip to content

bpo-40636: PEP 618 implementation #20921

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 27 commits into from
Jun 19, 2020
Merged
Show file tree
Hide file tree
Changes from 10 commits
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
a36a6e5
Implement zip(..., total=True)
brandtbucher Apr 29, 2020
b59fe70
total -> strict
brandtbucher Apr 30, 2020
3ed7f0a
Tests for zip(total=True)
cool-RR Apr 21, 2020
3955ca7
assert -> self.assertEqual, total -> strict
brandtbucher May 1, 2020
e4dfe05
Support all pickle protocols.
brandtbucher May 12, 2020
4d41e21
Use __reduce__/__setstate__ for pickling.
brandtbucher May 12, 2020
892961c
Clean up.
brandtbucher May 13, 2020
f13a3a5
Catch up with master
brandtbucher May 18, 2020
2d19de5
Handle StopIteration raised from Pythonland
brandtbucher May 20, 2020
f2235f2
Clean up error handling
brandtbucher May 20, 2020
f38ed4f
Fix signature in docstring
brandtbucher Jun 17, 2020
8e9b4e6
Reword error messages
brandtbucher Jun 17, 2020
f2f3e03
blurb add
brandtbucher Jun 17, 2020
3ad4c92
Improve error messages
brandtbucher Jun 17, 2020
0fa507c
Fix NEWS
brandtbucher Jun 18, 2020
dca9af0
Add strict member to zipobject
brandtbucher Jun 18, 2020
aa5c8f6
Simplify error message construction
brandtbucher Jun 18, 2020
7d24e12
Use PyTuple_Pack for pickling
brandtbucher Jun 18, 2020
05183d9
Add round-trip pickle tests
brandtbucher Jun 18, 2020
28edf45
Add tests for pickle stability
brandtbucher Jun 18, 2020
baff9c8
Add comments and improve error handling.
brandtbucher Jun 18, 2020
2ac3036
Add tests for error handling
brandtbucher Jun 18, 2020
ea2a7cf
Reword NEWS
brandtbucher Jun 18, 2020
3613e34
Refactor tests
brandtbucher Jun 18, 2020
329afd9
Clean up tests
brandtbucher Jun 18, 2020
e86daab
Remove comment
brandtbucher Jun 18, 2020
d77134e
Rename/reorder tests
brandtbucher Jun 18, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 19 additions & 0 deletions Lib/test/test_builtin.py
Original file line number Diff line number Diff line change
Expand Up @@ -1585,6 +1585,25 @@ def __iter__(self):

self.assertIs(cm.exception, exception)

def test_zip_strict(self):
self.assertEqual(tuple(zip((1, 2, 3), 'abc', strict=True)),
((1, 'a'), (2, 'b'), (3, 'c')))
self.assertRaises(ValueError, tuple,
zip((1, 2, 3, 4), 'abc', strict=True))
self.assertRaises(ValueError, tuple,
zip((1, 2), 'abc', strict=True))
self.assertRaises(ValueError, tuple,
zip((1, 2), (1, 2), 'abc', strict=True))

def test_zip_strict_iterators(self):
x = iter(range(5))
y = [0]
z = iter(range(5))
self.assertRaises(ValueError, list,
(zip(x, y, z, strict=True)))
self.assertEqual(next(x), 2)
self.assertEqual(next(z), 1)

def test_format(self):
# Test the basic machinery of the format() builtin. Don't test
# the specifics of the various formatters
Expand Down
77 changes: 69 additions & 8 deletions Python/bltinmodule.c
Original file line number Diff line number Diff line change
Expand Up @@ -2517,8 +2517,8 @@ builtin_issubclass_impl(PyObject *module, PyObject *cls,

typedef struct {
PyObject_HEAD
Py_ssize_t tuplesize;
PyObject *ittuple; /* tuple of iterators */
Py_ssize_t tuplesize; /* negative when strict=True */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not using a separated member for strict? I expected "int strict".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Encoding an extra bit in the sign is so 1975. :-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this encoding (size + strict) makes sense for getstate/setstate: it provides backward compatibility for pickle which is nice!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I was feeling a bit too clever here... and preempting the argument that I was bloating zip instances. ;)

Changes coming soon.

PyObject *ittuple; /* tuple of iterators */
PyObject *result;
} zipobject;

Expand All @@ -2530,9 +2530,21 @@ zip_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
PyObject *ittuple; /* tuple of iterators */
PyObject *result;
Py_ssize_t tuplesize;
int strict = 0;

if (type == &PyZip_Type && !_PyArg_NoKeywords("zip", kwds))
return NULL;
if (kwds) {
PyObject *empty = PyTuple_New(0);
if (empty == NULL) {
return NULL;
}
static char *kwlist[] = {"strict", NULL};
int parsed = PyArg_ParseTupleAndKeywords(
empty, kwds, "|$p:zip", kwlist, &strict);
Py_DECREF(empty);
if (!parsed) {
return NULL;
}
}

/* args must be a tuple */
assert(PyTuple_Check(args));
Expand Down Expand Up @@ -2571,7 +2583,7 @@ zip_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
return NULL;
}
lz->ittuple = ittuple;
lz->tuplesize = tuplesize;
lz->tuplesize = strict ? -tuplesize : tuplesize;
lz->result = result;

return (PyObject *)lz;
Expand All @@ -2598,7 +2610,7 @@ static PyObject *
zip_next(zipobject *lz)
{
Py_ssize_t i;
Py_ssize_t tuplesize = lz->tuplesize;
Py_ssize_t tuplesize = Py_ABS(lz->tuplesize);
PyObject *result = lz->result;
PyObject *it;
PyObject *item;
Expand All @@ -2613,6 +2625,9 @@ zip_next(zipobject *lz)
item = (*Py_TYPE(it)->tp_iternext)(it);
if (item == NULL) {
Py_DECREF(result);
if (lz->tuplesize < 0) {
goto check;
}
return NULL;
}
olditem = PyTuple_GET_ITEM(result, i);
Expand All @@ -2628,24 +2643,67 @@ zip_next(zipobject *lz)
item = (*Py_TYPE(it)->tp_iternext)(it);
if (item == NULL) {
Py_DECREF(result);
if (lz->tuplesize < 0) {
goto check;
}
return NULL;
}
PyTuple_SET_ITEM(result, i, item);
}
}
return result;
check:
if (PyErr_Occurred() && !PyErr_ExceptionMatches(PyExc_StopIteration)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code in the error label is non-trivial. You may add comments. Here I propose:

// next() on argument i raised an exception (not StopIteration)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error path is not tested: would you mind to write an unit test for it?

return NULL;
}
if (i) {
PyErr_Clear();
return PyErr_Format(PyExc_ValueError, "%s() argument %d is too short",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It bothers me that the implementation gives different errors depending on whether it's the first iterator that's longer or any of the later ones. It seems to assume that the first iterator is "right" and the others are either too short or too long. But isn't it more the case that all we know is that they aren't all of the same length -- we shouldn't judge which one is too long or too short. Perhaps the errors could be changed to something like "iterator N is shorter/longer than the preceding iterators"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, perhaps these are a bit too terse. I'll think on some other options (I also agree that unifying the messages is probably a good idea).

Py_TYPE(lz)->tp_name, i + 1);
}
for (i = 1; i < tuplesize; i++) {
it = PyTuple_GET_ITEM(lz->ittuple, i);
item = (*Py_TYPE(it)->tp_iternext)(it);
if (item) {
Py_DECREF(item);
PyErr_Clear();
return PyErr_Format(PyExc_ValueError, "%s() argument %d is too long",
Py_TYPE(lz)->tp_name, i + 1);
}
if (PyErr_Occurred()) {
return NULL;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This errror path is not tested. Would you mind to add two unit tests for it?

  • One test where an iterator raises StopIteration
  • One test where an iterator raises an exception other than StopIteration

... This is bug in this code: if StopIteration is raised, ValueError is not raised whereas a following argument can be not exhausted yet.

zip(a, b,c, strict=True): if a is exhausted, b raises StopIteration, but c is not exhausted => StopIteration is passed through, whereas a ValueError must be raised because c is not exhausted.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I always trip on the fact that Python code raises StopIteration but C code usually doesn't. Nice catch.

}
}
return NULL;
}

static PyObject *
zip_reduce(zipobject *lz, PyObject *Py_UNUSED(ignored))
{
/* Just recreate the zip with the internal iterator tuple */
if (lz->tuplesize < 0) {
return Py_BuildValue("OOO", Py_TYPE(lz), lz->ittuple, Py_True);
}
return Py_BuildValue("OO", Py_TYPE(lz), lz->ittuple);
}

PyDoc_STRVAR(setstate_doc, "Set state information for unpickling.");

static PyObject *
zip_setstate(zipobject *lz, PyObject *state)
{
int strict = PyObject_IsTrue(state);
if (strict < 0) {
return NULL;
}
lz->tuplesize = strict ? -Py_ABS(lz->tuplesize) : Py_ABS(lz->tuplesize);
Py_RETURN_NONE;
}

static PyMethodDef zip_methods[] = {
{"__reduce__", (PyCFunction)zip_reduce, METH_NOARGS, reduce_doc},
{NULL, NULL} /* sentinel */
{"__setstate__", (PyCFunction)zip_setstate, METH_O, setstate_doc},
{NULL} /* sentinel */
};

PyDoc_STRVAR(zip_doc,
Expand All @@ -2657,7 +2715,10 @@ PyDoc_STRVAR(zip_doc,
The zip object yields n-length tuples, where n is the number of iterables\n\
passed as positional arguments to zip(). The i-th element in every tuple\n\
comes from the i-th iterable argument to zip(). This continues until the\n\
shortest argument is exhausted.");
shortest argument is exhausted.\n\
\n\
If strict is true and one of the arguments is exhausted before the others,\n\
raise a ValueError.");

PyTypeObject PyZip_Type = {
PyVarObject_HEAD_INIT(&PyType_Type, 0)
Expand Down