Skip to content

[2.7] bpo-33038: gzip.GzipFile assumes non-None name attribute #6095

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
May 9, 2018
Merged
6 changes: 5 additions & 1 deletion Lib/gzip.py
Original file line number Diff line number Diff line change
Expand Up @@ -95,7 +95,11 @@ def __init__(self, filename=None, mode=None,
if filename is None:
# Issue #13781: os.fdopen() creates a fileobj with a bogus name
# attribute. Avoid saving this in the gzip header's filename field.
if hasattr(fileobj, 'name') and fileobj.name != '<fdopen>':
if (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to backport the code from 3.x (with adding a special case for '<fdopen>').

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This formatting looks uncommon. I suggest to remove suspicious newlines. Could it fit in one line?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even with basestring, it's one char too long to fit on a line. This format is in PEP8 (and flake8 and company are fine with it). I prefer it to this:

if (not isinstance(filename, basestring) or
    filename == '<fdopen>'):
    filename = ''

because it is hard to see from the next line. If you'd rather some other style that's fine, but there doesn't seem to be a precedent in this file?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can write it in the line of length 79. This is in the PEP 8 limit. Even if it be out of the limit, PEP 8 suggests the following options:

  • No extra indentation.
            if (not isinstance(filename, basestring) or
                filename == '<fdopen>'):
                filename = ''
  • Add a comment, which will provide some distinction in editors supporting syntax highlighting.
            if (not isinstance(filename, basestring) or
                filename == '<fdopen>'):
                # comment
                filename = ''
  • Add some extra indentation on the conditional continuation line.
            if (not isinstance(filename, basestring)
                    or filename == '<fdopen>'):
                filename = ''

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'm familiar with the PEP. I got rid of the parens and put everything on one line.

hasattr(fileobj, 'name') and
fileobj.name is not None and
fileobj.name != '<fdopen>'
):
filename = fileobj.name
else:
filename = ''
Expand Down
8 changes: 8 additions & 0 deletions Lib/test/test_gzip.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@
import os
import io
import struct
import tempfile
gzip = test_support.import_module('gzip')

data1 = """ int length=DEFAULTALLOC, err = Z_OK;
Expand Down Expand Up @@ -359,6 +360,13 @@ def test_read_with_extra(self):
with gzip.GzipFile(fileobj=io.BytesIO(gzdata)) as f:
self.assertEqual(f.read(), b'Test')

def test_fileobj_without_name(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to backport test_fileobj_from_fdopen from 3.x.

And add a similar test with fileobj=io.open(fd, 'wb') instead of fileobj=os.fdopen(fd, 'wb').

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to backport test_fileobj_from_fdopen from 3.x.

I'm not sure I follow. The only difference between 2.7's test_fileobj_from_fdopen and 3.7's is the assert:
2.7
3.7

Could you clarify?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All right then.

# Issue #33038: GzipFile should not assume that file objects that have
# a .name attribute use a non-None value.
with tempfile.SpooledTemporaryFile() as f:
with gzip.GzipFile(fileobj=f, mode='wb') as archive:
archive.write(b'data')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add an assertion for name here?


def test_main(verbose=None):
test_support.run_unittest(TestGzip)

Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
Using a tempfile.SpooledTemporaryFile object with gzip.GzipFile no longer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mention also the case of io.open() with a file descriptor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I don't think this is a new addition? That wasn't what the bug was aimed at - only for special file objects that (a) have a name attribute, but (b) don't set it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR fixes a case when fileobj has a non-string name attribute. tempfile.SpooledTemporaryFile is one example. io.open() with a file descriptor is other example. Maybe there are other examples in the stdlib. And it can fix also cases with third-party file-like objects.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I understand - in that case the name attribute is an int.

produces an AttributeError exception.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add "Patch by yourname."