bpo-30103: Allow Uuencode in Python using backtick as zero instead of space #1326

Merged
merged 10 commits into from
May 3, 2017
9 changes: 6 additions & 3 deletions Doc/library/binascii.rst
Expand Up @@ -40,11 +40,14 @@ The :mod:`binascii` module defines the following functions:
data may be followed by whitespace.


.. function:: b2a_uu(data)
.. function:: b2a_uu(data, *, backtick=False)

Convert binary data to a line of ASCII characters, the return value is the
converted line, including a newline char. The length of *data* should be at most
45.
45. If *backtick* is true, zeros are represented by ``'`'`` instead of spaces.

.. versionchanged:: 3.7
Added the *backtick* parameter.
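
A brief illustration of the documented behaviour, assuming Python 3.7 or later. The input `b"\x00Cat"` is the sample used in this PR's own tests; the variable names are only for illustration.

```python
import binascii

data = b"\x00Cat"

# Default behaviour: a zero 6-bit group is written as a space.
with_spaces = binascii.b2a_uu(data)

# With backtick=True, zero groups are written as '`' instead.
with_backticks = binascii.b2a_uu(data, backtick=True)

print(with_spaces)     # b'$ $-A=   \n'
print(with_backticks)  # b'$`$-A=```\n'
```

The backtick form avoids trailing spaces, which some tools strip from the ends of lines.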


.. function:: a2b_base64(string)
Expand All @@ -53,7 +56,7 @@ The :mod:`binascii` module defines the following functions:
than one line may be passed at a time.


.. function:: b2a_base64(data, \*, newline=True)
.. function:: b2a_base64(data, *, newline=True)

Convert binary data to a line of ASCII characters in base64 coding. The return
value is the converted line, including a newline char if *newline* is
Expand Down
8 changes: 6 additions & 2 deletions Doc/library/uu.rst
Expand Up @@ -28,12 +28,16 @@ This code was contributed by Lance Ellinghouse, and modified by Jack Jansen.
The :mod:`uu` module defines the following functions:


.. function:: encode(in_file, out_file, name=None, mode=None)
.. function:: encode(in_file, out_file, name=None, mode=None, *, backtick=False)

Uuencode file *in_file* into file *out_file*. The uuencoded file will have
the header specifying *name* and *mode* as the defaults for the results of
decoding the file. The default defaults are taken from *in_file*, or ``'-'``
and ``0o666`` respectively.
and ``0o666`` respectively. If *backtick* is true, zeros are represented by
``'`'`` instead of spaces.

.. versionchanged:: 3.7
Added the *backtick* parameter.


.. function:: decode(in_file, out_file=None, mode=None, quiet=False)
Expand Down
4 changes: 4 additions & 0 deletions Doc/tools/susp-ignored.csv
Expand Up @@ -328,3 +328,7 @@ whatsnew/3.5,,:exception,ERROR:root:exception
whatsnew/changelog,,:version,import sys; I = version[:version.index(' ')]
whatsnew/changelog,,`,"for readability (was ""`"")."
whatsnew/changelog,,:end,str[start:end]
library/binascii,,`,'`'
library/uu,,`,'`'
whatsnew/3.7,,`,'`'
whatsnew/changelog,,`,'`'
14 changes: 14 additions & 0 deletions Doc/whatsnew/3.7.rst
Expand Up @@ -95,6 +95,13 @@ New Modules
Improved Modules
================

binascii
--------

The :func:`~binascii.b2a_uu` function now accepts an optional *backtick*
keyword argument. When it's true, zeros are represented by ``'`'``
instead of spaces. (Contributed by Xiang Zhang in :issue:`30103`.)

contextlib
----------

Expand Down Expand Up @@ -159,6 +166,13 @@ urllib.parse
adding `~` to the set of characters that is never quoted by default.
(Contributed by Christian Theune and Ratnadeep Debnath in :issue:`16285`.)

uu
--

Function :func:`~uu.encode` now accepts an optional *backtick*
keyword argument. When it's true, zeros are represented by ``'`'``
instead of spaces. (Contributed by Xiang Zhang in :issue:`30103`.)


Optimizations
=============
Expand Down
36 changes: 24 additions & 12 deletions Lib/test/test_binascii.py
Expand Up @@ -112,29 +112,41 @@ def addnoise(line):

def test_uu(self):
MAX_UU = 45
lines = []
for i in range(0, len(self.data), MAX_UU):
b = self.type2test(self.rawdata[i:i+MAX_UU])
a = binascii.b2a_uu(b)
lines.append(a)
res = bytes()
for line in lines:
a = self.type2test(line)
b = binascii.a2b_uu(a)
res += b
self.assertEqual(res, self.rawdata)
for backtick in (True, False):
lines = []
for i in range(0, len(self.data), MAX_UU):
b = self.type2test(self.rawdata[i:i+MAX_UU])
a = binascii.b2a_uu(b, backtick=backtick)
lines.append(a)
res = bytes()
for line in lines:
a = self.type2test(line)
b = binascii.a2b_uu(a)
res += b
self.assertEqual(res, self.rawdata)

self.assertEqual(binascii.a2b_uu(b"\x7f"), b"\x00"*31)
self.assertEqual(binascii.a2b_uu(b"\x80"), b"\x00"*32)
self.assertEqual(binascii.a2b_uu(b"\xff"), b"\x00"*31)
self.assertRaises(binascii.Error, binascii.a2b_uu, b"\xff\x00")
self.assertRaises(binascii.Error, binascii.a2b_uu, b"!!!!")

self.assertRaises(binascii.Error, binascii.b2a_uu, 46*b"!")

# Issue #7701 (crash on a pydebug build)
self.assertEqual(binascii.b2a_uu(b'x'), b'!>   \n')

self.assertEqual(binascii.b2a_uu(b''), b' \n')
self.assertEqual(binascii.b2a_uu(b'', backtick=True), b'`\n')
self.assertEqual(binascii.a2b_uu(b' \n'), b'')
self.assertEqual(binascii.a2b_uu(b'`\n'), b'')
self.assertEqual(binascii.b2a_uu(b'\x00Cat'), b'$ $-A=   \n')
self.assertEqual(binascii.b2a_uu(b'\x00Cat', backtick=True),
b'$`$-A=```\n')
self.assertEqual(binascii.a2b_uu(b'$`$-A=```\n'),
binascii.a2b_uu(b'$ $-A=   \n'))
with self.assertRaises(TypeError):
binascii.b2a_uu(b"", True)
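
The assertions above rely on two properties that a short standalone sketch may make easier to see: `a2b_uu()` decodes both spellings of zero to the same bytes, and *backtick* is keyword-only. The variable names below are illustrative and not part of the test suite.

```python
import binascii

spaced = binascii.b2a_uu(b"\x00Cat")
backticked = binascii.b2a_uu(b"\x00Cat", backtick=True)

# a2b_uu() takes no flag: space and '`' both decode to a zero 6-bit group.
decoded_spaced = binascii.a2b_uu(spaced)
decoded_backticked = binascii.a2b_uu(backticked)

# backtick is keyword-only, so passing it positionally raises TypeError.
try:
    binascii.b2a_uu(b"", True)
    positional_rejected = False
except TypeError:
    positional_rejected = True
```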

def test_crc_hqx(self):
crc = binascii.crc_hqx(self.type2test(b"Test the CRC-32 of"), 0)
crc = binascii.crc_hqx(self.type2test(b" this string."), crc)
Expand Down
83 changes: 49 additions & 34 deletions Lib/test/test_uu.py
Expand Up @@ -10,11 +10,11 @@
import uu
import io

plaintext = b"The smooth-scaled python crept over the sleeping dog\n"
plaintext = b"The symbols on top of your keyboard are !@#$%^&*()_+|~\n"

encodedtext = b"""\
M5&AE('-M;V]T:\"US8V%L960@<'ET:&]N(&-R97!T(&]V97(@=&AE('-L965P
(:6YG(&1O9PH """
M5&AE('-Y;6)O;',@;VX@=&]P(&]F('EO=7(@:V5Y8F]A<F0@87)E("% (R0E
*7B8J*"E?*WQ^"@  """

# Stolen from io.py
class FakeIO(io.TextIOWrapper):
Expand Down Expand Up @@ -44,9 +44,14 @@ def getvalue(self):
return self.buffer.getvalue().decode(self._encoding, self._errors)


def encodedtextwrapped(mode, filename):
return (bytes("begin %03o %s\n" % (mode, filename), "ascii") +
encodedtext + b"\n \nend\n")
def encodedtextwrapped(mode, filename, backtick=False):
if backtick:
res = (bytes("begin %03o %s\n" % (mode, filename), "ascii") +
encodedtext.replace(b' ', b'`') + b"\n`\nend\n")
[Review comment, Member] It seems the only space in encodedtext is the padding space. It would be worth changing the examples so that they include inner spaces.

[Reply, Member Author] Good catch. I noticed it too when writing the test but didn't change it.


else:
res = (bytes("begin %03o %s\n" % (mode, filename), "ascii") +
encodedtext + b"\n \nend\n")
return res

class UUTest(unittest.TestCase):

Expand All @@ -59,20 +64,27 @@ def test_encode(self):
out = io.BytesIO()
uu.encode(inp, out, "t1", 0o644)
self.assertEqual(out.getvalue(), encodedtextwrapped(0o644, "t1"))
inp = io.BytesIO(plaintext)
out = io.BytesIO()
uu.encode(inp, out, "t1", backtick=True)
self.assertEqual(out.getvalue(), encodedtextwrapped(0o666, "t1", True))
with self.assertRaises(TypeError):
uu.encode(inp, out, "t1", 0o644, True)

def test_decode(self):
inp = io.BytesIO(encodedtextwrapped(0o666, "t1"))
out = io.BytesIO()
uu.decode(inp, out)
self.assertEqual(out.getvalue(), plaintext)
inp = io.BytesIO(
b"UUencoded files may contain many lines,\n" +
b"even some that have 'begin' in them.\n" +
encodedtextwrapped(0o666, "t1")
)
out = io.BytesIO()
uu.decode(inp, out)
self.assertEqual(out.getvalue(), plaintext)
for backtick in True, False:
inp = io.BytesIO(encodedtextwrapped(0o666, "t1", backtick=backtick))
out = io.BytesIO()
uu.decode(inp, out)
self.assertEqual(out.getvalue(), plaintext)
inp = io.BytesIO(
b"UUencoded files may contain many lines,\n" +
b"even some that have 'begin' in them.\n" +
encodedtextwrapped(0o666, "t1", backtick=backtick)
)
out = io.BytesIO()
uu.decode(inp, out)
self.assertEqual(out.getvalue(), plaintext)

def test_truncatedinput(self):
inp = io.BytesIO(b"begin 644 t1\n" + encodedtext)
Expand All @@ -94,25 +106,33 @@ def test_missingbegin(self):

def test_garbage_padding(self):
# Issue #22406
encodedtext = (
encodedtext1 = (
b"begin 644 file\n"
# length 1; bits 001100 111111 111111 111111
b"\x21\x2C\x5F\x5F\x5F\n"
b"\x20\n"
b"end\n"
)
encodedtext2 = (
b"begin 644 file\n"
# length 1; bits 001100 111111 111111 111111
b"\x21\x2C\x5F\x5F\x5F\n"
b"\x60\n"
b"end\n"
)
plaintext = b"\x33" # 00110011

with self.subTest("uu.decode()"):
inp = io.BytesIO(encodedtext)
out = io.BytesIO()
uu.decode(inp, out, quiet=True)
self.assertEqual(out.getvalue(), plaintext)
for encodedtext in encodedtext1, encodedtext2:
with self.subTest("uu.decode()"):
inp = io.BytesIO(encodedtext)
out = io.BytesIO()
uu.decode(inp, out, quiet=True)
self.assertEqual(out.getvalue(), plaintext)

with self.subTest("uu_codec"):
import codecs
decoded = codecs.decode(encodedtext, "uu_codec")
self.assertEqual(decoded, plaintext)
with self.subTest("uu_codec"):
import codecs
decoded = codecs.decode(encodedtext, "uu_codec")
self.assertEqual(decoded, plaintext)

class UUStdIOTest(unittest.TestCase):

Expand Down Expand Up @@ -250,11 +270,6 @@ def test_decodetwice(self):
finally:
self._kill(f)

def test_main():
support.run_unittest(UUTest,
UUStdIOTest,
UUFileTest,
)

if __name__=="__main__":
test_main()
unittest.main()
13 changes: 8 additions & 5 deletions Lib/uu.py
Expand Up @@ -26,8 +26,8 @@

"""Implementation of the UUencode and UUdecode functions.

encode(in_file, out_file [,name, mode])
decode(in_file [, out_file, mode])
encode(in_file, out_file [,name, mode], *, backtick=False)
decode(in_file [, out_file, mode, quiet])
"""

import binascii
Expand All @@ -39,7 +39,7 @@
class Error(Exception):
pass

def encode(in_file, out_file, name=None, mode=None):
def encode(in_file, out_file, name=None, mode=None, *, backtick=False):
"""Uuencode file"""
#
# If in_file is a pathname open it and change defaults
Expand Down Expand Up @@ -79,9 +79,12 @@ def encode(in_file, out_file, name=None, mode=None):
out_file.write(('begin %o %s\n' % ((mode & 0o777), name)).encode("ascii"))
data = in_file.read(45)
while len(data) > 0:
out_file.write(binascii.b2a_uu(data))
out_file.write(binascii.b2a_uu(data, backtick=backtick))
data = in_file.read(45)
out_file.write(b' \nend\n')
if backtick:
out_file.write(b'`\nend\n')
else:
out_file.write(b' \nend\n')
finally:
for f in opened_files:
f.close()
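
The framing logic in this hunk can be sketched on its own using only `binascii`. The helper `uuencode_stream` below is hypothetical, not part of the `uu` module: it assumes both arguments are already-open binary file objects and skips the pathname handling that `uu.encode()` performs.

```python
import binascii
import io


def uuencode_stream(in_file, out_file, name="-", mode=0o666, *, backtick=False):
    """Hypothetical stand-in for uu.encode() on open binary file objects."""
    out_file.write(("begin %o %s\n" % (mode & 0o777, name)).encode("ascii"))
    data = in_file.read(45)
    while len(data) > 0:
        # Each chunk of at most 45 bytes becomes one encoded line.
        out_file.write(binascii.b2a_uu(data, backtick=backtick))
        data = in_file.read(45)
    # The zero-length terminator line uses the same spelling of zero.
    out_file.write(b"`\nend\n" if backtick else b" \nend\n")


out = io.BytesIO()
uuencode_stream(io.BytesIO(b"\x00Cat"), out, "t1", 0o644, backtick=True)
print(out.getvalue())  # b'begin 644 t1\n$`$-A=```\n`\nend\n'
```

With `backtick=True` the body contains no space characters at all, including in the terminator line.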
Expand Down
3 changes: 3 additions & 0 deletions Misc/NEWS
Expand Up @@ -317,6 +317,9 @@ Extension Modules
Library
-------

- bpo-30103: binascii.b2a_uu() and uu.encode() now support using ``'`'``
as zero instead of space.

- bpo-28556: Various updates to typing module: add typing.NoReturn type, use
WrapperDescriptorType, minor bug-fixes. Original PRs by
Jim Fasarakis-Hilliard and Ivan Levkivskyi.
Expand Down
16 changes: 12 additions & 4 deletions Modules/binascii.c
Expand Up @@ -335,13 +335,15 @@ binascii.b2a_uu

data: Py_buffer
/
*
backtick: bool(accept={int}) = False

Uuencode line of data.
[clinic start generated code]*/

static PyObject *
binascii_b2a_uu_impl(PyObject *module, Py_buffer *data)
/*[clinic end generated code: output=0070670e52e4aa6b input=00fdf458ce8b465b]*/
binascii_b2a_uu_impl(PyObject *module, Py_buffer *data, int backtick)
/*[clinic end generated code: output=b1b99de62d9bbeb8 input=b26bc8d32b6ed2f6]*/
{
unsigned char *ascii_data;
const unsigned char *bin_data;
Expand All @@ -367,7 +369,10 @@ binascii_b2a_uu_impl(PyObject *module, Py_buffer *data)
return NULL;

/* Store the length */
*ascii_data++ = ' ' + (bin_len & 077);
if (backtick && !bin_len)
*ascii_data++ = '`';
else
*ascii_data++ = ' ' + bin_len;

for( ; bin_len > 0 || leftbits != 0 ; bin_len--, bin_data++ ) {
/* Shift the data (or padding) into our buffer */
Expand All @@ -381,7 +386,10 @@ binascii_b2a_uu_impl(PyObject *module, Py_buffer *data)
while ( leftbits >= 6 ) {
this_ch = (leftchar >> (leftbits-6)) & 0x3f;
leftbits -= 6;
*ascii_data++ = this_ch + ' ';
if (backtick && !this_ch)
*ascii_data++ = '`';
else
*ascii_data++ = this_ch + ' ';
}
}
*ascii_data++ = '\n'; /* Append a courtesy newline */
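
For readers less familiar with the C code, the per-line mapping can be approximated in pure Python. This is an illustrative sketch, not the module's implementation: `uu_line` and `to_char` are hypothetical names, and the sketch raises `ValueError` for over-long input where the C function raises `binascii.Error`.

```python
import binascii


def uu_line(data, backtick=False):
    """Illustrative pure-Python equivalent of a single b2a_uu() call."""
    if len(data) > 45:
        raise ValueError("at most 45 bytes per line")
    zero = ord("`") if backtick else ord(" ")

    def to_char(value):
        # Values 1..63 map to ' ' + value; 0 maps to ' ' or '`'.
        return zero if value == 0 else 0x20 + value

    # The first character stores the number of data bytes on the line.
    out = bytearray([to_char(len(data))])
    # Pad with zero bytes to a multiple of 3, as the C loop does implicitly.
    padded = data + b"\x00" * (-len(data) % 3)
    for i in range(0, len(padded), 3):
        word = int.from_bytes(padded[i:i + 3], "big")
        # Split each 24-bit word into four 6-bit groups, high bits first.
        for shift in (18, 12, 6, 0):
            out.append(to_char((word >> shift) & 0x3F))
    out.append(ord("\n"))
    return bytes(out)


print(uu_line(b"\x00Cat", backtick=True))  # b'$`$-A=```\n'
```

Comparing `uu_line()` against `binascii.b2a_uu()` on a few inputs is a quick way to check the sketch.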
Expand Down
18 changes: 11 additions & 7 deletions Modules/clinic/binascii.c.h
Expand Up @@ -34,27 +34,31 @@ binascii_a2b_uu(PyObject *module, PyObject *arg)
}

PyDoc_STRVAR(binascii_b2a_uu__doc__,
"b2a_uu($module, data, /)\n"
"b2a_uu($module, data, /, *, backtick=False)\n"
"--\n"
"\n"
"Uuencode line of data.");

#define BINASCII_B2A_UU_METHODDEF \
{"b2a_uu", (PyCFunction)binascii_b2a_uu, METH_O, binascii_b2a_uu__doc__},
{"b2a_uu", (PyCFunction)binascii_b2a_uu, METH_FASTCALL, binascii_b2a_uu__doc__},

static PyObject *
binascii_b2a_uu_impl(PyObject *module, Py_buffer *data);
binascii_b2a_uu_impl(PyObject *module, Py_buffer *data, int backtick);

static PyObject *
binascii_b2a_uu(PyObject *module, PyObject *arg)
binascii_b2a_uu(PyObject *module, PyObject **args, Py_ssize_t nargs, PyObject *kwnames)
{
PyObject *return_value = NULL;
static const char * const _keywords[] = {"", "backtick", NULL};
static _PyArg_Parser _parser = {"y*|$i:b2a_uu", _keywords, 0};
Py_buffer data = {NULL, NULL};
int backtick = 0;

if (!PyArg_Parse(arg, "y*:b2a_uu", &data)) {
if (!_PyArg_ParseStackAndKeywords(args, nargs, kwnames, &_parser,
&data, &backtick)) {
goto exit;
}
return_value = binascii_b2a_uu_impl(module, &data);
return_value = binascii_b2a_uu_impl(module, &data, backtick);

exit:
/* Cleanup for data */
Expand Down Expand Up @@ -558,4 +562,4 @@ binascii_b2a_qp(PyObject *module, PyObject **args, Py_ssize_t nargs, PyObject *k

return return_value;
}
/*[clinic end generated code: output=35821bce7e0e4714 input=a9049054013a1b77]*/
/*[clinic end generated code: output=9db57e86dbe7b2fa input=a9049054013a1b77]*/