Skip to content

Commit bc6f74a

Browse files
matthewbelisle-wfvstinner
authored andcommitted
bpo-34866: Add max_num_fields to cgi.FieldStorage (GH-9660) (GH-9969)
Adding `max_num_fields` to `cgi.FieldStorage` to make DOS attacks harder by limiting the number of `MiniFieldStorage` objects created by `FieldStorage`. (cherry picked from commit 2091448)
1 parent 64ffee7 commit bc6f74a

File tree

6 files changed

+127
-19
lines changed

6 files changed

+127
-19
lines changed

Doc/library/cgi.rst

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -292,12 +292,12 @@ algorithms implemented in this module in other circumstances.
292292
passed to :func:`urlparse.parse_qs` unchanged.
293293

294294

295-
.. function:: parse_qs(qs[, keep_blank_values[, strict_parsing]])
295+
.. function:: parse_qs(qs[, keep_blank_values[, strict_parsing[, max_num_fields]]])
296296

297297
This function is deprecated in this module. Use :func:`urlparse.parse_qs`
298298
instead. It is maintained here only for backward compatibility.
299299

300-
.. function:: parse_qsl(qs[, keep_blank_values[, strict_parsing]])
300+
.. function:: parse_qsl(qs[, keep_blank_values[, strict_parsing[, max_num_fields]]])
301301

302302
This function is deprecated in this module. Use :func:`urlparse.parse_qsl`
303303
instead. It is maintained here only for backward compatibility.

Doc/library/urlparse.rst

Lines changed: 14 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -126,7 +126,7 @@ The :mod:`urlparse` module defines the following functions:
126126
Added IPv6 URL parsing capabilities.
127127

128128

129-
.. function:: parse_qs(qs[, keep_blank_values[, strict_parsing]])
129+
.. function:: parse_qs(qs[, keep_blank_values[, strict_parsing[, max_num_fields]]])
130130

131131
Parse a query string given as a string argument (data of type
132132
:mimetype:`application/x-www-form-urlencoded`). Data are returned as a
@@ -143,14 +143,20 @@ The :mod:`urlparse` module defines the following functions:
143143
parsing errors. If false (the default), errors are silently ignored. If true,
144144
errors raise a :exc:`ValueError` exception.
145145

146+
The optional argument *max_num_fields* is the maximum number of fields to
147+
read. If set, then throws a :exc:`ValueError` if there are more than
148+
*max_num_fields* fields read.
149+
146150
Use the :func:`urllib.urlencode` function to convert such dictionaries into
147151
query strings.
148152

149153
.. versionadded:: 2.6
150154
Copied from the :mod:`cgi` module.
151155

156+
.. versionchanged:: 2.7.16
157+
Added *max_num_fields* parameter.
152158

153-
.. function:: parse_qsl(qs[, keep_blank_values[, strict_parsing]])
159+
.. function:: parse_qsl(qs[, keep_blank_values[, strict_parsing[, max_num_fields]]])
154160

155161
Parse a query string given as a string argument (data of type
156162
:mimetype:`application/x-www-form-urlencoded`). Data are returned as a list of
@@ -166,12 +172,18 @@ The :mod:`urlparse` module defines the following functions:
166172
parsing errors. If false (the default), errors are silently ignored. If true,
167173
errors raise a :exc:`ValueError` exception.
168174

175+
The optional argument *max_num_fields* is the maximum number of fields to
176+
read. If set, then throws a :exc:`ValueError` if there are more than
177+
*max_num_fields* fields read.
178+
169179
Use the :func:`urllib.urlencode` function to convert such lists of pairs into
170180
query strings.
171181

172182
.. versionadded:: 2.6
173183
Copied from the :mod:`cgi` module.
174184

185+
.. versionchanged:: 2.7.16
186+
Added *max_num_fields* parameter.
175187

176188
.. function:: urlunparse(parts)
177189

Lib/cgi.py

Lines changed: 36 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -184,11 +184,12 @@ def parse_qs(qs, keep_blank_values=0, strict_parsing=0):
184184
return urlparse.parse_qs(qs, keep_blank_values, strict_parsing)
185185

186186

187-
def parse_qsl(qs, keep_blank_values=0, strict_parsing=0):
187+
def parse_qsl(qs, keep_blank_values=0, strict_parsing=0, max_num_fields=None):
188188
"""Parse a query given as a string argument."""
189189
warn("cgi.parse_qsl is deprecated, use urlparse.parse_qsl instead",
190190
PendingDeprecationWarning, 2)
191-
return urlparse.parse_qsl(qs, keep_blank_values, strict_parsing)
191+
return urlparse.parse_qsl(qs, keep_blank_values, strict_parsing,
192+
max_num_fields)
192193

193194
def parse_multipart(fp, pdict):
194195
"""Parse multipart input.
@@ -393,7 +394,8 @@ class FieldStorage:
393394
"""
394395

395396
def __init__(self, fp=None, headers=None, outerboundary="",
396-
environ=os.environ, keep_blank_values=0, strict_parsing=0):
397+
environ=os.environ, keep_blank_values=0, strict_parsing=0,
398+
max_num_fields=None):
397399
"""Constructor. Read multipart/* until last part.
398400
399401
Arguments, all optional:
@@ -420,10 +422,14 @@ def __init__(self, fp=None, headers=None, outerboundary="",
420422
If false (the default), errors are silently ignored.
421423
If true, errors raise a ValueError exception.
422424
425+
max_num_fields: int. If set, then __init__ throws a ValueError
426+
if there are more than n fields read by parse_qsl().
427+
423428
"""
424429
method = 'GET'
425430
self.keep_blank_values = keep_blank_values
426431
self.strict_parsing = strict_parsing
432+
self.max_num_fields = max_num_fields
427433
if 'REQUEST_METHOD' in environ:
428434
method = environ['REQUEST_METHOD'].upper()
429435
self.qs_on_post = None
@@ -606,10 +612,9 @@ def read_urlencoded(self):
606612
qs = self.fp.read(self.length)
607613
if self.qs_on_post:
608614
qs += '&' + self.qs_on_post
609-
self.list = list = []
610-
for key, value in urlparse.parse_qsl(qs, self.keep_blank_values,
611-
self.strict_parsing):
612-
list.append(MiniFieldStorage(key, value))
615+
query = urlparse.parse_qsl(qs, self.keep_blank_values,
616+
self.strict_parsing, self.max_num_fields)
617+
self.list = [MiniFieldStorage(key, value) for key, value in query]
613618
self.skip_lines()
614619

615620
FieldStorageClass = None
@@ -621,19 +626,38 @@ def read_multi(self, environ, keep_blank_values, strict_parsing):
621626
raise ValueError, 'Invalid boundary in multipart form: %r' % (ib,)
622627
self.list = []
623628
if self.qs_on_post:
624-
for key, value in urlparse.parse_qsl(self.qs_on_post,
625-
self.keep_blank_values, self.strict_parsing):
626-
self.list.append(MiniFieldStorage(key, value))
629+
query = urlparse.parse_qsl(self.qs_on_post,
630+
self.keep_blank_values,
631+
self.strict_parsing,
632+
self.max_num_fields)
633+
self.list.extend(MiniFieldStorage(key, value)
634+
for key, value in query)
627635
FieldStorageClass = None
628636

637+
# Propagate max_num_fields into the sub class appropriately
638+
max_num_fields = self.max_num_fields
639+
if max_num_fields is not None:
640+
max_num_fields -= len(self.list)
641+
629642
klass = self.FieldStorageClass or self.__class__
630643
part = klass(self.fp, {}, ib,
631-
environ, keep_blank_values, strict_parsing)
644+
environ, keep_blank_values, strict_parsing,
645+
max_num_fields)
646+
632647
# Throw first part away
633648
while not part.done:
634649
headers = rfc822.Message(self.fp)
635650
part = klass(self.fp, headers, ib,
636-
environ, keep_blank_values, strict_parsing)
651+
environ, keep_blank_values, strict_parsing,
652+
max_num_fields)
653+
654+
if max_num_fields is not None:
655+
max_num_fields -= 1
656+
if part.list:
657+
max_num_fields -= len(part.list)
658+
if max_num_fields < 0:
659+
raise ValueError('Max number of fields exceeded')
660+
637661
self.list.append(part)
638662
self.skip_lines()
639663

Lib/test/test_cgi.py

Lines changed: 55 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,4 @@
1+
from io import BytesIO
12
from test.test_support import run_unittest, check_warnings
23
import cgi
34
import os
@@ -316,6 +317,60 @@ def testQSAndUrlEncode(self):
316317
v = gen_result(data, environ)
317318
self.assertEqual(self._qs_result, v)
318319

320+
def test_max_num_fields(self):
321+
# For application/x-www-form-urlencoded
322+
data = '&'.join(['a=a']*11)
323+
environ = {
324+
'CONTENT_LENGTH': str(len(data)),
325+
'CONTENT_TYPE': 'application/x-www-form-urlencoded',
326+
'REQUEST_METHOD': 'POST',
327+
}
328+
329+
with self.assertRaises(ValueError):
330+
cgi.FieldStorage(
331+
fp=BytesIO(data.encode()),
332+
environ=environ,
333+
max_num_fields=10,
334+
)
335+
336+
# For multipart/form-data
337+
data = """---123
338+
Content-Disposition: form-data; name="a"
339+
340+
3
341+
---123
342+
Content-Type: application/x-www-form-urlencoded
343+
344+
a=4
345+
---123
346+
Content-Type: application/x-www-form-urlencoded
347+
348+
a=5
349+
---123--
350+
"""
351+
environ = {
352+
'CONTENT_LENGTH': str(len(data)),
353+
'CONTENT_TYPE': 'multipart/form-data; boundary=-123',
354+
'QUERY_STRING': 'a=1&a=2',
355+
'REQUEST_METHOD': 'POST',
356+
}
357+
358+
# 2 GET entities
359+
# 1 top level POST entities
360+
# 1 entity within the second POST entity
361+
# 1 entity within the third POST entity
362+
with self.assertRaises(ValueError):
363+
cgi.FieldStorage(
364+
fp=BytesIO(data.encode()),
365+
environ=environ,
366+
max_num_fields=4,
367+
)
368+
cgi.FieldStorage(
369+
fp=BytesIO(data.encode()),
370+
environ=environ,
371+
max_num_fields=5,
372+
)
373+
319374
def testQSAndFormData(self):
320375
data = """
321376
---123

Lib/urlparse.py

Lines changed: 18 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -361,7 +361,7 @@ def unquote(s):
361361
append(item)
362362
return ''.join(res)
363363

364-
def parse_qs(qs, keep_blank_values=0, strict_parsing=0):
364+
def parse_qs(qs, keep_blank_values=0, strict_parsing=0, max_num_fields=None):
365365
"""Parse a query given as a string argument.
366366
367367
Arguments:
@@ -378,16 +378,20 @@ def parse_qs(qs, keep_blank_values=0, strict_parsing=0):
378378
strict_parsing: flag indicating what to do with parsing errors.
379379
If false (the default), errors are silently ignored.
380380
If true, errors raise a ValueError exception.
381+
382+
max_num_fields: int. If set, then throws a ValueError if there
383+
are more than n fields read by parse_qsl().
381384
"""
382385
dict = {}
383-
for name, value in parse_qsl(qs, keep_blank_values, strict_parsing):
386+
for name, value in parse_qsl(qs, keep_blank_values, strict_parsing,
387+
max_num_fields):
384388
if name in dict:
385389
dict[name].append(value)
386390
else:
387391
dict[name] = [value]
388392
return dict
389393

390-
def parse_qsl(qs, keep_blank_values=0, strict_parsing=0):
394+
def parse_qsl(qs, keep_blank_values=0, strict_parsing=0, max_num_fields=None):
391395
"""Parse a query given as a string argument.
392396
393397
Arguments:
@@ -404,8 +408,19 @@ def parse_qsl(qs, keep_blank_values=0, strict_parsing=0):
404408
false (the default), errors are silently ignored. If true,
405409
errors raise a ValueError exception.
406410
411+
max_num_fields: int. If set, then throws a ValueError if there
412+
are more than n fields read by parse_qsl().
413+
407414
Returns a list, as G-d intended.
408415
"""
416+
# If max_num_fields is defined then check that the number of fields
417+
# is less than max_num_fields. This prevents a memory exhaustion DOS
418+
# attack via post bodies with many fields.
419+
if max_num_fields is not None:
420+
num_fields = 1 + qs.count('&') + qs.count(';')
421+
if max_num_fields < num_fields:
422+
raise ValueError('Max number of fields exceeded')
423+
409424
pairs = [s2 for s1 in qs.split('&') for s2 in s1.split(';')]
410425
r = []
411426
for name_value in pairs:
Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,2 @@
1+
Adding ``max_num_fields`` to ``cgi.FieldStorage`` to make DOS attacks harder by
2+
limiting the number of ``MiniFieldStorage`` objects created by ``FieldStorage``.

0 commit comments

Comments
 (0)